Assembling a Database [Archive] - Horse Racing Forum - PaceAdvantage.Com

Nosebob

04-15-2007, 07:01 PM

Does anyone have advice on where to start assembling a database for research purposes? I have some old, well-worn copies of the DRF, but not enough to establish anything meaningful. I have been involved in handicapping for over 20 years, but all my research has been of the manual variety.

I am moderately competent with Excel, but certainly not of professional quality. Also, the thought of downloading daily versions of the DRF long enough to establish a meaningfully sized database is somewhat daunting, particularly if there is a faster and better way.

To make responses a bit easier, I will list a few questions in case anyone wants to take on just one part of the answer.

1. Are there commercial databases available, and if so, size and ballpark costs?

2. Any thoughts on system requirements for establishing a database, or operating something commercially available?

3. How many factors should be included in a database?

4. What other things should I be considering?

Any comments from people who have been down this path would be greatly appreciated.

Nosebob

highnote

04-15-2007, 07:22 PM

If your ultimate goal is to do research on horse racing so that you can bet and make a lot of money then my advice would be to study the stock market instead.

However, it sounds like you like racing, so I'll try to answer some of the questions.

You can download a year's worth of data from DRF for about $1,000. Maybe it is less? That would give you a lot of data to study.

If you want to study a particular track or trainer or jockey then you might need to buy more years of data.

If you're serious, then $1,000 for data shouldn't be a big barrier to entry.

That's my two cents. Good luck and good researching.

(I've edited this post about 3 times. I took out some of my rants and raves and just tried to answer your question.)

John

SAL

04-15-2007, 07:23 PM

Try this site:

http://www.jdca-racing.com/

sjk

04-15-2007, 07:28 PM

If you search, there are many discussions of databasing. I would think that it is going to take 20-30 minutes or more per day for years, plus some cash outlay, just to assemble the data. After you have the data, expect to spend hundreds or thousands of hours trying to figure out how to use it in a remunerative way.

If you are willing to invest the time, it can definitely be worth your while (or not, depending on your competence at making it work).

ryesteve

04-15-2007, 09:30 PM

You don't want to reinvent the wheel. I would suggest you read everything on this board you can about JCapper, HSH and HTR, go to their respective websites to learn even more, and then decide which one is best suited to the direction in which you'd like your data mining to proceed.

Nosebob

04-15-2007, 10:51 PM

Thanks for the responses.

Sweetyejohn,

Thanks for the advice (and taking out some of the rants!) This endeavor will be mostly for enjoyment, but it would certainly be more enjoyable if it is a little bit profitable!
---------------------------------------------------------------------
SAL,

I really appreciate the link. That may be just what I am looking for.
--------------------------------------------------------------------
Sjk,

I know you are right about the time commitment required. Over the past 20 years I have probably spent at least several hundred hours trying one thing and another without developing a firm basis for my play. Maybe a more organized approach will help.
---------------------------------------------------------------------
ryesteve,

You are correct that I don’t want to reinvent the wheel. I have read most of the posts on this site for the past 2 years and done searches on the subject of databases. With regard to data mining, I suppose I am like most others in believing if you torture the data long enough, it will tell you whatever you want to hear! :)

Nosebob

ranchwest

04-15-2007, 10:52 PM

#3: I suggest maintaining all of the data that comes from the source. Then you can extract portions of that data in your research if you wish. That will still leave the original data in case you later decide to embark on a different path.
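
To make that concrete, here is a minimal sketch of the "keep everything, extract later" idea. It assumes Python's built-in sqlite3 and csv modules and a made-up comma-delimited layout; the column names, the sample rows, and the helper names load_raw and winners are all hypothetical, not the actual DRF file format or anyone's real setup.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; a real data file would carry many more columns.
SAMPLE = """track,race_date,race_no,horse,odds,finish
AQU,2007-04-14,1,Fast Filly,3.5,1
AQU,2007-04-14,1,Slow Poke,12.0,5
"""

def load_raw(conn, text):
    """Store every source column, untouched, in a raw table."""
    rows = list(csv.DictReader(io.StringIO(text)))
    cols = list(rows[0].keys())
    col_defs = ", ".join(f'"{c}" TEXT' for c in cols)
    conn.execute(f"CREATE TABLE IF NOT EXISTS raw_results ({col_defs})")
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f"INSERT INTO raw_results VALUES ({placeholders})",
        [tuple(r[c] for c in cols) for r in rows],
    )
    return len(rows)

def winners(conn):
    """Pull only the fields one study needs; the raw table is left intact."""
    return conn.execute(
        "SELECT horse, CAST(odds AS REAL) FROM raw_results WHERE finish = '1'"
    ).fetchall()

conn = sqlite3.connect(":memory:")  # use a file path to keep the data on disk
load_raw(conn, SAMPLE)
print(winners(conn))
```

Because the extract is just a query against the raw table, a different line of research later only needs a different query, not a fresh round of data collection.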