Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Go Back   Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board > Thoroughbred Horse Racing Discussion > Handicapper's Corner


Reply
 
Thread Tools Rating: Thread Rating: 17 votes, 5.00 average.
Old 10-19-2014, 06:49 PM   #1
classhandicapper
Registered User
 
classhandicapper's Avatar
 
Join Date: Mar 2005
Location: Queens, NY
Posts: 20,523
Database and Excel Spreadsheets

I have a few questions for the people that have built their own database or are using Excel spreadsheets to keep data and use them as a way of learning and doing studies.

1. What is your source of data?

2. Which database do you use?

3. If you use Excel, how many worksheets does that involve.

I would like to spend some time doing research, but it seems like such a ridiculously large project, I may be better off paying someone else to do the research for me if I can find someone wiling to do it.

I have access to free data downloads from Formulator. But each race card gives you about 7 data files with different pieces of race information. To manually download one card and then import each of the files into a spreadsheet would probably take 15 minutes a night. Then I still have to download results information, figure out how to link it all together, and then figure out how to do the inquiries that would allow me to study various factors. That's just one track!

I am not familiar with Access, but I have enough DP experience that I'd probably pick it up quickly. But even in that case, even if it makes the inquiry functions etc... a little easier to code and do, it sounds like a lot of time each night just to keep the database.

Is there an easier approach?
__________________
"Unlearning is the highest form of learning"
classhandicapper is offline   Reply With Quote Reply
Old 10-19-2014, 07:01 PM   #2
cj
@TimeformUSfigs
 
cj's Avatar
 
Join Date: Jan 2002
Location: Moore, OK
Posts: 46,816
Quote:
Originally Posted by classhandicapper
I have a few questions for the people that have built their own database or are using Excel spreadsheets to keep data and use them as a way of learning and doing studies.

1. What is your source of data?

2. Which database do you use?

3. If you use Excel, how many worksheets does that involve.

I would like to spend some time doing research, but it seems like such a ridiculously large project, I may be better off paying someone else to do the research for me if I can find someone wiling to do it.

I have access to free data downloads from Formulator. But each race card gives you about 7 data files with different pieces of race information. To manually download one card and then import each of the files into a spreadsheet would probably take 15 minutes a night. Then I still have to download results information, figure out how to link it all together, and then figure out how to do the inquiries that would allow me to study various factors. That's just one track!

I am not familiar with Access, but I have enough DP experience that I'd probably pick it up quickly. But even in that case, even if it makes the inquiry functions etc... a little easier to code and do, it sounds like a lot of time each night just to keep the database.

Is there an easier approach?
First, there is an option for a single file rather than the seven files.

Second, use Access (at the least) instead of Excel. Query power is what you will need. There are free database programs like MySQL, but Access is very simple to use.

Third, you can write a simple script to merge all the exports into one, and then import the file into Access in a few seconds.
cj is offline   Reply With Quote Reply
Old 10-19-2014, 07:06 PM   #3
cj
@TimeformUSfigs
 
cj's Avatar
 
Join Date: Jan 2002
Location: Moore, OK
Posts: 46,816
Quote:
Originally Posted by cj
First, there is an option for a single file rather than the seven files.

Second, use Access (at the least) instead of Excel. Query power is what you will need. There are free database programs like MySQL, but Access is very simple to use.

Third, you can write a simple script to merge all the exports into one, and then import the file into Access in a few seconds.
You don't even need a script to merge the exports, just open a command window and do it in a few seconds. I don't remember the extension of the exports, but something like this:

copy *.drf allexports.txt
cj is offline   Reply With Quote Reply
Old 10-19-2014, 07:34 PM   #4
Tom
The Voice of Reason!
 
Tom's Avatar
 
Join Date: Mar 2001
Location: Canandaigua, New york
Posts: 112,446
I use the HTR exports and the HDW charts for source data, and BRIS files - I have a program that extracts a set of data. I also hand add Beyer numbers to an Excel file for each track I follow. I will export various queries to Excel to work with.

Access is my database , but I export a lot to Excel to do work in. Access is very easy to use. I picked it up pretty quickly with just the tutorials from HTR on how to do it. I use it as is - no vbasic or scripting or anything advanced.

One program I find very helpful is Simple file Joiner - freeware. I use it to join text files - maybe a 100 at a time - all the daily chart files - to make one file to use for importing.
__________________
Who does the Racing Form Detective like in this one?
Tom is offline   Reply With Quote Reply
Old 10-19-2014, 08:57 PM   #5
MJC922
Registered User
 
Join Date: Nov 2012
Posts: 1,506
Tough questions. Trackmaster is my source for charts, not to be taken as an endorsement as I'm sure what you're already using would be fine too. CJs tip is great to join files, jot it down for yourself, I remember that trick was the only way I was able to push a massive amount into Excel power pivot without it choking.

There are VBA routines you can put together to automate importing a whole folders worth of data into an Access db. If you're willing to work through frustrations (and being a horseplayer I'm sure you are) then much of this stuff is well within capabilities. I would say for anyone even kicking around the idea most definitely make a new years resolution to learn more about Excel, Access, VBA, macros etc. It's rewarding to gain the skills even if it's not just racing you may end up researching other sports or whatever. There are good books out there, tons of tutorials online, I'm sure people on this board will help sometimes too.

For making product I use SQL Server, Access and Excel. For research, I will pull data out of the production SQL DB which at any point is live and current each day, put it into an Access db, play around with queries in there and then last of all bring subsets of that into Excel. So the final stop for research is Excel for me.

How 'many' workbooks varies, probably half-dozen or more along the way. Just to setup for research I'm constantly moving around data because workbooks get huge and bloated from formulas and I want to start fresh sometimes with just 'values'. Formulas in excel create overhead, especially lookups. Sometimes you use 20 columns of formula to get the data how you want it to be and then finally paste the 'values only' to a new workbook to release memory. If you just want to use one app then Excel can practically be used as a db these days with its million row limit since 2007.

It depends on your requirements too, if by research you mean you personally need a live db for researching tomorrows card then it's going to be a challenge to keep it current, again in that case you have the nightly importing piece. If you can use old data like from 2012-2013 and research with that to gain some useful insights then it's more practical.
MJC922 is offline   Reply With Quote Reply
Old 10-20-2014, 06:28 AM   #6
MJC922
Registered User
 
Join Date: Nov 2012
Posts: 1,506
To add to what I've already stated, you were asking if there's an easier approach to this, and I don't know what's commercially available these days but my guess is no. It's always been a hard road (at least for me) but the desire for answers pushed me on. Having someone else do the research is unthinkable to me even though on the surface it seems possible. I think you'd find out that you'd give some person with a db a list of queries and get back results that make no sense at all to you and that requires further investigation. You really need your own db unless you're looking for really simple things. I would have to know more about what you're looking to find out to make a better call but I can understand if you'd rather not discuss it in public.
MJC922 is offline   Reply With Quote Reply
Old 10-20-2014, 08:56 AM   #7
classhandicapper
Registered User
 
classhandicapper's Avatar
 
Join Date: Mar 2005
Location: Queens, NY
Posts: 20,523
Quote:
Originally Posted by cj
First, there is an option for a single file rather than the seven files.
Thanks.

I forgot about that. I used to use that option in the old version of Formulator for your product, but I didn't even realize the new version had it.

The single file looked like a mess when I looked at it because each of the record types has a different layout. I'm having a tough time imagining how importing that mess into a database can be easy.

I will probably have a few questions for everyone later in the process.

One question I would already have is on PP data. Let's assume I load all the PP data for some horses that ran yesterday. Then they run a few weeks later. A lot of the information on their old races will be duplicate. I assume if you import it a second time it just dismisses it.
__________________
"Unlearning is the highest form of learning"

Last edited by classhandicapper; 10-20-2014 at 09:11 AM.
classhandicapper is offline   Reply With Quote Reply
Old 10-20-2014, 09:19 AM   #8
Capper Al
Registered User
 
Capper Al's Avatar
 
Join Date: Dec 2005
Location: MI
Posts: 6,330
I use both ACCESS and EXCEL. ACCESS is the must have. I like EXCEL for reporting. Much of what can be done EXCEL reports can be done just as well in ACCESS reports.
__________________


"The Law, in its majestic equality, forbids the rich, as well as the poor, to sleep under bridges, to beg in the streets, and to steal bread."

Anatole France


Capper Al is offline   Reply With Quote Reply
Old 10-20-2014, 09:44 AM   #9
sjk
Registered User
 
Join Date: Feb 2003
Posts: 2,105
I use Access and data from HDW via HSH. Years ago when I had to give up chart parsing due to the format change Dave S wrote some custom code into the then current version of HSH that helped me populate the fields of my existing database.

I have never found any limitations in Access that I could not work around. It is extremely powerful.
sjk is offline   Reply With Quote Reply
Old 10-20-2014, 06:20 PM   #10
MJC922
Registered User
 
Join Date: Nov 2012
Posts: 1,506
Quote:
Originally Posted by classhandicapper
A lot of the information on their old races will be duplicate. I assume if you import it a second time it just dismisses it.
Removing duplicates will be the least of your problems. Excel even has a dedicated button for it in the ribbon. For example save an import spec for your file, record a macro, walk through running the import spec and then remove duplicates, click stop recording on the macro. The next day repeating the process is nothing more than running the macro. You will have fun -- sometimes
MJC922 is offline   Reply With Quote Reply
Old 10-20-2014, 07:12 PM   #11
classhandicapper
Registered User
 
classhandicapper's Avatar
 
Join Date: Mar 2005
Location: Queens, NY
Posts: 20,523
Quote:
Originally Posted by MJC922
You will have fun -- sometimes
Somehow I doubt it.

The alternative is a manual study. I've done that 2 ways.

1. I would create a folder for a specific race (ex. Graded Stakes, Routes, Grass, Older horses). I would include the PPs, result charts, and a handful of notes for every race in that category for a year or two. Then one day when I was really bored I'd review each race with pen and paper to see if I could decipher them.

2. Manually enter the specific data I want to study in a spreadsheet and when the sample was large enough, see what I could figure out. I just did that kind of thing for the recently concluded mule racing meets on the California Fair Circuit. That study was so enlightening I entered some of the data for the previous 3 years also.
__________________
"Unlearning is the highest form of learning"
classhandicapper is offline   Reply With Quote Reply
Old 10-20-2014, 07:20 PM   #12
MJC922
Registered User
 
Join Date: Nov 2012
Posts: 1,506
Quote:
Originally Posted by MJC922
Removing duplicates will be the least of your problems. Excel even has a dedicated button for it in the ribbon. For example save an import spec for your file, record a macro, walk through running the import spec and then remove duplicates, click stop recording on the macro. The next day repeating the process is nothing more than running the macro. You will have fun -- sometimes
My mistake an import spec is for Access but no worries, Excel's import process can be recorded in a macro along with the duplicate purge. In Access the duplicates should never make it through the import provided you have setup a unique index on the proper field or fields.

Last edited by MJC922; 10-20-2014 at 07:24 PM.
MJC922 is offline   Reply With Quote Reply
Old 10-25-2014, 04:42 AM   #13
raybo
EXCEL with SUPERFECTAS
 
raybo's Avatar
 
Join Date: Mar 2004
Posts: 10,206
Quote:
Originally Posted by classhandicapper
I have a few questions for the people that have built their own database or are using Excel spreadsheets to keep data and use them as a way of learning and doing studies.

1. What is your source of data?

2. Which database do you use?

3. If you use Excel, how many worksheets does that involve.
1. JCapper/HDW data files
2. Excel
3. 1 workbook per track, 1 record keeping worksheet (with rankings method summary) per rankings method (11 total), with an additional record keeping sheet for variable rankings methods (customizable "what ifs"/experimentation) not already hard-wired into the auto-recording routine, 1 database worksheet, 1 summary worksheet containing all rankings methods' summaries from their own individual record keeping sheets.
__________________
Ray
Horseracing's like the stock market except you don't have to wait as long to go broke.

Excel Spreadsheet Handicapping Forum

Charter Member: Horseplayers Association of North America

Last edited by raybo; 10-25-2014 at 04:44 AM.
raybo is offline   Reply With Quote Reply
Old 10-25-2014, 07:31 AM   #14
DeltaLover
Registered user
 
DeltaLover's Avatar
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
Raw Data

The main data source I am using is BRIS while I also have some data that I collect in real time by scrapping odds feeds; these data are stored in a dropbox following a well know directory structure and specific naming conventions that simplify data discovery and retrieval

VM

Instead of relying on physical boxes, I use Virtual Machines (vagrant), that are easy to build from scratch, in a matter of a few minutes

Data Bases

My primary data bases are mysql, sqlite and mongodb although I also use flat files, which in some cases are best for performance reasons. I use Open Office for research and presentation purposes, although I try to not write lower level code for it

Coding

In most of the cases I use python for coding. Depending on the needs there are some cases where I also use languages like C++, lisp and octave. All the source files are kept in version control (github)
__________________
whereof one cannot speak thereof one must be silent
Ludwig Wittgenstein

Last edited by DeltaLover; 10-25-2014 at 07:33 AM.
DeltaLover is offline   Reply With Quote Reply
Old 10-25-2014, 08:32 AM   #15
DJofSD
Screw PC
 
Join Date: Jun 2003
Posts: 15,728
DeltaLover, you mentioned Mongo. I just learned about it but not used it. When I look at the capabilities of the database, I don't immediately see where I could use it. What are you doing with it, what do you stuff into it that you can not use a traditional RDBMS to do?
__________________
Truth sounds like hate to those who hate truth.
DJofSD is offline   Reply With Quote Reply
Reply




Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump

» Advertisement
Powered by vBadvanced CMPS v3.2.3

All times are GMT -4. The time now is 08:32 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Copyright 1999 - 2023 -- PaceAdvantage.Com -- All Rights Reserved
We are a participant in the Amazon Services LLC Associates Program, an affiliate advertising program
designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.