Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Go Back   Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board


Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board (http://www.paceadvantage.com/forum/index.php)
-   Handicapper's Corner (http://www.paceadvantage.com/forum/forumdisplay.php?f=76)
-   -   Database and Excel Spreadsheets (http://www.paceadvantage.com/forum/showthread.php?t=117713)

classhandicapper 10-19-2014 06:49 PM

Database and Excel Spreadsheets
 
I have a few questions for the people that have built their own database or are using Excel spreadsheets to keep data and use them as a way of learning and doing studies.

1. What is your source of data?

2. Which database do you use?

3. If you use Excel, how many worksheets does that involve.

I would like to spend some time doing research, but it seems like such a ridiculously large project, I may be better off paying someone else to do the research for me if I can find someone wiling to do it.

I have access to free data downloads from Formulator. But each race card gives you about 7 data files with different pieces of race information. To manually download one card and then import each of the files into a spreadsheet would probably take 15 minutes a night. Then I still have to download results information, figure out how to link it all together, and then figure out how to do the inquiries that would allow me to study various factors. That's just one track!

I am not familiar with Access, but I have enough DP experience that I'd probably pick it up quickly. But even in that case, even if it makes the inquiry functions etc... a little easier to code and do, it sounds like a lot of time each night just to keep the database.

Is there an easier approach?

cj 10-19-2014 07:01 PM

Quote:

Originally Posted by classhandicapper
I have a few questions for the people that have built their own database or are using Excel spreadsheets to keep data and use them as a way of learning and doing studies.

1. What is your source of data?

2. Which database do you use?

3. If you use Excel, how many worksheets does that involve.

I would like to spend some time doing research, but it seems like such a ridiculously large project, I may be better off paying someone else to do the research for me if I can find someone wiling to do it.

I have access to free data downloads from Formulator. But each race card gives you about 7 data files with different pieces of race information. To manually download one card and then import each of the files into a spreadsheet would probably take 15 minutes a night. Then I still have to download results information, figure out how to link it all together, and then figure out how to do the inquiries that would allow me to study various factors. That's just one track!

I am not familiar with Access, but I have enough DP experience that I'd probably pick it up quickly. But even in that case, even if it makes the inquiry functions etc... a little easier to code and do, it sounds like a lot of time each night just to keep the database.

Is there an easier approach?

First, there is an option for a single file rather than the seven files.

Second, use Access (at the least) instead of Excel. Query power is what you will need. There are free database programs like MySQL, but Access is very simple to use.

Third, you can write a simple script to merge all the exports into one, and then import the file into Access in a few seconds.

cj 10-19-2014 07:06 PM

Quote:

Originally Posted by cj
First, there is an option for a single file rather than the seven files.

Second, use Access (at the least) instead of Excel. Query power is what you will need. There are free database programs like MySQL, but Access is very simple to use.

Third, you can write a simple script to merge all the exports into one, and then import the file into Access in a few seconds.

You don't even need a script to merge the exports, just open a command window and do it in a few seconds. I don't remember the extension of the exports, but something like this:

copy *.drf allexports.txt

Tom 10-19-2014 07:34 PM

I use the HTR exports and the HDW charts for source data, and BRIS files - I have a program that extracts a set of data. I also hand add Beyer numbers to an Excel file for each track I follow. I will export various queries to Excel to work with.

Access is my database , but I export a lot to Excel to do work in. Access is very easy to use. I picked it up pretty quickly with just the tutorials from HTR on how to do it. I use it as is - no vbasic or scripting or anything advanced.

One program I find very helpful is Simple file Joiner - freeware. I use it to join text files - maybe a 100 at a time - all the daily chart files - to make one file to use for importing.

MJC922 10-19-2014 08:57 PM

Tough questions. Trackmaster is my source for charts, not to be taken as an endorsement as I'm sure what you're already using would be fine too. CJs tip is great to join files, jot it down for yourself, I remember that trick was the only way I was able to push a massive amount into Excel power pivot without it choking.

There are VBA routines you can put together to automate importing a whole folders worth of data into an Access db. If you're willing to work through frustrations (and being a horseplayer I'm sure you are) then much of this stuff is well within capabilities. I would say for anyone even kicking around the idea most definitely make a new years resolution to learn more about Excel, Access, VBA, macros etc. It's rewarding to gain the skills even if it's not just racing you may end up researching other sports or whatever. There are good books out there, tons of tutorials online, I'm sure people on this board will help sometimes too.

For making product I use SQL Server, Access and Excel. For research, I will pull data out of the production SQL DB which at any point is live and current each day, put it into an Access db, play around with queries in there and then last of all bring subsets of that into Excel. So the final stop for research is Excel for me.

How 'many' workbooks varies, probably half-dozen or more along the way. Just to setup for research I'm constantly moving around data because workbooks get huge and bloated from formulas and I want to start fresh sometimes with just 'values'. Formulas in excel create overhead, especially lookups. Sometimes you use 20 columns of formula to get the data how you want it to be and then finally paste the 'values only' to a new workbook to release memory. If you just want to use one app then Excel can practically be used as a db these days with its million row limit since 2007.

It depends on your requirements too, if by research you mean you personally need a live db for researching tomorrows card then it's going to be a challenge to keep it current, again in that case you have the nightly importing piece. If you can use old data like from 2012-2013 and research with that to gain some useful insights then it's more practical.

MJC922 10-20-2014 06:28 AM

To add to what I've already stated, you were asking if there's an easier approach to this, and I don't know what's commercially available these days but my guess is no. It's always been a hard road (at least for me) but the desire for answers pushed me on. Having someone else do the research is unthinkable to me even though on the surface it seems possible. I think you'd find out that you'd give some person with a db a list of queries and get back results that make no sense at all to you and that requires further investigation. You really need your own db unless you're looking for really simple things. I would have to know more about what you're looking to find out to make a better call but I can understand if you'd rather not discuss it in public.

classhandicapper 10-20-2014 08:56 AM

Quote:

Originally Posted by cj
First, there is an option for a single file rather than the seven files.

Thanks.

I forgot about that. I used to use that option in the old version of Formulator for your product, but I didn't even realize the new version had it.

The single file looked like a mess when I looked at it because each of the record types has a different layout. I'm having a tough time imagining how importing that mess into a database can be easy.

I will probably have a few questions for everyone later in the process.

One question I would already have is on PP data. Let's assume I load all the PP data for some horses that ran yesterday. Then they run a few weeks later. A lot of the information on their old races will be duplicate. I assume if you import it a second time it just dismisses it.

Capper Al 10-20-2014 09:19 AM

I use both ACCESS and EXCEL. ACCESS is the must have. I like EXCEL for reporting. Much of what can be done EXCEL reports can be done just as well in ACCESS reports.

sjk 10-20-2014 09:44 AM

I use Access and data from HDW via HSH. Years ago when I had to give up chart parsing due to the format change Dave S wrote some custom code into the then current version of HSH that helped me populate the fields of my existing database.

I have never found any limitations in Access that I could not work around. It is extremely powerful.

MJC922 10-20-2014 06:20 PM

Quote:

Originally Posted by classhandicapper
A lot of the information on their old races will be duplicate. I assume if you import it a second time it just dismisses it.

Removing duplicates will be the least of your problems. Excel even has a dedicated button for it in the ribbon. For example save an import spec for your file, record a macro, walk through running the import spec and then remove duplicates, click stop recording on the macro. The next day repeating the process is nothing more than running the macro. You will have fun -- sometimes :)

classhandicapper 10-20-2014 07:12 PM

Quote:

Originally Posted by MJC922
You will have fun -- sometimes :)

Somehow I doubt it. ;)

The alternative is a manual study. I've done that 2 ways.

1. I would create a folder for a specific race (ex. Graded Stakes, Routes, Grass, Older horses). I would include the PPs, result charts, and a handful of notes for every race in that category for a year or two. Then one day when I was really bored I'd review each race with pen and paper to see if I could decipher them.

2. Manually enter the specific data I want to study in a spreadsheet and when the sample was large enough, see what I could figure out. I just did that kind of thing for the recently concluded mule racing meets on the California Fair Circuit. That study was so enlightening I entered some of the data for the previous 3 years also.

MJC922 10-20-2014 07:20 PM

Quote:

Originally Posted by MJC922
Removing duplicates will be the least of your problems. Excel even has a dedicated button for it in the ribbon. For example save an import spec for your file, record a macro, walk through running the import spec and then remove duplicates, click stop recording on the macro. The next day repeating the process is nothing more than running the macro. You will have fun -- sometimes :)

My mistake an import spec is for Access but no worries, Excel's import process can be recorded in a macro along with the duplicate purge. In Access the duplicates should never make it through the import provided you have setup a unique index on the proper field or fields.

raybo 10-25-2014 04:42 AM

Quote:

Originally Posted by classhandicapper
I have a few questions for the people that have built their own database or are using Excel spreadsheets to keep data and use them as a way of learning and doing studies.

1. What is your source of data?

2. Which database do you use?

3. If you use Excel, how many worksheets does that involve.

1. JCapper/HDW data files
2. Excel
3. 1 workbook per track, 1 record keeping worksheet (with rankings method summary) per rankings method (11 total), with an additional record keeping sheet for variable rankings methods (customizable "what ifs"/experimentation) not already hard-wired into the auto-recording routine, 1 database worksheet, 1 summary worksheet containing all rankings methods' summaries from their own individual record keeping sheets.

DeltaLover 10-25-2014 07:31 AM

Raw Data

The main data source I am using is BRIS while I also have some data that I collect in real time by scrapping odds feeds; these data are stored in a dropbox following a well know directory structure and specific naming conventions that simplify data discovery and retrieval

VM

Instead of relying on physical boxes, I use Virtual Machines (vagrant), that are easy to build from scratch, in a matter of a few minutes

Data Bases

My primary data bases are mysql, sqlite and mongodb although I also use flat files, which in some cases are best for performance reasons. I use Open Office for research and presentation purposes, although I try to not write lower level code for it

Coding

In most of the cases I use python for coding. Depending on the needs there are some cases where I also use languages like C++, lisp and octave. All the source files are kept in version control (github)

DJofSD 10-25-2014 08:32 AM

DeltaLover, you mentioned Mongo. I just learned about it but not used it. When I look at the capabilities of the database, I don't immediately see where I could use it. What are you doing with it, what do you stuff into it that you can not use a traditional RDBMS to do?


All times are GMT -4. The time now is 03:42 AM.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Copyright 1999 - 2023 -- PaceAdvantage.Com -- All Rights Reserved

» Advertisement
» Current Polls
Wh deserves to be the favorite? (last 4 figures)
Powered by vBadvanced CMPS v3.2.3

All times are GMT -4. The time now is 03:42 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Copyright 1999 - 2023 -- PaceAdvantage.Com -- All Rights Reserved
We are a participant in the Amazon Services LLC Associates Program, an affiliate advertising program
designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.