Looking for Database Solution


Frankie D
03-12-2017, 09:30 PM
I am going to try to develop a database (or spreadsheet) solution to read and process data. Handicapping 3 or 4 cards is becoming tedious beyond fun, not to mention mistake-ridden.

A few questions and a request for suggestions. I am somewhat computer literate and usually learn software well. This should be pretty challenging, I suppose.

Does the OpenOffice "Base" work? Any other free solutions that work well?

Does Excel have the power? I basically want to:
- run analysis of races and cards
- print out bet sheets with the essential factors I use
- do performance research, optimization, and weighting of factors
- keep records, etc.
Any tips or suggestions?

TIA

AltonKelsey
03-12-2017, 10:23 PM
You can do just about anything in Excel, especially if you can master VBA, but I'd suggest looking at Access or some other database solution instead.

There already exists commercial software that does a lot of this. Anything you do from scratch will take hundreds of hours or more.

Frankie D
03-12-2017, 10:50 PM
Thanks Alton.

What are some of the commercial solutions and how powerful are they?

JustRalph
03-13-2017, 02:00 AM
The best database for horse racing is the JCapper app. There's a learning curve, but I suggest you look at it.

Jcapper.com

classhandicapper
03-13-2017, 12:46 PM
I use Access and Excel, but most of it is done via Access.

I have a database of PPs and result charts data for the major tracks and miscellaneous cards elsewhere. I use that data to generate and test various personal metrics. When I feel like I have something value oriented and useful, I create a customized report.

When I start the day I run all my reports. That helps guide me towards races and horses that show the most potential based on my metrics. That way I don't waste too much time with manual analysis of things my computer can do quickly and in many cases better. Then I dig in more carefully on those races.
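
Something along these lines would do the job (a generic sketch only, not the actual Access setup described here; the database file, table, and column names are made up for illustration):

import sqlite3

# Generic sketch only: the database file, the starters table, and the
# my_speed_fig / my_pace_fig columns are hypothetical stand-ins.
conn = sqlite3.connect("handicapping.db")

query = """
    SELECT track, race, horse, my_speed_fig, my_pace_fig
    FROM starters
    WHERE race_date = ?
      AND my_speed_fig >= ?       -- flag only horses above the figure threshold
    ORDER BY track, race, my_speed_fig DESC
"""

for track, race, horse, speed, pace in conn.execute(query, ("2017-03-13", 95)):
    print(f"{track} R{race}: {horse}  speed={speed}  pace={pace}")

conn.close()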

I keep refining and adding to the reports.

It takes a while to build it all and learn how to create all the queries, but once you are up and running you can't believe you ever managed without it.

There's a good thread somewhere where a lot of people helped me get started. If I can find it, I'll post it here.

edit: http://www.paceadvantage.com/forum/showthread.php?t=117713&highlight=databases+excel

DeltaLover
03-13-2017, 01:02 PM
I am going to try to develop a database (or spreadsheet) solution to read and process data. Handicapping 3 or 4 cards is becoming tedious beyond fun, not to mention mistake-ridden.

A few questions and a request for suggestions. I am somewhat computer literate and usually learn software well. This should be pretty challenging, I suppose.

Does the OpenOffice "Base" work? Any other free solutions that work well?

Does Excel have the power? I basically want to:
- run analysis of races and cards
- print out bet sheets with the essential factors I use
- do performance research, optimization, and weighting of factors
- keep records, etc.
Any tips or suggestions?

TIA

Your best approach would be to use a relational database to store your data; the best choice is MySQL, which is open source and has very extensive documentation, third-party tools, and code samples to choose from. On top of it you will need a higher level of processing, which you can implement in any programming language; Python is probably your best bet for this. Using a spreadsheet like OpenOffice will also be helpful for your work, although I do not suggest it as a storage medium.
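
To give a feel for what that looks like (just a sketch: the table layout and column names are invented for illustration, and it assumes the mysql-connector-python package is installed):

import mysql.connector  # assumes the mysql-connector-python package is installed

# Illustrative schema only; table and column names are invented for this sketch.
conn = mysql.connector.connect(host="localhost", user="handicapper",
                               password="secret", database="racing")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS starters (
        race_date DATE,
        track     VARCHAR(3),
        race      INT,
        horse     VARCHAR(30),
        speed_fig INT,
        finish    INT
    )
""")

# Store one row of parsed past-performance / chart data.
cur.execute("INSERT INTO starters VALUES (%s, %s, %s, %s, %s, %s)",
            ("2017-03-13", "AQU", 5, "Some Horse", 92, 1))
conn.commit()

# Higher-level processing in Python: win rate for horses above a figure threshold.
cur.execute("SELECT AVG(finish = 1) FROM starters WHERE speed_fig >= %s", (90,))
print("Win rate for 90+ figures:", cur.fetchone()[0])

conn.close()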

lamboy
03-13-2017, 01:47 PM
Any open source database will do the job. The reporting and analytics aspect is what you are seeking. My humble suggestion is to look at Pentaho. CTools has long been integrated, and you can export back out to Excel if needed.

classhandicapper
03-13-2017, 02:05 PM
CTools has long been integrated, and you can export back out to Excel if needed.

This is one of the things I like about Access.

It's extremely easy to export any database info (including my personal metrics) into an Excel spreadsheet when Excel is better equipped to do calculations, etc., and it's just as easy to take a spreadsheet of data and create a new database table.

lamboy
03-13-2017, 02:39 PM
I appreciate the longevity and usefulness of MS Office; however, it lacks business intelligence features and a decent graphics generator. If the OP is just starting out, he's not tied to any legacy product. Why not pick something with useful BI and powerful graphics? IMHO, Python and R are better choices.

AltonKelsey
03-13-2017, 03:30 PM
I'd say Office is much easier to use, especially for the non-programmer.

What kind of graphics would he need that Office can't produce?

lamboy
03-13-2017, 04:07 PM
Dashboard designs are endless; MS Office is notoriously weak for this. ETL different data sources and feed them to predictive analytics like Weka? MS Office no -- Pentaho yes. If one is starting anew, choose something free, modern, and extensible. There is a plethora of better open source options versus paid ones, and many drag-and-drop GUIs as well for the programming-challenged. Free courses on edX, Coursera, etc. will get you over the learning curve if you are really motivated. Hey, it's your hard-earned dollars that are being wagered, so why not try to develop your own program/tool to give yourself an edge over the masses?

AltonKelsey
03-13-2017, 04:18 PM
Since I already paid my dues and can pretty much make Excel jump through any number of hoops, maybe I'm jaded.

I wouldn't discourage anyone from trying the newer platforms.

classhandicapper
03-13-2017, 04:41 PM
Since I already paid my dues and can pretty much make Excel jump through any number of hoops, maybe I'm jaded.

I wouldn't discourage anyone from trying the newer platforms.

Occasionally I run into programming problems that would clearly be a lot easier with something other than Access or a combination of Access/Excel. However, I think for the vast majority of people without a data processing background, Office will more than suffice, because they aren't going to take it to a stage much past information retrieval and calculations that can be done within those products very easily. Some of my stuff is getting way more sophisticated than I even imagined when people generously got me started, but I can knock out the queries very quickly.

lamboy
03-13-2017, 05:25 PM
The Access report generator is very crude. If you ever used Crystal Reports, I suggest you give Pentaho a try. It's very similar but comes with a lot more features, all for free. Even if you're using VBA to do most of your work, you can leverage the predictive analytics.

HalvOnHorseracing
03-13-2017, 05:46 PM
I saw a presentation on a product called Bet Mix. I wasn't really taken with it. Has anyone had experience using it?

NorCalGreg
03-13-2017, 06:11 PM
I saw a presentation on a product called Bet Mix. I wasn't really taken with it. Has anyone had experience using it?

I was a Bet-Mix user for quite a while. It's set up so the user can immediately get into the database after a five-minute block of instructions.
Very easy, and fun! I would manipulate stats for hours...and I have no programming knowledge. The other thing is there is a monthly fee, which includes data files. (When I had it, it was $50 a month; it has probably gone up a little since.)

I'm in direct competition with owner Dave Williams and Bet-Mix...but I have nothing but good things to say about both. I lacked the discipline to make a profit with the program, eventually creating my own software. I'm sure some folks are using Bet-Mix profitably. :)

Cratos
03-13-2017, 06:25 PM
I'd say Office is much easier to use, especially for the non-programmer.

What kind of graphics would he need that Office can't produce?

Try MathCad graphics and math applications

classhandicapper
03-14-2017, 10:14 AM
The Access report generator is very crude. If you ever used Crystal Reports, I suggest you give Pentaho a try. It's very similar but comes with a lot more features, all for free. Even if you're using VBA to do most of your work, you can leverage the predictive analytics.

I agree. The report generator is pretty crude. It doesn't bother me too much. I'm not creating a product for sale. I just need the information in a neat enough format so that it's easy for me to read and flashes the especially relevant numbers at me. I've been able to accomplish that and I'm not even very good at using it. Maybe down the line I'll want to step it up a notch.

Frankie D
03-14-2017, 12:17 PM
Thanks all. So it appears the choices generally are:

1. Jcapper questions

Pros
- relatively user friendly
- economical (for "Silver")
- canned ready to go. Can create custom factors. Can create UDM's (really like this but see cons)

Cons
- "canned ready to go" may mean more difficult to create custom factors or custom UDMs outside of the canned ones. Can custom, complex factors be created as easily as others? Is it possible at all? Can custom factors be assembled from component factors
- Stuck with buying the one or two data sources, and can't transport them if you wanted to move up later?
- Annual service /upgrade fee structure (thanks a lot, oracle/Microsoft!)

From first developmentEstimated time frame for everyday utility?


2. Access/Excel questions

Pros
- reasonably user friendly, ubiquitous user base, lots of information available
- easy import of data (?)
- no service fees
- potentially powerful with creation of formulas though this may be awkward in application compared to others

Cons
- $400! No future upgrades included.
- The "maybe awkward" creation of custom factors. The other referenced thread seems to describe this from time to time.
- Some complaints that reports can be crude. Beautiful output would be nice, but like classhandicapper, it's not such a concern to me if it readily conveys the info I am looking for.

Estimated time frame for everyday utility?


3. Purpose-built open source database program questions

Pros

- price is right
- powerful if you get down the learning curve
- handles data well? Does it?

Cons

- Substantial learning curve to be effective user
- MySQL seems to be a good one. Others mentioned: Pentaho. Will learning a programming language be required to use these, or only to handle queries and reporting? Do these languages use function development similarly to the way C++ does? Is this painfully difficult?
- Which version of MySQL should be used? Enterprise or community? Any important considerations on this decision?

Estimated time frame for everyday utility?

Jeff P
03-14-2017, 04:45 PM
I think no matter what direction you decide to take - you'll want to obtain chart data (at a minimum) - and (ideally) race data too from one of the data vendors on an ongoing basis.

HDW is my preferred data vendor but the procedure described below can be done (to some degree) using alternate data sources - and also (again to some degree) using a few of the other commercial software offerings - or even a program you've written yourself.

By the way, the "program you've written yourself" pathway is how I got started.

I download race files and chart data from HDW on a daily basis.

I update my JCapper databases with fresh race data and fresh chart data on a daily basis too -- not because I have to -- but because I LIKE the idea of always having current data to work with when I decide to model (or tune a model) for a track-surface-distance I'm currently following (or may want to follow.)

I spend maybe 3-5 minutes a day managing that (daily grind) process.

By the way, the JCapper module that performs that (daily grind) file download process is now programmed to recognize keywords embedded on a command line - meaning that it can be run (fully) by a script (for instance WGet, etc.) -- but I still do part of it manually each day.

I now also offer a cloud based solution if you want charts only data delivered to an ftp folder. (This can be fully automated as well.)
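
For example, a bare-bones daily pull from an FTP folder could be scripted like this (a hypothetical sketch only; the host name, credentials, folder, and file-naming pattern are made up, not HDW's or JCapper's actual setup):

from ftplib import FTP
from datetime import date

# Hypothetical sketch of a daily chart-file download; host, credentials,
# folder, and file-naming pattern are all invented here.
HOST, USER, PASSWORD = "ftp.example-data-vendor.com", "myuser", "mypassword"
REMOTE_DIR = "/charts"
filename = f"charts_{date.today():%Y%m%d}.zip"

with FTP(HOST) as ftp:
    ftp.login(USER, PASSWORD)
    ftp.cwd(REMOTE_DIR)
    with open(filename, "wb") as f:
        ftp.retrbinary(f"RETR {filename}", f.write)

print("Downloaded", filename)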

Jcapper questions

Can custom, complex factors be created as easily as others? Is it possible at all? Can custom factors be assembled from component factors?

The short answer is yes.

The longer answer is: If you want to separate yourself from the other players roi-wise - and I'm not talking about other players using JCapper - but players using ANY methodology:

I think you have to do something different approach-wise than what most of the other money in the pools is trying to do.

I think the most effective use of JCapper (and it's the way that I use it) is to make heavy use of an app that ships with JCapper called the JCX File Exports Module.

I regularly find myself using this app to export selected rows and columns out of a JCapper StarterHistory table to .csv files on my hard drive.

The exports are driven (completely) by user defined sql expressions. (These can of course be saved for future use at the click of a button. And then pulled up later and quickly edited and resaved, etc.)

The module is programmed to interpret the column names present in the sql expression the user is using to drive the export.

For example, if you reference specific column names (SELECT [DATE], [TRACK], [RACE], [DIST] FROM STARTERHISTORY WHERE... ) the resulting .csv file created by your export ends up containing the referenced columns only.

And when you use the * or wildcard character instead of individual column names, for example (SELECT * FROM STARTERHISTORY WHERE... ) the resulting .csv file created by your export ends up containing every column in the table.

And of course everything that comes after the WHERE keyword, for example (WHERE TRACK = 'AQU' AND [DATE] >= #01-01-2017#... ) controls the rows that make it into your .csv file.

The module is also programmed to perform a handful of canned transformations on the exported data.

The general idea is that your exported .csv files end up containing precisely the data you want them to contain.


From there, it's not terribly hard to get data from a .csv file into a third party stat package.

And for me that's where the real work begins.

I'm currently partial to the mlogit package in R (where I've been creating some really interesting track-specific logistic regression models).

But as long as your third party stat tool can handle data from .csv files - you can do the same thing/create your own models etc. using the stat package of your choice.
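
As a rough Python analogue (just a sketch with made-up column names; it assumes pandas and scikit-learn are installed, and it is not the mlogit setup described above), loading an exported .csv and fitting a simple logistic regression looks like this:

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Sketch only: the column names (speed_fig, pace_fig, post, won) are
# hypothetical stand-ins for whatever columns the export actually contains.
df = pd.read_csv("aqu_export.csv")

X = df[["speed_fig", "pace_fig", "post"]]
y = df["won"]                       # 1 if the starter won, else 0

model = LogisticRegression()
model.fit(X, y)

# Estimated win probabilities for the same starters.
df["win_prob"] = model.predict_proba(X)[:, 1]
print(df[["speed_fig", "pace_fig", "post", "win_prob"]].head())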

Hope I managed to explain most of that in a way that makes sense,



-jp

sjk
03-15-2017, 03:11 AM
I have used Access with good results for many years. The concern about the $400 brings to mind when I purchased Access back in 1993.

I wanted to buy a database and there seemed to be three options: Access, Paradox and Approach. The MS products were on sale so I bought them. I landed on the right answer for the wrong reason. Who knows where I would be today if Approach (ever hear of it?) was the one on sale.

As Jeff points out, you are going to need to spend $100/month or more on getting data into your database, and unless you commit time to it every day (as I have done since 1993) and catch up after travels, you will not get the full benefit of your work.

Access has worked well for me, and the $400 is low on the list of commitments you will need to make.

Jcapper does a fine job and that would be an excellent choice as well.

Frankie D
03-15-2017, 09:26 AM
Sage advice on the $400 thing. Thanks.

I think what I have is absolutely worth it. I do intend to pay full attention to this every day, whatever it takes.

I finally have a job that will more or less allow me to work a few more years until I retire and at the same time spend time and effort on making money doing what I love. In a few years I can retire to full time racing handicapping.

I started with a minimal bankroll with that goal. I always believed that if a system is good enough it will always grow itself into an income stream without zeroing out.

I spent a month reviewing and working it manually on three tracks. This was for polishing my rules, checking its potential (for meeting my retirement goals above), and looking into computerizing as much of the data processing as possible. Over-optimistically maybe, I even thought it could all be computerized. Maybe not.

As an aside, I am also working on money management based on performance numbers, and on bet size, which seems to be working out to 1/3 to 1/5 of the Kelly criterion. This is a number that I can refine.
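
For reference, the arithmetic behind a fractional Kelly stake looks like this (an illustration only; the odds, win probability, and bankroll are made up, not actual performance numbers):

# Fractional Kelly stake; illustration only, with made-up numbers.
def kelly_fraction(decimal_odds, win_prob):
    """Full Kelly fraction: f* = (b*p - q) / b, where b is the net odds."""
    b = decimal_odds - 1.0          # net odds received on a win
    q = 1.0 - win_prob
    return (b * win_prob - q) / b

bankroll = 1000.0
odds = 4.0                          # 3-1 expressed as decimal odds
p_win = 0.30                        # estimated win probability
full_kelly = kelly_fraction(odds, p_win)

for divisor in (3, 5):              # the 1/3 to 1/5 Kelly range mentioned above
    stake = bankroll * full_kelly / divisor
    print(f"1/{divisor} Kelly stake: ${stake:.2f}")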

Computerization will help eliminate mistakes and provide a platform for real-time adjustment of the rules for which races optimize return. Also, and I realize this from the past, I need to develop concrete resistance to any "action" betting. I used to do this when I was active 30 years ago, and it cropped up again when I started exploring. It must be eliminated, and I think I have eliminated this tendency.

I love the races: the thunder of 6 to 12 thoroughbreds rumbling past, the breeze from it, and so on. But I intend to do all the work before I even watch the races. It is a system I have to have. I'm going to enjoy the races after the work is done.

sjk
03-15-2017, 12:03 PM
Best wishes. Sounds like you have made a good start.

I have been fully computerized for many years. Not an unreasonable objective at all.