View Full Version : Datamining Without Queries...Could This Apply To Horseracing?


Pace Cap'n
12-17-2011, 07:42 AM
"A new program can find and compare relationships in complicated data without having to be asked specific queries"

That is the tagline for a brief article describing this new program, which can be seen here...

http://www.theatlantic.com/technology/archive/2011/12/connecting-the-dots-finding-patterns-in-large-piles-of-numbers/250126/

The article does not contain much info; however, it does contain a link to an article in "Science" magazine, which requires registration.

Knowing little about such matters, I did not register but thought perhaps someone here might be interested. Would PP's be considered complicated data?

Robert Goren
12-17-2011, 07:57 AM
I believe that this is already being done. It's pretty hard for me to believe that there is a statistical technique that hasn't been applied to horse racing.

Pace Cap'n
12-17-2011, 08:02 AM
In the article it is described as a new software program. Are you aware of any racing applications that currently exist?

Robert Goren
12-17-2011, 08:42 AM
I went to the MINE web site to check out what the program actually does or doesn't do. I believe most if not all of what the program does has been available for quite some time for PCs and a bit longer for mainframes. About the only thing new in statistical analysis is the ability to look at larger and larger databases. Other than that, you have been limited only by your imagination in testing out theories for many years. I believe there are several programs for sale, geared to horse racing data, that will already do what MINE will do. The one thing that I can guarantee you is that there is not a multiple regression model out there that hasn't been tried, no matter how fancy you try to get with your data. I was running multiple regression models in the 1970s using logs and exponentials. I tried every possible thing I could think of, including such things as sines and cosines, on UNL's IBM mainframe with something called SPSS.

Robert Goren
12-17-2011, 08:56 AM
I do not own any of the programs so I am not sure exactly what they do, but I believe that Dave Schwartz and Jcapper have query programs. I am sure there are others too.

JBmadera
12-17-2011, 09:03 AM
I think this is kind of interesting. The net seems to be that, instead of using various parameters to uncover statistically significant relationships, this program discovers significant relationships without first defining the parameters, somehow uncovering the "I didn't think to look for those" relationships.
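To make the "no queries" idea concrete, here is a minimal sketch that scores every pair of columns in a table and ranks the pairs by strength of association, so nobody has to ask about a specific pair up front. It is only an illustration: the score used is a simple equal-width binned mutual information, which is a stand-in and not the statistic MINE itself computes, and the function names and the `bins` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def binned_mutual_information(xs, ys, bins=4):
    """Mutual information (in bits) of xs and ys after equal-width binning.

    A stand-in dependence score for illustration; NOT the MINE statistic.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant column: everything in bin 0
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        p = count / n
        # p(i,j) / (p(i) * p(j)) written with raw counts
        mi += p * math.log2(p * n * n / (px[i] * py[j]))
    return mi


def rank_pairs(table, bins=4):
    """Score every pair of columns; return (score, col_a, col_b), best first."""
    scores = [
        (binned_mutual_information(table[a], table[b], bins), a, b)
        for a, b in combinations(sorted(table), 2)
    ]
    return sorted(scores, reverse=True)
```

Calling `rank_pairs` on a dict that maps column names to equal-length lists of numbers returns every pair, strongest first. With many columns the number of pairs grows quickly, so some high scores will appear by chance and need checking on fresh data.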

Pace Cap'n
12-17-2011, 10:13 AM
I do not own any of the programs so I am not sure exactly what they do, but I believe that Dave Schwartz and Jcapper have query programs. I am sure there are others too.

But the thread title, and the article, specify "without queries".

I am well aware of the various database-type programs presently available, and while I don't use them I have a general understanding of how they work.

What I was wondering was "Could this MINE program (sans queries) have any practical handicapping applications?"

Warren Henry
12-17-2011, 01:18 PM
I just went to the MINE website and it looks to me like the program is available for download. Anyone with a lot of free time want to take a shot at this?

Personally, I need to master the things I THINK I understand first. :bang:

PaceAdvantage
12-17-2011, 08:00 PM
Yes, the program is available for free it appears...I'm sure some adventurous soul out there will take a crack at it...please report back with your findings if you do...

pondman
12-19-2011, 11:12 AM
It's similar to BrainMaker. Eventually, one of the steps will be to quantify all variables and normalize the data, usually to a range between 0 and 1. This will always give you a problem in horse racing, which IMO requires a check-off and/or green light, red light algorithm.
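For readers unfamiliar with the normalization step mentioned above, a minimal sketch of min-max scaling to the 0 to 1 range follows. The function name and the choice to map a constant column to 0.0 are assumptions made for illustration, not something taken from BrainMaker or MINE.

```python
def min_max_normalize(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

This works directly for numeric factors such as speed figures. Yes/no factors of the check-off kind have to be coded as numbers first (for example 0 and 1), which is where the difficulty described above comes in.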

raybo
12-20-2011, 10:44 PM
Something similar was done by an AllData member. He did not "query" per se. He wrote code that starts with a single factor, steps through all combinations of all the individual databased factors, and arrives at the best hit rate/ROI combination. That new "composite" factor then starts the process all over again, running through all the other factors and arriving at the best hit rate/ROI combination; this becomes the next "composite" factor, and the process starts again.

Although it is probably not as sophisticated as the program mentioned, I believe it operates basically the same way, discovering viable combinations that one would not normally think of, or would at least take much time to discover.
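The stepwise search described above can be sketched roughly as a greedy loop. This is an interpretation under stated assumptions, not the AllData member's actual code: each factor is modeled as a hypothetical yes/no test on a bet, a "composite" is the set of tests chosen so far, bets are flat 1-unit wagers with a `payout` field, and ROI alone is used as the score.

```python
def roi(bets):
    """ROI for flat 1-unit bets; each bet is a dict with a 'payout' entry."""
    if not bets:
        return float("-inf")
    return (sum(b["payout"] for b in bets) - len(bets)) / len(bets)


def greedy_composite(bets, factors, min_bets=1):
    """Grow a composite factor one test at a time while ROI keeps improving.

    factors: dict of name -> function(bet) -> bool (hypothetical factor tests).
    Returns (names chosen in order, ROI of the bets passing all chosen tests).
    """
    chosen = []
    current = list(bets)
    best = roi(current)
    remaining = dict(factors)
    while remaining:
        candidates = []
        for name, test in remaining.items():
            subset = [b for b in current if test(b)]
            if len(subset) >= min_bets:  # skip combinations with too few bets
                candidates.append((roi(subset), name, subset))
        if not candidates:
            break
        score, name, subset = max(candidates, key=lambda c: c[0])
        if score <= best:
            break  # no remaining factor improves the composite
        chosen.append(name)
        current, best = subset, score
        del remaining[name]
    return chosen, best
```

The `min_bets` guard is there because a search like this will happily find a "winning" combination that only a handful of past races satisfy. Anything it turns up would need to be checked against races that were not part of the search.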