Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

04-09-2024, 05:03 PM   #46
equusvates
Registered User
 
Join Date: May 2021
Posts: 48
XML data

Sorry if I appear to be straying from the subject. As others have stated, the files are intended for machine learning, and they will be a big help to my learning algorithms. Previously I used Post Time Data files, of which I have about 40K races. Last year I switched to Brisnet’s files to get more information and less trainer / jockey confusion. I brought up Python and C++ with the security of your derived program in mind, but also because C++ is about 7 times faster than Python, and machine learning is very CPU intensive. I know Python has libraries that can make use of GPU accelerators, but my learning technique is not able to take advantage of that capability.
04-09-2024, 05:17 PM   #47
MJC922
Registered User
 
Join Date: Nov 2012
Posts: 1,545
Getting the data into a position where it can actually be researched has always been a very arduous endeavor IMO. The few people willing to learn what was necessary have (for the most part) already been here and done that years ago. What this offering does do, though, is take away the initial cost of the data. Some people will build their own software to use these files, or maybe build commercial software and offer it for sale, etc. It's a win-win IMO.

What it doesn't do for the masses (cue up that Jetsons intro) is allow someone to, for example, download a giant CSV, a fully populated Access DB, or a fully populated Excel spreadsheet and hit the ground running, querying the data 10 minutes later. I don't know if there's anyone out there willing to donate their copious free time to the enrichment of those who will find that first step to be an insurmountable hurdle.

Adding on to that thought, if someone does want to offer the data in some other (say, easier to research) aggregated format, then I imagine they would certainly have to get permission from Equibase.
__________________
North American Class Rankings

Last edited by MJC922; 04-09-2024 at 05:32 PM.
04-09-2024, 06:46 PM   #48
equusvates
Registered User
 
Join Date: May 2021
Posts: 48
XML data

Since I am doing it anyway, with Equibase approval I will create a temporary link to my Dropbox and/or Google Drive. Currently, I only work on the files at my leisure. The format will be CSV with one line per race, which Excel or any database can handle. The files will be huge in raw form and thus will be zipped. I have completed the race data portion of the files and am about 60% complete with the entries portion. Next are the past performances, which are the largest single item. Then the workouts, results and payouts will be processed. Admittedly, the parsing technique I am using is tedious. A different technique is faster to write code for but is slower in execution. I have yet to strictly determine the format. I am leaning towards tagging the start of each entry, pp, workout…
Anyone with suggestions is welcome to offer input. Only one format will be done.
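
For anyone trying to picture the one-line-per-race idea with tagged entries, here is a rough Python sketch. The XML element names (Card, Race, Entry, Horse, Post, Distance) and the #ENTRY marker are made up for illustration; they are not the Equibase schema, and this is not necessarily how the final files will be laid out.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real element names will differ.
SAMPLE_XML = """
<Card>
  <Race number="1" track="AQU" date="2024-04-09">
    <Distance>6.0</Distance>
    <Entry><Horse>Alpha</Horse><Post>1</Post></Entry>
    <Entry><Horse>Bravo</Horse><Post>2</Post></Entry>
  </Race>
  <Race number="2" track="AQU" date="2024-04-09">
    <Distance>8.0</Distance>
    <Entry><Horse>Charlie</Horse><Post>1</Post></Entry>
  </Race>
</Card>
"""

ENTRY_TAG = "#ENTRY"  # assumed marker that starts each entry block on the line


def race_to_row(race):
    """Flatten one <Race> element into a single list of CSV fields."""
    row = [race.get("track"), race.get("date"), race.get("number"),
           race.findtext("Distance", default="")]
    for entry in race.findall("Entry"):
        row.append(ENTRY_TAG)  # tag the start of each entry
        row.append(entry.findtext("Horse", default=""))
        row.append(entry.findtext("Post", default=""))
    return row


def card_to_csv(xml_text):
    """Return CSV text with one line per race."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for race in root.findall("Race"):
        writer.writerow(race_to_row(race))
    return out.getvalue()


if __name__ == "__main__":
    print(card_to_csv(SAMPLE_XML), end="")
```

A reader can split each line on the #ENTRY marker to recover the entries; past performances and workouts could get their own markers in the same way.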

Horse racing needs more participants, and a program that can entice more people is a good thing. I do not agree that data should be given out freely. There is a cost that must be borne somewhere. I will freely give out my program to use the Equibase data or Brisnet. Of course, the A.I. portions will not be included. In other sections of this forum I will start posting the program's projections of a race a day in advance. No complete cards. Figure out your own Pick 6.

One last thing. The intelligent part of the program will “never” be for sale. The free stuff can actually be computed by anyone with programming experience, or even by hand if you want to spend all night with pen, paper and calculator. Why not sell? Legal reasons, and why would I want to compete against myself? In any case, you will be surprised what the free version can do.

Last edited by equusvates; 04-09-2024 at 07:00 PM.
04-09-2024, 09:08 PM   #49
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,922
Quote:
Originally Posted by equusvates View Post
One last thing. The intelligent part of the program will “never” be for sale. The free stuff can actually be computed by anyone with programming experience, or even by hand if you want to spend all night with pen, paper and calculator. Why not sell? Legal reasons, and why would I want to compete against myself? In any case, you will be surprised what the free version can do.
I like your style.


Reach out if you want some conversation on your AI approach. I have some degree of experience writing AI engines.
04-09-2024, 10:26 PM   #50
equusvates
Registered User
 
Join Date: May 2021
Posts: 48
XML data

Quote:
Originally Posted by Dave Schwartz View Post
I like your style.


Reach out if you want some conversation on your AI approach. I have some degree of experience writing AI engines.
Dave, I’m sure you don’t remember me, but I had a conversation with you years ago about the two bell curves I saw when viewing MSW performances in the early part of the year and the later part of the year. You clearly explained it: better horses at the beginning of the year and less capable horses later. I used your pars for years and have now created my own pace numbers, which are the bedrock of my program. Your short video on early / late is spot on.

I already submitted a request to Equibase for giving out post-processed data.

As for my A.I. technique, I use likelihood ratios, Monte Carlo simulation, the CLIPS rules engine and linear genetic programming. I think neural networks are ultimately the best technique, but the data requirements are enormous and the CPU power needed, as stated before, is huge. Without heavy preprocessing, neural networks are doomed. I mean raw data, even after normalization, requires millions of races to minimize the inherent volatility in the data.
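
To give a flavor of the Monte Carlo piece only, here is a minimal Python sketch, not the actual engine. It assumes each horse has a hypothetical speed figure with a mean and a standard deviation, draws one performance per horse per trial, and counts how often each horse comes out on top. The function name, the normal distribution and the numbers are all assumptions for illustration.

```python
import random


def simulate_win_probs(ratings, n_trials=20000, seed=42):
    """Estimate win probabilities by repeatedly sampling each horse's performance.

    ratings maps horse name -> (mean, std_dev) of a hypothetical speed figure.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = {name: 0 for name in ratings}
    for _ in range(n_trials):
        # Draw one performance per horse; the highest figure wins this trial.
        draws = {name: rng.gauss(mu, sd) for name, (mu, sd) in ratings.items()}
        winner = max(draws, key=draws.get)
        wins[winner] += 1
    return {name: count / n_trials for name, count in wins.items()}


if __name__ == "__main__":
    field = {"A": (90, 5), "B": (85, 5), "C": (80, 5)}  # made-up figures
    for name, prob in simulate_win_probs(field).items():
        print(f"{name}: {prob:.3f}")
```

The estimated probabilities can then be compared against the tote board odds; more trials reduce the sampling noise at the cost of CPU time.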

Thanks again for your past help.

Last edited by equusvates; 04-09-2024 at 10:28 PM. Reason: More info
04-09-2024, 10:54 PM   #51
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,922
Quote:
Originally Posted by equusvates View Post
Dave, I’m sure you don’t remember me, but I had a conversation with you years ago about the two bell curves I saw when viewing MSW performances in the early part of the year and the later part of the year. You clearly explained it: better horses at the beginning of the year and less capable horses later. I used your pars for years and have now created my own pace numbers, which are the bedrock of my program. Your short video on early / late is spot on.

I already submitted a request to Equibase for giving out post-processed data.

As for my A.I. technique, I use likelihood ratios, Monte Carlo simulation, the CLIPS rules engine and linear genetic programming. I think neural networks are ultimately the best technique, but the data requirements are enormous and the CPU power needed, as stated before, is huge. Without heavy preprocessing, neural networks are doomed. I mean raw data, even after normalization, requires millions of races to minimize the inherent volatility in the data.

Thanks again for your past help.
I recall the conversation.
Reach out and I'll help if I can.



Meanwhile, about 8 months ago I finished an AI system that could best be described as a DEEP LEARNING, CONVOLUTIONAL GENETIC ALGORITHM, with TOPOLOGICAL DATA ANALYSIS.

Just days away from first demonstrations.
This weekend, I hope.

Note that it is all pre-trained - comes with no database and does not need one. The user has all that he needs on day one.

Although it could be used as a black box, that was not my intention.

Instead, it will allow the player to expand his current information loop. You know... that part about how we all use the same data?

I guarantee the whales won't have this stuff, and if they did, it wouldn't fit their approach.

It is highly situational.

Whales are expert at handicapping HORSES. The deTerminator handicaps RACES.

That's a world of difference.
04-10-2024, 10:41 AM   #52
vegasone
Registered User
 
Join Date: Aug 2007
Posts: 531
Admittedly, the parsing technique I am using is tedious. A different technique is faster to write code for but is slower in execution. I have yet to strictly determine the format. I am leaning towards tagging the start of each entry, pp, workout…




You may want to check out the single and multiple file formats available for past performance data from Bris. That may give you some ideas.
04-10-2024, 08:39 PM   #53
equusvates
Registered User
 
Join Date: May 2021
Posts: 48
Quote:
Originally Posted by vegasone View Post
Admittedly, the parsing technique I am using is tedious. A different technique is faster to write code for but is slower in execution. I have yet to strictly determine the format. I am leaning towards tagging the start of each entry, pp, workout…




You may want to check out the single and multiple file formats available for past performance data from Bris. That may give you some ideas.

I currently use and process Brisnet files, which are zipped CSV files with the data partitioned into separate files. Because the files are partitioned by data type, no start-of-field tags are required. If single lines are desired, field tagging will be required. It does not really matter to me which type people want, as I take the data and save it in a Microsoft SQL Server database. If the database is properly designed, giving the server the date, track and race number makes it trivial to fill a dataframe or a C/C++ structure. With no input, I'll place the data in Brisnet-type files. The thing about Brisnet files is that you must keep track of the mapping of the data. Brisnet fully documents the mapping.
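
To illustrate what keeping track of the mapping means in practice, here is a small Python sketch. The column positions and field names below are invented for the example and are not the documented Brisnet layout; the helper names are hypothetical too.

```python
import csv
import io

# Hypothetical column positions; consult the vendor's documentation for the real map.
FIELD_MAP = {0: "track", 1: "date", 2: "race_number", 3: "horse_name", 4: "post_position"}

# Headerless sample data standing in for one partitioned file.
SAMPLE = "AQU,20240409,1,Alpha,1\nAQU,20240409,1,Bravo,2\nAQU,20240409,2,Charlie,1\n"


def load_rows(text, field_map=FIELD_MAP):
    """Turn headerless CSV text into a list of dicts keyed by field name."""
    rows = []
    for raw in csv.reader(io.StringIO(text)):
        rows.append({name: raw[idx] for idx, name in field_map.items()})
    return rows


def select_race(rows, track, date, race_number):
    """Return the rows for one race, keyed the way a database query would be."""
    key = (track, date, str(race_number))
    return [r for r in rows if (r["track"], r["date"], r["race_number"]) == key]


if __name__ == "__main__":
    rows = load_rows(SAMPLE)
    for row in select_race(rows, "AQU", "20240409", 1):
        print(row["post_position"], row["horse_name"])
```

Keeping the map in one place means a layout change only has to be fixed once, and the same date / track / race number key works whether the rows live in memory or in a database table.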

Last edited by equusvates; 04-10-2024 at 08:41 PM.
04-11-2024, 01:15 AM   #54
vegasone
Registered User
 
Join Date: Aug 2007
Posts: 531
Been using BRIS stuff since the 80's in one form or another. You should see some of the early charts
04-11-2024, 05:34 PM   #55
MJC922
Registered User
 
Join Date: Nov 2012
Posts: 1,545
Quote:
Originally Posted by vegasone View Post
Been using BRIS stuff since the 80's in one form or another. You should see some of the early charts
My stuff is all set up to use the CSV-formatted charts. Sadly, a large portion of the chart processor coding is line after mind-numbing line of find and replace for typos (think of a single random character missing in the middle of a string inside the full race conditions). The XML version of the charts came out later, and I always wondered whether it suffered from the same issue. With my luck, no. Of course, by that time I was so many years into the project that I pushed ahead with CSV.
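
For anyone who has not seen this kind of cleanup, the find-and-replace approach can be sketched in Python as a table of known bad strings and their fixes. The typos below are invented examples, not ones taken from actual chart files, and the function name is hypothetical.

```python
# Hypothetical examples of garbled strings and their corrections.
CORRECTIONS = {
    "MAIDN SPECIAL WEIGHT": "MAIDEN SPECIAL WEIGHT",
    "THREE YEAR OLDS AND UPWRD": "THREE YEAR OLDS AND UPWARD",
}


def clean_conditions(text, corrections=CORRECTIONS):
    """Apply every known correction and report which ones fired."""
    fixed = []
    for bad, good in corrections.items():
        if bad in text:
            text = text.replace(bad, good)
            fixed.append(bad)
    return text, fixed


if __name__ == "__main__":
    raw = "MAIDN SPECIAL WEIGHT. FOR THREE YEAR OLDS AND UPWRD."
    cleaned, fixed = clean_conditions(raw)
    print(cleaned)
    print("corrections applied:", len(fixed))
```

Logging which corrections fire makes it easier to spot new typos that the table does not cover yet.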
__________________
North American Class Rankings

Last edited by MJC922; 04-11-2024 at 05:38 PM.
04-11-2024, 09:38 PM   #56
vegasone
Registered User
 
Join Date: Aug 2007
Posts: 531
Quote:
Originally Posted by MJC922 View Post
My stuff is all set up to use the CSV-formatted charts. Sadly, a large portion of the chart processor coding is line after mind-numbing line of find and replace for typos (think of a single random character missing in the middle of a string inside the full race conditions). The XML version of the charts came out later, and I always wondered whether it suffered from the same issue. With my luck, no. Of course, by that time I was so many years into the project that I pushed ahead with CSV.



Probably wouldn't matter, since the same error is going to show up in every version of the file. Lots and lots and lots of error checking. The same issue with results shows up in the charts and results files, etc.
04-12-2024, 06:48 PM   #57
equusvates
Registered User
 
Join Date: May 2021
Posts: 48
I received the response from Equibase to my request to post processed XML data files. The request was denied, and I do understand; it is explicitly stated in the license agreement. Equibase stated that giving guidance to others on how the post-processing can be done is okay. If you are a programmer, even without XML knowledge, you should have no problem learning to process these files. I think trying to import these files into Excel or other spreadsheet programs is problematic because of the number of child nodes of child nodes.

If you are not a programmer, you will need a good programming foundation before taking on this conversion project.
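
As a small piece of guidance on the child-nodes-of-child-nodes problem, here is a minimal Python sketch that walks a nested XML tree and turns every leaf into a path and value pair. The element names (Race, Entry, Horse, PastPerformance, Track, Finish) are invented for the example and are not the Equibase schema; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up nested layout: an entry holds repeated past performance nodes.
SAMPLE_XML = """
<Race number="1">
  <Entry>
    <Horse>Alpha</Horse>
    <PastPerformance><Track>AQU</Track><Finish>2</Finish></PastPerformance>
    <PastPerformance><Track>BEL</Track><Finish>1</Finish></PastPerformance>
  </Entry>
</Race>
"""


def flatten(elem, path):
    """Yield (path, text) pairs for every leaf node under elem, depth first."""
    children = list(elem)
    if not children:
        yield path, (elem.text or "").strip()
        return
    counts = {}
    for child in children:
        # Number repeated siblings so each leaf gets a unique path.
        counts[child.tag] = counts.get(child.tag, 0) + 1
        yield from flatten(child, f"{path}/{child.tag}[{counts[child.tag]}]")


def flatten_xml(xml_text):
    """Parse XML text and return its leaves as a list of (path, value) pairs."""
    root = ET.fromstring(xml_text)
    return list(flatten(root, root.tag))


if __name__ == "__main__":
    for path, value in flatten_xml(SAMPLE_XML):
        print(path, "=", value)
```

Each repeated child gets its own numbered path, which is exactly the structure a flat spreadsheet import tends to lose; from pairs like these the data can be written to separate tables for entries, past performances and so on.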
04-13-2024, 03:41 PM   #58
TrifectaBox
Registered User
 
Join Date: Jul 2023
Posts: 38
How nice of them to permit guidance to others.


Exactly how would they prevent this even if they wanted to?
04-13-2024, 03:52 PM   #59
Saratoga
Registered User
 
Join Date: Mar 2012
Posts: 515
Quote:
Originally Posted by TrifectaBox View Post
How nice of them to permit guidance to others.


Exactly how would they prevent this even if they wanted to?
Easily.....Contact PA ...POOFFFFF

Attached Images
File Type: jpg Capture.JPG (11.9 KB, 5 views)

Last edited by Saratoga; 04-13-2024 at 03:54 PM.
04-17-2024, 10:30 PM   #60
JJMartin
Registered User
 
Join Date: Jun 2011
Posts: 588
So this data is not in .csv format?