PaceAdvantage.Com Horse Racing Message Board > Thoroughbred Horse Racing Discussion > Handicapping Software
Old Yesterday, 12:56 PM   #76
traynor
Registered User
Join Date: Jan 2005
Posts: 6,294
Quote:
Originally Posted by JJMartin
I use it for everything. The program I built assembles the pp files, including results files. I can then run a single file through one of multiple models, or build a database to back test a model against a whole year of data or whatever range I choose to use. Mostly I use it for the latter. I try to automate to the absolute maximum.


What kind of data source do you use? From your description, it seems you are downloading PPs and results separately. I assume you create/generate your own track variants on the fly using the above data sources? That was a key issue in my own apps, and it took a bit of work to get right.

When you back test models, do you set filters mainly for accuracy (win%) or (possible) return (ROI)? Most lean toward the ROI side, to their disadvantage. In most cases, ROI (in smaller samples of whatever size) is derived from what are essentially anomalies. They rarely repeat going forward, yet everyone seems obsessed with the "woulda coulda shoulda" type of modeling.

You might try parsing for winner attributes (ignoring ROI) and test that on future races. ROIs in the 90s (in past races) with relatively high win rates can be especially productive. My conjecture is that bettors "seeking value wagers" tend to avoid the obvious best choices for win, and those obvious best choices for win may often generate a decent profit going forward (that may be concealed or missing in the sample used to build the model).

The big advantage is that a model hitting at a 45% frequency and up is much more likely to stay productive going forward than a lower-frequency model is to reproduce a paper ROI drawn from a sample of past races.
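Since the thread is about automated back testing, the win%-first evaluation being suggested can be sketched in a few lines of Python. Everything here is invented for illustration (the `won`/`win_payoff` record fields, the flat $2 win bet, the sample races); it is not anyone's actual program:

```python
# Sketch: score a candidate filter by win rate first and ROI second,
# using flat $2 win bets. Field names and figures are illustrative only.

def evaluate(races, passes_filter):
    """races: list of dicts with 'won' (bool) and 'win_payoff' ($2 mutuel).
    Returns (win_rate, roi) for the horses the filter selects."""
    picks = [r for r in races if passes_filter(r)]
    if not picks:
        return 0.0, 0.0
    wins = sum(1 for r in picks if r["won"])
    returned = sum(r["win_payoff"] for r in picks if r["won"])
    cost = 2.0 * len(picks)  # flat $2 to win on every pick
    return wins / len(picks), returned / cost

# Build on past races; the same filter would then be re-run on a
# later hold-out sample to see whether the win rate carries forward.
past = [
    {"won": True,  "win_payoff": 4.40},
    {"won": False, "win_payoff": 0.0},
    {"won": True,  "win_payoff": 3.80},
    {"won": False, "win_payoff": 0.0},
]
win_rate, roi = evaluate(past, lambda r: True)
print(win_rate, roi)  # ~50% winners at roughly break-even ROI
```

The point of splitting the data this way is the one made above: a filter chosen for a high win frequency on past races is more likely to hold up on future ones than a filter chosen to maximize a paper ROI.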
__________________
"Sooner or later we must be ready to leave the dreamland of childhood, where imagination finds unlimited scope, and take our place in a world of limited freedoms. That world however, can in the long run give us something better than any vision conjured up in childhood."
-- Joost Meerloo, Total War and the Human Mind: A Psychologist's Experience in Occupied Holland, 1944.

Last edited by traynor : Yesterday at 12:58 PM.
Old Yesterday, 03:02 PM   #77
JJMartin
Registered User
Join Date: Jun 2011
Posts: 428
Quote:
Originally Posted by traynor

The PPs and results are 2 separate files. I don't use track variants or create them, but I did write a program for someone who wanted to automate their own manually derived variant formula.

I have done thousands of tests over 12+ years. I look at both value and strike rate. In the beginning I was probably more focused on ROI. Any time I see an outlier I convert it to the win average of the rest of the group.

The main thing I have realized is that static testing of the most common or obvious factors such as speed figures, distance, surface, and class, or any combination of them, will generally result in a negative ROI in the long term. For example, looking at a long list of past data and just filtering factor "A" with factor "B", then moving on to factor "A" with "C", and so on. Without developing some external formula or calculation that creates a new metric, one not in the raw data per se but derived from it, I would say there is no hope of developing anything of value (unless possibly you are extremely selective with great discipline, or very intuitive with visual cues or something like that).

The novice and most handicapping software/services usually end up with a selection in the top 2 or 3 M/L or post time odds. Since this category is overbet, you end up with underlays consistently. I look at the more competitive races that are more confusing to the public and find an edge there. The general public will gravitate toward the easier or obvious choices without fail. So part of what I do is handicap the handicappers and use "outside" factors that can still be derived from the data through their (hopefully objective) implied meaning. A big part of the battle is overcoming the majority consensus, which dictates a high percentage of the finish order in the results. It is no secret that the post time odds are extremely efficient.

When analyzing data, the trick is to distinguish the things that truly have a real effect from the ones that are merely illusions. The problem is that the illusions can be very convincing when looking at patterns. The hardwired human ability to detect patterns can be detrimental here. I would agree about trying to increase the win rate and looking at attributes.
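The outlier treatment described above (converting an outlier to the win average of the rest of the group) could be sketched like this; the function name, the $2 payoff figures, and the 3x cutoff are all assumptions made up for the example:

```python
# Sketch: replace any winning payoff far above the rest of the group
# with the group's average, so one freak mutuel can't carry a back test.

def cap_outlier_payoffs(payoffs, threshold=3.0):
    """payoffs: $2 win payoffs for the winners a filter caught.
    A payoff more than `threshold` times the mean of the OTHERS is
    replaced by that mean; the cutoff choice is arbitrary here."""
    capped = []
    for i, p in enumerate(payoffs):
        others = payoffs[:i] + payoffs[i + 1:]
        mean_others = sum(others) / len(others) if others else p
        capped.append(mean_others if p > threshold * mean_others else p)
    return capped

capped = cap_outlier_payoffs([4.20, 5.00, 3.80, 64.00])
print(capped)  # the 64.00 bomb is pulled back to ~4.33
```

Comparing each payoff against the mean of the *others* keeps a single extreme value from inflating its own yardstick.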
Old Yesterday, 04:38 PM   #78
traynor
Registered User
Join Date: Jan 2005
Posts: 6,294
Quote:
Originally Posted by JJMartin

I agree with the difficulties, primarily because people tend to seek validation in the form of agreement. Every bettor wants the horse with "everything going for it" that wins frequently--but always goes off at long odds. The easy way out is to look for primaries, and diminish the reliance on (and the impression made by) secondaries. Not as many "sure things," but better returns.

An example would be "cheap speed"--almost always labeled AFTER the race (as an excuse for doing or not doing whatever). Many "speed handicappers" (and more than a few "pace handicappers") believe their area of specialization overcomes something that "class handicappers" consider an obvious deficiency. If speed is viewed as a primary, with secondary attributes of pace and class ignored or diminished in significance, results (and models) tend to vary significantly when compared to scenarios in which the latter are considered equivalent (or near-equivalent) in importance.

The primaries (in any given sample, regardless of size) may NOT be the same as the factors most bettors consider important. They also may exist (in equal or greater measure) in entries considered throw-outs.
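As a toy illustration of the primary-versus-secondary point (the horses, figures, and weightings below are invented, not anyone's actual model), the same small field can rank quite differently depending on whether speed is treated as the primary or as a near-equivalent of pace and class:

```python
# Toy example: rank a field two ways -- speed as the dominant factor,
# versus speed, pace, and class weighted near-equivalently.

horses = {
    "A": {"speed": 95, "pace": 80, "class": 78},
    "B": {"speed": 88, "pace": 90, "class": 92},
    "C": {"speed": 90, "pace": 85, "class": 84},
}

def rank(weights):
    def score(name):
        return sum(weights[k] * v for k, v in horses[name].items())
    return sorted(horses, key=score, reverse=True)

speed_primary = rank({"speed": 0.70, "pace": 0.15, "class": 0.15})
equal_weight = rank({"speed": 1 / 3, "pace": 1 / 3, "class": 1 / 3})
print(speed_primary)  # A tops the field when speed dominates
print(equal_weight)   # B tops it when the factors are near-equivalent
```

A model built on one weighting scheme and tested against races run under the other will naturally show the divergence described above.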
__________________
"Sooner or later we must be ready to leave the dreamland of childhood, where imagination finds unlimited scope, and take our place in a world of limited freedoms. That world however, can in the long run give us something better than any vision conjured up in childhood."
-- Joost Meerloo, Total War and the Human Mind: A Psychologist's Experience in Occupied Holland, 1944.
Old Yesterday, 04:39 PM   #79
Cratos
Registered User
 
Join Date: Jan 2004
Location: The Big Apple
Posts: 4,036
Quote:
Originally Posted by JJMartin

I will agree with your statement, with the following caveat: it can be done (and we are doing it successfully), but it takes a rigorous understanding of force and motion, along with the mathematical ability to turn that theory into a useful and practical wagering model with profitable results.
__________________
"Independent thinking, emotional stability, and a keen understanding of both human and institutional behavior are vital to long-term investment success." -- my hero, Warren Edward Buffett

"Science is correct; even if you don't believe it" - Neil deGrasse Tyson