Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 08-29-2020, 10:23 AM   #1
classhandicapper
Registered User
 
Join Date: Mar 2005
Location: Queens, NY
Posts: 20,606
Building a Better Model

This has been floating around on Twitter and elsewhere. I think it's worth the time to read about building a handicapping model.

https://plusevanalytics.wordpress.co...ort-of-quants/
__________________
"Unlearning is the highest form of learning"
Old 08-29-2020, 11:26 AM   #2
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by classhandicapper View Post
This has been floating around on Twitter and elsewhere. I think it's worth the time to read about building a handicapping model.

https://plusevanalytics.wordpress.co...ort-of-quants/
I built a model with a similar functional form to what he is describing and have maintained and augmented it for several years with more data and new sources. Given my experience, I’d be very surprised if they would make any money without the rebates. The model can definitely help you outperform market odds, but it’s tough to overcome the huge takeout. Of course, it would probably be better if I wasn’t just a one-man team!
Old 08-29-2020, 03:48 PM   #3
Tom
The Voice of Reason!
 
Join Date: Mar 2001
Location: Canandaigua, New York
Posts: 112,815
Yay, calculus!
__________________
Who does the Racing Form Detective like in this one?
Old 09-01-2020, 12:18 AM   #4
Jeffwb
Registered User
 
Join Date: Aug 2020
Posts: 14
What I want to know is: since when can one negotiate "rebates"? Other than the traditional MVP rewards type of thing, where you get some money back for points earned by wagering certain amounts, I was not aware that some conglomerate could call track management and say, "Hey, we're going to wager a high amount, so what takeout % can you give us?" But clearly I could be missing something in this case.

This article does highlight the need, in general and over the long haul, for a reduced takeout. When was the last time you played poker and the rake was 20%? Arghhh! OMG, who would play? And other sports are 10%. Imagine what the Pick 3s would look like if the takeout were just 10%.

I guess it's cool to hear that there are those out there making so much!
Old 09-01-2020, 11:52 AM   #5
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,909
Quote:
Originally Posted by classhandicapper View Post
This has been floating around on Twitter and elsewhere. I think it's worth the time to read about building a handicapping model.

https://plusevanalytics.wordpress.co...ort-of-quants/
This was an awesome article.



I strongly suggest that everyone who really cares about winning read it.

Ironically, the takeaway does NOT have to be about the math, so even if you just skip articles like this because you aren't going to actually DO IT, there is value.

The value is in understanding this statement:
Quote:
Take, for example, the variable “horse won its previous race = YES”. This is obviously a positive predictor of its win probability in this race. But, it’s also the most obvious stat to a casual reader of the racing form, and if the betting public tends to overvalue it then it’s possible that it could be significant with a negative coefficient in an odds offset model.
Paraphrase:
obviously a positive predictor of its win probability.
Also the most obvious stat.
betting public tends to overvalue it.
a negative coefficient in an odds offset model.

My Paraphrase:
  • You KNOW how to handicap.
  • You've been doing it for 30 years.
  • Problem: You do it using the same factors as everyone else.
  • Therefore, the approach you use -- DESPITE HOW MUCH YOU KNOW -- has little or no value in making profit.
  • In fact, it may actually hurt your results!

Think of Let It Ride. A guy sitting next to you could probably outperform his own selections by tossing yours.

Not trying to be insulting here. My picks work the same way.

I've been working on a new Science of Handicapping approach which is based upon determining when to use handicapping and when not to use handicapping.

The Age of Covid has taught me a lot.


Just my opinion.


Dave
Old 09-01-2020, 12:27 PM   #6
GMB@BP
Registered User
 
Join Date: Feb 2003
Location: Dark Side of the Moon
Posts: 5,870
Quote:
Originally Posted by Dave Schwartz View Post
This was an awesome article. ...
Good post Dave.

I do agree that it was a very good article. It's an interesting statement that someone who was beating the game using computer algorithms became a losing player when better, larger, more sophisticated CAW teams came into the mix.

Now imagine the average player trying to break into the game on a somewhat consistent basis of play.
Old 09-01-2020, 12:44 PM   #7
FakeNameChanged
Registered User
 
Join Date: Jan 2010
Posts: 2,176
Quote:
Originally Posted by Dave Schwartz View Post
This was an awesome article. ...
Good summary, Dave. So, taking your point #3 about using the same factors as everyone else, if we do a group brainstorm, what other factors might be more predictive?
For example (in no particular order): the horse's weight last race and today's weight; a change of jockey today and the corresponding expected improvement/decline in winning pct.; odds in the last three races vs. odds today at post time; average performance at today's distance within +/- 1/2 furlong; whether the horse's best performances come in afternoon or night racing; etc.
I realize weights aren't currently available in the US.
__________________
One of the downsides of the Internet is that it allows like-minded people to form communities, and sometimes those communities are stupid.
Old 09-01-2020, 12:51 PM   #8
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by GMB@BP View Post
Good post Dave.

I do agree that it was a very good article. Its an interesting statement that someone who was beating the game using computer algorithms became a losing player when better, larger, more sophisticated CAW teams came into the mix.

Now imagine the average player trying to break into the game on a somewhat consistent basis of play..
Few points here:

1) They were beating the game with a computer algorithm + rebates. The article doesn't mention the level of rebates involved.

2) Their ROI was a skinny 3% and evidently not stable. How do we know that it wasn't the standard deviation of their model (and not the competition) that led to their demise? The author mentions competition, but it could just as well have been that 3% was the best the model could achieve and it just so happened to mean-revert later on.

3) The examples of factors that the author was discussing were simplistic. I'm sure these were just illustrative. To model (or at least test) certain factors like pace dynamics, trainer / jockey level fixed effects and interactions, non-stationary track-level effects, etc. requires a deep understanding of handicapping plus the ability to program them in a non-linear way. Even with a team this can be very complex. Neural networks can help with feature selection and non-linearity, but you have to be careful with over-fitting. In my own experience, coming up with logical yet unique handicapping factors is the key but also very difficult.

Ultimately, a 3% ROI with a rebate is not going to cut it... unless your model is turnkey and has no standard deviation.
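To put a rough number on the standard-deviation point in 2), here is a minimal sketch. It assumes independent flat bets, all at the same price and win probability, which is far simpler than a real betting operation; the 20% win probability and 5.15 decimal odds are hypothetical values chosen only so that the expected ROI works out to 3%.

```python
import math

def roi_standard_error(win_prob, decimal_odds, n_bets):
    """Standard error of the average ROI over n independent flat bets.

    Each $1 bet returns decimal_odds with probability win_prob and 0
    otherwise, so the per-bet standard deviation is
    decimal_odds * sqrt(win_prob * (1 - win_prob)).
    """
    per_bet_sd = decimal_odds * math.sqrt(win_prob * (1.0 - win_prob))
    return per_bet_sd / math.sqrt(n_bets)

# Hypothetical bettor: 0.20 * 5.15 - 1 = +3% expected ROI (rebate folded into the price).
win_prob = 0.20
decimal_odds = 5.15

for n_bets in (1_000, 10_000, 100_000):
    se = roi_standard_error(win_prob, decimal_odds, n_bets)
    print(f"{n_bets:>7} bets: expected ROI 3%, one standard error = {se:.1%}")
```

Under these assumptions one standard error is still about 2 percentage points after 10,000 bets, so a true 3% edge can look like break-even or worse for a long stretch without any change in the competition.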
Old 09-01-2020, 12:54 PM   #9
xtb
Registered User
 
Join Date: Dec 2005
Location: Western NY
Posts: 5,336
Quote:
Originally Posted by Dave Schwartz View Post
This was an awesome article. ...
Is this a different take on the A/E ratio?

http://www.netcapper.com/TrackTracts...e/TT010223.htm
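For readers who have not met the term, a minimal sketch of the A/E (actual/expected) ratio as it is commonly defined: winners a factor actually produced, divided by the winners the odds implied it should produce. The linked page's exact formula is not reproduced here, and the sample data and the name `ae_ratio` are hypothetical.

```python
def ae_ratio(results):
    """Actual winners divided by expected winners.

    results: list of (implied_win_prob, won) pairs, one per horse that
    matched the factor being tested. implied_win_prob is assumed to be
    already derived from the odds (how to adjust for takeout is a
    separate choice not covered here).
    """
    results = list(results)
    actual = sum(1 for _, won in results if won)
    expected = sum(prob for prob, _ in results)
    return actual / expected

# Hypothetical sample: five horses that fit some factor.
sample = [(0.25, True), (0.20, False), (0.10, False), (0.40, True), (0.05, False)]
print(round(ae_ratio(sample), 2))
```

A ratio above 1 suggests the public underbets horses with that factor, and a ratio below 1 suggests it overbets them.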
Old 09-01-2020, 12:56 PM   #10
classhandicapper
Registered User
 
Join Date: Mar 2005
Location: Queens, NY
Posts: 20,606
Quote:
My Paraphrase:
You KNOW how to handicap.
You've been doing it for 30 years.
Problem: You do it using the same factors as everyone else.
Therefore, the approach you use -- DESPITE HOW MUCH YOU KNOW -- has little or no value in making profit.
In fact, it may actually hurt your results!
I agree Dave.

That's exactly why I posted it.

I've spent decades trying to understand the deepest nuances and interrelationships between speed figures, pace, class, bias, trip, consistency, trainers, race development etc... I know a lot more now than I did 10 years ago and massively more now than I did 20 years ago. But much of it is of very little value when it comes to winning money because so much of it is already built into the odds well enough to take away any edge.

The idea for the typical player isn't to try to out-handicap and make a better odds line than the consensus of all the players out there. Some of them have information you don't have, mathematical models you can't replicate, your information may occasionally be inaccurate, you may misunderstand some situations etc..

The idea is to find situations where you know the public still screws up (positive or negative) and focus on those situations.
__________________
"Unlearning is the highest form of learning"
Old 09-01-2020, 01:52 PM   #11
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,909
Quote:
Originally Posted by xtb View Post
Is this a different take on the A/E ratio?

http://www.netcapper.com/TrackTracts...e/TT010223.htm
Yes, but that approach is a little faulty.

Back in 1992 I created what I called "Pool Impact Value." (To my knowledge, I invented it then. It didn't appear in my software until 1993.)

Quote:
Classic Handicapper:
I agree Dave.

That's exactly why I posted it.

I've spent decades trying to understand the deepest nuances and interrelationships between speed figures, pace, class, bias, trip, consistency, trainers, race development etc... I know a lot more now than I did 10 years ago and massively more now than I did 20 years ago. But much of it is of very little value when it comes to winning money because so much of it is already built into the odds well enough to take away any edge.

The idea for the typical player isn't to try to out-handicap and make a better odds line than the consensus of all the players out there. Some of them have information you don't have, mathematical models you can't replicate, your information may occasionally be inaccurate, you may misunderstand some situations etc..

The idea is to find situations where you know the public still screws up (positive or negative) and focus on those situations.
The research I am doing is literally about addressing the question: "What if the obvious winners don't win?"

Simply moving the other horses up does not correlate.
Old 09-01-2020, 03:43 PM   #12
Nitro
Registered User
 
Join Date: Feb 2009
Location: NY
Posts: 18,949
I also found some segments of the article (written I believe 8 years ago) very interesting especially when compared to similar content in a report authored by Bill Benter. From what I gather they did show a profit, but nothing compared to what Mr. Benter accomplished.
The segments I’m referring to are (both highlighted in blue):
Quote:
Author & Syndicate – UNKNOWN
I tried two model forms, a multinomial logistic regression and a probit model. I read that some other modelers preferred probit, but I found that my model worked best with logistic. Without giving too much away (in case I ever get back into the game), here’s a random list of some of the most important things I had to figure out when building this model.

Offset the Public Odds

When I first built the model, I tried all kinds of variables in all kinds of combinations. I would fit a model, back test it on historical races and every time it would lose money (even after rebates). Then I tried something that made all the difference. Instead of fitting a regular logistic model:
ln(win prob / (1-win prob)) = intercept + coefficients * variables

I took the track odds, converted them to implied probabilities and used them as “offsets”:
ln(win prob / (1-win prob)) = ln(odds-implied prob / (1-odds implied prob)) + intercept + coefficients * variables

What does that do? It changes the entire purpose of the model. Instead of building a probability from scratch, I am starting with the assumption that the market odds are an efficient predictor of win probability, and finding variables that provide residual signal to those odds. It’s a much easier task. Suppose my model is missing an important variable that is uncorrelated with the other variables in my model. In a model from scratch, all my predictions will be wrong. In an odds offset model, as long as that variable is priced into the public odds it will end up in my model implicitly; I won’t find an edge, but I won’t lose one either.

The end…or is it?
Around 4 years into our journey, we stopped winning. We broke even for a while, then slowly started losing. Just like any other prediction market, horse racing is a data science arms race. The competition is improving their models and getting sharper every day. They have armies of quants, and I even heard rumours that some of them have spotters at the tracks to look for things that are not captured in the data. I was one guy with a full time job doing this on evenings and weekends, plus I had married my wife and had our first child during this period. I just couldn’t keep up…so we called it quits. During our run we bet a total of approximately $100 million, with a return after rebates of around 3%. Being a part of it is one of the coolest things I’ve ever done.
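The odds-offset idea quoted above can be sketched in a few lines. This is a toy illustration, not the author's model: it uses a plain per-horse binary logistic fit rather than the multinomial per-race form the author mentions, and the data are simulated. The feature `won_last`, the effect sizes 0.30 and 0.50, and the function names are all hypothetical. The simulated public gives a last-out win more credit than it deserves, so the fitted coefficient on that feature should come out negative, which is the "obvious positive predictor, negative coefficient" situation described in the article.

```python
import math
import random

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_offset_logit(feature, won, offset, n_iter=10):
    """Newton-Raphson fit of logit(p) = offset + b0 + b1 * feature.

    The offset (log-odds implied by the public) has its coefficient
    fixed at 1; only b0 and b1 are estimated.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, off in zip(feature, won, offset):
            p = sigmoid(off + b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p              # gradient wrt b0
            g1 += (y - p) * x        # gradient wrt b1
            h00 += w                 # entries of X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Simulated data (all numbers are made up for illustration).
rng = random.Random(0)
n = 50_000
won_last = [rng.randint(0, 1) for _ in range(n)]
base = [rng.uniform(0.05, 0.40) for _ in range(n)]   # win chance from everything else
true_logit = [logit(b) + 0.30 * x for b, x in zip(base, won_last)]    # a last-out win helps...
public_logit = [logit(b) + 0.50 * x for b, x in zip(base, won_last)]  # ...but the public overrates it
won = [1.0 if rng.random() < sigmoid(z) else 0.0 for z in true_logit]

b0, b1 = fit_offset_logit(won_last, won, public_logit)
print(f"intercept {b0:+.2f}, coefficient on won_last {b1:+.2f}")
```

With these settings the intercept should land near 0 and the coefficient near -0.2, the gap between the true effect and the public's version of it. In a model built from scratch (no offset), the same feature would get a positive coefficient.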
You might also want to consider what Bill Benter has personally stated on a similar topic. If you don’t recognize the name here’s a link:
http://www.worlds-greatest-gamblers....illiam-benter/
If you don’t respect his credibility, I would assume that you’re beyond his accomplishments and capabilities. So the significance of his comments (below) may be a moot point.

Quote:
Excerpts from:
“Computer Based Horse Race Handicapping and Wagering Systems:”
A Report by William Benter


INTRODUCTION
The question of whether a fully mechanical system can ever "beat the races" has been widely discussed in both the academic and popular literature. Certain authors have convincingly demonstrated that profitable wagering systems do exist for the races. The most well documented of these have generally been of the technical variety, that is, they are concerned mainly with the public odds, and do not attempt to predict horse performance from fundamental factors. Technical systems for place and show betting, (Ziemba and Hausch, 1987) and exotic pool betting, (Ziemba and Hausch,1986) as well as the 'odds movement' system developed by Asch and Quandt (1986), fall into this category. A benefit of these systems is that they require relatively little preparatory effort, and can be effectively employed by the occasional race goer.

The complexity of predicting horse performance makes the specification of an elegant handicapping model quite difficult. Ideally, each independent variable would capture a unique aspect of the influences affecting horse performance. In the author's experience, the trial and error method of adding independent variables to increase the model's goodness-of-fit, results in the model tending to become a hodgepodge of highly correlated variables whose individual significances are difficult to determine and often counter-intuitive.

Additionally, there will always be a significant amount of 'inside information' in horse racing that cannot be readily included in a statistical model. Trainers' and jockeys' intentions, secret workouts, whether the horse ate its breakfast, and the like, will be available to certain parties who will no doubt take advantage of it. Their betting will be reflected in the odds. This presents an obstacle to the model developer with access to published information only. For a statistical model to compete in this environment, it must make full use of the advantages of computer modeling, namely, the ability to make complex calculations on large data sets.

The odds set by the public betting yield a sophisticated estimate of the horses' win probabilities.

It can be presumed that valid fundamental information exists which cannot be systematically or practically incorporated into a statistical model. Therefore, any statistical model, however well developed, will always be incomplete. An extremely important step in model development, and one that the author believes has been generally overlooked in the literature, is the estimation of the relation of the model's probability estimates to the public's estimates, and the adjustment of the model's estimates to incorporate whatever information can be gleaned from the public's estimates. The public's implied probability estimates generally correspond well with the actual frequencies of winning.
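The excerpt does not spell out how the model's estimates get adjusted toward the public's. One way to sketch such an adjustment is a log-linear blend of the two sets of probabilities, renormalized within each race. This is an assumption made for illustration; the weights `alpha` and `beta` would normally be fit on historical races, and the values and probabilities below are made up.

```python
import math

def combine(model_probs, public_probs, alpha, beta):
    """Blend model and public win probabilities for one race.

    Each horse gets a score exp(alpha * ln(model) + beta * ln(public));
    scores are then normalized so the race sums to 1.
    """
    scores = [
        math.exp(alpha * math.log(f) + beta * math.log(pi))
        for f, pi in zip(model_probs, public_probs)
    ]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical four-horse race.
model = [0.40, 0.30, 0.20, 0.10]
public = [0.30, 0.35, 0.25, 0.10]
blended = combine(model, public, alpha=0.5, beta=0.5)
print([round(c, 3) for c in blended])
```

Setting alpha to 0 and beta to 1 returns the public's probabilities unchanged, so the size of alpha relative to beta shows how much the model adds beyond what the odds already say.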
BTW, the section on “Offset the Public Odds” (above) and the last 2 sentences of the Benter excerpt express exactly how Mr. Benter was able to achieve his success.
Old 09-01-2020, 05:55 PM   #13
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by Nitro View Post
I also found some segments of the article (written I believe 8 years ago) very interesting especially when compared to similar content in a report authored by Bill Benter. From what I gather they did show a profit, but nothing compared to what Mr. Benter accomplished.
The segments I’m referring to are (both high-lighted in blue):

You might also want to consider what Bill Benter has personally stated on a similar topic. If you don’t recognize the name here’s a link:
http://www.worlds-greatest-gamblers....illiam-benter/
If you don’t respect his credibility, I would assume that you’re beyond his accomplishments and capabilities. So the significance of his comments (below) may be a moot point.


BTW, the section on “Offset the Public Odds” (above) and the last 2 sentences of the Benter excerpt express exactly how Mr. Benter was able to achieve his success.
Bill Benter was a true pioneer in this area. Things have changed a lot since then...More computer players, greater pool efficiency, smaller fields...Much tougher to win without an alternative approach and rebates.
Old 09-01-2020, 08:41 PM   #14
Nitro
Registered User
 
Join Date: Feb 2009
Location: NY
Posts: 18,949
Quote:
Originally Posted by CBYRacer View Post
Bill Benter was a true pioneer in this area. Things have changed a lot since then...More computer players, greater pool efficiency, smaller fields...Much tougher to win without an alternative approach and rebates.
Well, I could certainly agree with most of the “changes” to the racing dynamics you’ve described, but only because they pertain primarily to Stateside racing. They don’t apply to Hong Kong racing, where Bill Benter made his fortune.

The HK dynamic is certainly different in many positive respects, two of which are large race fields (averaging 12 to 14 entries per race) and the typically huge Win, Place and Exotic betting pool sizes on a race-by-race basis. I found another segment written by the author in question and highlighted what I thought to be very interesting, considering how he mentioned the NEGATIVE effects of large pool sizes with regard to using that Calculus formula. Again, because average everyday betting pools in HK are significantly larger, and easily comparable in size to those in the KY Derby or Breeders’ Cup races, this Calculus approach would fail miserably.
Quote:
Author & Syndicate – UNKNOWN
Optimal bet sizing
In a parimutuel system, your bets have a direct influence on the payout odds. For small bets and/or large pools this influence is negligible; however, professional syndicates generally bet large enough that this makes a big difference. Every dollar you add to your bet size diminishes the EV of all of the previous dollars in your bet. This is why “staking methods change your variance but not your EV” is correct everywhere BUT here, and why using the Kelly Criterion alone can turn a winning model into a losing result.
When you factor in the impact of your bet on the final odds, the EV is no longer just win probability x odds – 1, it becomes a fairly complex function of the win probability, the odds-implied probability, the pool size, the takeout and the rebate. Hey, remember when you were in high school thinking “calculus is stupid, I’m never going to use this”?
The optimal bet size is the point where each successive dollar added to your bet stops adding to your EV and before it starts reducing your EV. That is, the solution to
d(EV) / d(bet size) = 0
Yay calculus!

This works well for everything except huge pools like the triple crown races, where the optimal bet size could be millions of dollars. So we took the lesser of the optimal bet size according to this formula and the optimal bet size according to the Kelly Criterion.
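A minimal sketch of the bet-sizing rule described in the excerpt, under a deliberately simple model: a single win pool, everyone else's money fixed, no breakage, and a rebate paid on every dollar wagered. The article does not give its actual EV function, so the formula in `ev`, the closed-form solution in `pool_optimal_bet`, and every number below are assumptions for illustration.

```python
import math

def ev(x, p, pool, on_horse, takeout, rebate):
    """Expected profit of a win bet of size x, including its own effect on the odds."""
    payout_per_dollar = (1.0 - takeout) * (pool + x) / (on_horse + x)
    return p * x * payout_per_dollar - x + rebate * x

def pool_optimal_bet(p, pool, on_horse, takeout, rebate):
    """Bet size solving d(EV)/d(bet size) = 0 for the ev() model above.

    Returns 0 when even the first dollar is -EV, and infinity in the
    degenerate case where the marginal dollar never turns negative.
    """
    k = (1.0 - rebate) / (p * (1.0 - takeout))
    if k <= 1.0:
        return math.inf
    x = math.sqrt(on_horse * (pool - on_horse) / (k - 1.0)) - on_horse
    return max(x, 0.0)

def kelly_bet(p, pool, on_horse, takeout, bankroll):
    """Full-Kelly stake at the current odds, ignoring pool impact and rebate."""
    b = (1.0 - takeout) * pool / on_horse - 1.0   # net odds-to-1 before my bet
    fraction = (p * (b + 1.0) - 1.0) / b
    return max(fraction, 0.0) * bankroll

# Hypothetical race: $100k win pool, $18k on my horse, 16% takeout, 8% rebate.
pool, on_horse = 100_000.0, 18_000.0
p, takeout, rebate, bankroll = 0.25, 0.16, 0.08, 50_000.0

x_pool = pool_optimal_bet(p, pool, on_horse, takeout, rebate)
x_kelly = kelly_bet(p, pool, on_horse, takeout, bankroll)
stake = min(x_pool, x_kelly)   # "the lesser of" the two, as in the excerpt
print(f"pool-impact optimum {x_pool:,.0f}, Kelly {x_kelly:,.0f}, stake {stake:,.0f}")
```

In this made-up race the Kelly figure is the smaller of the two, so it sets the stake. Shrink the bankroll or grow the pool and the gap widens, which is the situation the excerpt describes for the biggest race days.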
So that brings me to the “alternative approach” you mentioned. Keep in mind that both of these “Modeling” reports suggest using all sorts of variables based on racing data and statistics in their models, in conjunction with real-time Win pool probability estimates, in order to be successful. So it becomes a nice mix of old information with some current information. Doesn’t the betting pool probability become an additional variable? If so, then I completely agree with Mr. Benter’s assessment:
Quote:
The method of adding independent variables to increase the model's goodness-of-fit, results in the model tending to become a hodgepodge of highly correlated variables whose individual significances are difficult to determine and often counter-intuitive.
My alternative selection approach is to entirely disregard ALL of the racing data and statistics. (The old information) My selections for the contenders in any race are based on money flow in all of the available betting pools which are analyzed by a very sophisticated tote analysis program. (The current information) Why is it ultimately successful? Because I’ve been told that it actually takes into account human psychological betting habits and characteristics to establish distinguishable betting patterns which become obvious to the end user (me).
Old 09-01-2020, 10:46 PM   #15
Jeff P
Registered User
 
Join Date: Dec 2001
Location: JCapper Platinum: Kind of like Deep Blue... but for horses.
Posts: 5,287
Quote:
Originally Posted by Nitro View Post
...the NEGATIVE effects of large pool sizes with regard to using that Calculus formula...
To be fair, I don't think the author was saying their model was incapable of identifying positive expectancy on major race days - KY Derby, Belmont, Preakness, Breeders' Cup, etc.

I think the author was simply saying they could not use d(EV) / d(bet size) = 0 for optimal bet sizing on major race days.

My guess as to why would be the same reason I can't use that (or similar) formulas for bet sizing on major race days.

Simply put:

If d(EV) / d(bet size) is in + EV territory as the horses are facing up to the gate on a major race day, the pools are so large, that in order for me to knock the odds down to the point where d(EV) / d(bet size) actually approaches break-even territory:

I'd have to bet my entire net worth (many times over) on that one race.

How many times in a row can you say "I'm all in" before you tap out?
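The scaling argument can be written out under a simplified single-pool model. This is an assumption for illustration, not the article's exact formula: it ignores breakage and other bettors' late money, and every number in the note after the block is hypothetical.

```latex
% W = win pool before my bet, H = amount already on my horse, x = my bet,
% p = my win probability, t = takeout, r = rebate on every dollar bet.
\[
\mathrm{EV}(x) = p\,x\,(1-t)\,\frac{W+x}{H+x} - (1-r)\,x
\]
\[
\frac{d\,\mathrm{EV}}{dx}
  = p\,(1-t)\,\frac{x^{2}+2Hx+WH}{(H+x)^{2}} - (1-r) = 0
\quad\Longrightarrow\quad
x^{*} = \sqrt{\frac{H\,(W-H)}{k-1}} - H,
\qquad k = \frac{1-r}{p\,(1-t)} > 1
\]
% Writing h = H/W for the horse's share of the pool:
\[
x^{*} = W\left(\sqrt{\frac{h\,(1-h)}{k-1}} - h\right)
\]
```

For a fixed edge, the bracket is a constant, so the bet that drives the marginal dollar to break-even grows in direct proportion to the pool. With p = 0.25, t = 0.16, r = 0.08 and h = 0.18, the bracket is about 0.029: roughly $2,900 into a $100,000 pool, but roughly $290,000 into a $10 million pool.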


-jp

.
__________________
Team JCapper: 2011 PAIHL Regular Season ROI Leader after 15 weeks
www.JCapper.com

Last edited by Jeff P; 09-01-2020 at 10:55 PM.

All times are GMT -4.

