Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 04-12-2019, 08:09 AM   #31
JerryBoyle
Veteran
 
Join Date: Feb 2018
Posts: 845
Quote:
Originally Posted by ultracapper View Post
accurately predicting splits is exponentially more helpful than final time. An acceptable +- for final time should be pretty tight to be of value IMO. Maybe 2/5 give or take at 6 furlongs. Predicting 1:12.2 and accepting anything between 1:11.2 and 1:13.2 is much too liberal IMHO.
Value in what sense? My goal was to predict the estimated "fundamental" finishing time. By this I mean the finishing time considering nothing about the "condition" of today's surface. The hope being that the difference between predicted and actual would then be at least in part because of the condition of today's surface. Are you saying that variations in conditions only contribute up to +/- 2/5s at a 6f race? In general I'd love to hear from you/others on how much you might expect conditions to contribute to differences in time over days.

As an interesting thought experiment, if I could build a model that accurately predicts finishing time given nothing but all the runners' past information, as well as info about today's race that does not include condition of the surface, would that mean that condition of the surface isn't actually as big of a difference maker as people assume?
Old 04-20-2019, 02:28 AM   #32
deelo
Registered User
 
Join Date: Nov 2015
Location: LNN
Posts: 524
so it predicts final times on given days ahead of time? i would think it'd be useful to predict track variants ahead of time and possibly pace scenarios which would lead you to being able to upgrade certain running styles and downgrade non-optimal running styles.

Long story short, just find the chaos races and box the 6 worst odds so you can hit one of those $4,000 10 centers.
__________________
They didn't take your money...You paid for lessons
Old 04-20-2019, 12:09 PM   #33
JerryBoyle
Veteran
 
Join Date: Feb 2018
Posts: 845
Quote:
Originally Posted by deelo View Post
so it predicts final times on given days ahead of time? i would think it'd be useful to predict track variants ahead of time and possibly pace scenarios which would lead you to being able to upgrade certain running styles and downgrade non-optimal running styles.

Long story short, just find the chaos races and box the 6 worst odds so you can hit one of those $4,000 10 centers.
Yep, that's exactly what it currently does. I like the idea of creating pace scenarios. It's trivial to adjust it to predict any fraction, not just final. So what I will do is generate a model for each point of call.
Old 04-20-2019, 01:39 PM   #34
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,877
Quote:
Originally Posted by JerryBoyle View Post
Really two questions:

1. What could you do with this model?

2. What average error would you find good/ok/bad? That is, if the model, on average, is off by 1 second, how unacceptable would you consider this? Perhaps another way to think of it is, when looking at a race, how accurately could you guess the final race time?

I've toyed around with building one a few times, and they're reasonably accurate, but not great. The average error is about 1s across all races/distances. Obviously, 1s on a 6f race is much worse than 1s on 1 1/4 mile race. Originally, I thought I might use it as a way to determine if a track was running slower/faster on a given day by comparing the difference in estimate vs actual for all races.
Respectfully, 1 second (plus or minus) is 10 lengths, and that is just not meaningful. It would encapsulate about 94% of the winners.

I did this many years ago and, using a simple curvilinear regression on each horse's pacelines, got it down to +/- 3 lengths. It was still pretty worthless.

BTW, for those who want to try this approach, what you do is run the regression and remove the paceline with the largest error. Then you continue until you get down to 3 races.

(In about 90% of the horses you should be able to draw a curve between 3 pacelines.)

It pointed to the obvious horses and the price horses became the outliers.
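
A minimal sketch of the trimming loop described above, for anyone who wants to try it. The post doesn't say what is regressed on what or which curve was used, so everything here is an assumption: each paceline is a hypothetical (x, y) pair (say, days back and a time or figure), and the curve is a plain least-squares quadratic. It needs at least 3 distinct x values per horse, otherwise the fit is singular.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a + b*x + c*x^2; returns (a, b, c)."""
    # Normal equations (X^T X) beta = X^T y as a 3x4 augmented matrix.
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def trim_pacelines(points, keep=3):
    """Refit, drop the paceline with the largest error, repeat until `keep` remain."""
    pts = list(points)
    while len(pts) > keep:
        a, b, c = fit_quadratic(pts)
        worst = max(pts, key=lambda p: abs(p[1] - (a + b * p[0] + c * p[0] ** 2)))
        pts.remove(worst)
    return pts


# Made-up pacelines: four sit on a smooth curve, (2, 30) is the odd one out.
print(trim_pacelines([(0, 0), (1, 1), (2, 30), (3, 9), (4, 16)]))
```

With exactly 3 pacelines left, a quadratic passes through all of them, which fits the remark about being able to draw a curve between 3 pacelines.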
Old 04-21-2019, 09:15 AM   #35
JerryBoyle
Veteran
 
Join Date: Feb 2018
Posts: 845
Quote:
Originally Posted by Dave Schwartz View Post
Respectfully, 1 second (plus or minus) is 10 lengths, and that is just not meaningful. It would encapsulate about 94% of the winners.

I did this many years ago and, using a simple curvilinear regression on each horse's pacelines, got it down to +/- 3 lengths. It was still pretty worthless.

BTW, for those who want to try this approach, what you do is run the regression and remove the paceline with the largest error. Then you continue until you get down to 3 races.

(In about 90% of the horses you should be able to draw a curve between 3 pacelines.)

It pointed to the obvious horses and the price horses became the outliers.
Thanks, Dave, this is exactly some of the feedback I wanted. Looking at how many runners are on average included in the window between final time and final estimated time is an interesting way to do it. E.g. if the difference between expected and final is on avg 3s and if a 3s diff includes all runners, then it's totally useless.
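
A rough sketch of that check, assuming a per-runner finishing time in seconds is available for every race (an assumption; charts usually give beaten lengths instead). The function names are hypothetical. It measures what share of each field finishes within the model's error window of the winner, then averages that share over races.

```python
def share_within_window(finish_times, window_seconds):
    """Fraction of runners finishing within window_seconds of the winner."""
    winner = min(finish_times)
    inside = sum(1 for t in finish_times if t - winner <= window_seconds)
    return inside / len(finish_times)


def average_share(races, window_seconds):
    """Mean of the per-race shares; `races` is a list of finish-time lists."""
    shares = [share_within_window(r, window_seconds) for r in races]
    return sum(shares) / len(shares)


# Made-up fields, with a 1-second window standing in for the model's average error.
races = [[70.0, 70.4, 71.2, 73.5], [84.0, 84.5]]
print(average_share(races, 1.0))
```

The closer the result is to 1.0, the less the estimate separates the runners.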

I'd still be interested to hear from anyone how much track variants affect the final times of races day to day. I'm sure this is common knowledge for more experienced handicappers. Meaning, can a slow track slow a race down by more than a second on a given day? Or is the change usually smaller or larger?

Since coming back to this over the last 2 weeks, I've tweaked the model inputs a good deal and have gotten the average difference between final time and estimated time to 0.628 seconds. This covered ~67k races from 2016-01-01 to 2019-03-30. I've converted those differences to relative differences, and the average relative difference to final race time is about 0.9%.
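
For reference, a small sketch of the two error measures mentioned above, assuming "average difference" means mean absolute error and "relative difference" means absolute error divided by the actual final time. The function name and the sample numbers are made up.

```python
def error_summary(actual, predicted):
    """Return (mean absolute error in seconds, mean relative error)."""
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    rel_err = [e / a for e, a in zip(abs_err, actual)]
    n = len(abs_err)
    return sum(abs_err) / n, sum(rel_err) / n


# Made-up final times (seconds) and model estimates for two races.
mae, mre = error_summary([70.0, 100.0], [69.0, 102.0])
print(mae, mre)
```

As a sanity check on the quoted figures, 0.628 / 0.009 is about 69.8, so the two averages are consistent with a typical final time of roughly 70 seconds. That is only approximate, since a mean of ratios is not the same as a ratio of means.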

These differences have become my "variants", along with some derivatives of them, like the average difference on the specific surface only. To test their "usefulness", I've included them in a fundamental model which predicts each runner's probability of finishing first, similar to a conditional logit model. Holding all else equal (the specific races used, other metrics, etc.), the variants make a significant impact on the final estimates vs. not using them. However, that only answers one question: "is it useful to include these variants?" It doesn't tell me how good these variants are relative to variants created in a different manner, which I'd love to find out.
Old 04-21-2019, 11:09 AM   #36
sjk
Registered User
 
Join Date: Feb 2003
Posts: 2,105
The average absolute value of my track variants is about 8.5 points (Beyer scale), so the track variant often affects final time by more than a second. Several seconds would not be uncommon.
Old 04-21-2019, 12:29 PM   #37
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,877
Quote:
Originally Posted by JerryBoyle View Post
Thanks, Dave, this is exactly some of the feedback I wanted. Looking at how many runners are on average included in the window between final time and final estimated time is an interesting way to do it. E.g. if the difference between expected and final is on avg 3s and if a 3s diff includes all runners, then it's totally useless.
Remember that I was doing a regression on how they run when they run. Thus, there was a bias towards "How good they are" as opposed to "How good are they usually."

There is a big difference.


Quote:
I'd still be interested to hear from anyone how much track variants affect the final times of races day to day. I'm sure this is common knowledge for more experienced handicappers. Meaning, can a slow track slow a race down by more than a second on a given day? Or is the change usually smaller or larger?
Again, I have experience. Making good, projection-based track variants is not an easy task. Frankly, it is a lifestyle commitment. That is, you have to give up your current lifestyle to support it.

However, my belief would be that it could produce wonderful numbers.

Two caveats:
1. You must do ALL the races and not just a handful of tracks.

2. Average Daily Variants are pretty close to worthless.


Quote:
Since coming back to this over the last 2 weeks, I've tweaked the model inputs a good deal and have gotten the average difference between final time and estimated time to 0.628 seconds. This covered ~67k races from 2016-01-01 to 2019-03-30. I've converted those differences to relative differences, and the average relative difference to final race time is about 0.9%.
That's pretty much what I found.
Old 04-21-2019, 09:46 PM   #38
JerryBoyle
Veteran
 
Join Date: Feb 2018
Posts: 845
Quote:
Originally Posted by Dave Schwartz View Post
2. Average Daily Variants are pretty close to worthless.
Hey Dave, I'm curious what you mean by this statement. Are you saying that variants change too much from day to day to be of any use? That is, they must be averaged over many days/weeks/seasons?

This got me thinking about how I might test whether the variants I've created actually capture the information I believe they should capture. Given that they significantly increase the predictive power of a model which includes them, it seems they are capturing something relevant. However, I'd like to determine if they're capturing what I think they should be - whether a track is slower/faster than "expected".

One way I thought to do this is to measure a given day's average variant against the prior day's average. Presumably, a variant is caused by many factors, but the most important strike me as things that persist day-to-day such as track maintenance preference, weather, wear-and-tear, etc. This is to say, I'd expect a given day's surface conditions to correlate with a prior day's. Obviously, this will not always be the case, but on average, I wouldn't expect a surface to oscillate day-over-day from slow - fast - slow, etc. If this is true, then I'd expect there to be a strong correlation between prior day's variant and current day's. Here were the results using the relative difference between final race time and predicted race time (covering 2016-01-01 to 2019-03-30):

Code:
Correlation between prior day variant and current day variant: .521014

Linear regression using prior day variant as predictor and current day variant as target (x1 is prior day variant):

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.271
Model:                            OLS   Adj. R-squared:                  0.271
Method:                 Least Squares   F-statistic:                     6213.
Date:                Mon, 22 Apr 2019   Prob (F-statistic):               0.00
Time:                        01:23:21   Log-Likelihood:                 36938.
No. Observations:               16676   AIC:                        -7.387e+04
Df Residuals:                   16674   BIC:                        -7.386e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0061      0.000     27.610      0.000       0.006       0.007
x1             0.5230      0.007     78.821      0.000       0.510       0.536
==============================================================================
Omnibus:                     4268.570   Durbin-Watson:                   2.306
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           168508.100
Skew:                           0.503   Prob(JB):                         0.00
Kurtosis:                      18.540   Cond. No.                         32.4
==============================================================================
*One note about this analysis: it compares a given track-day's variant to the prior track-day. However, I didn't exclude gaps between track-days, so this will contain gaps as large as 1 year. Removing those data points will likely only increase the correlation, though.
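
A sketch of how the pairing could exclude long gaps, assuming the data is a list of hypothetical (track, date, variant) records, one per track-day. It uses only the standard library rather than the statsmodels call that produced the output above, and `max_gap_days` is an illustrative parameter.

```python
from collections import defaultdict
from datetime import date


def prior_day_pairs(track_days, max_gap_days=None):
    """Pair each track-day variant with the previous track-day at the same track."""
    by_track = defaultdict(list)
    for track, day, variant in track_days:
        by_track[track].append((day, variant))
    pairs = []
    for rows in by_track.values():
        rows.sort()
        for (d0, v0), (d1, v1) in zip(rows, rows[1:]):
            # Skip pairs separated by more than max_gap_days (e.g. an off-season).
            if max_gap_days is None or (d1 - d0).days <= max_gap_days:
                pairs.append((v0, v1))
    return pairs


def ols(pairs):
    """Return (intercept, slope, pearson_r) for y ~ x over (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / (sxx * syy) ** 0.5


# Made-up records: the last one follows a one-year gap and is filtered out.
days = [("AQU", date(2019, 1, 1), 0.01), ("AQU", date(2019, 1, 2), 0.02),
        ("AQU", date(2019, 1, 3), 0.03), ("AQU", date(2020, 1, 3), 0.50)]
print(ols(prior_day_pairs(days, max_gap_days=7)))
```

With a single predictor, R-squared is just the squared correlation: 0.521014 squared is about 0.271, which matches the R-squared in the output above.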

I've attached a scatter plot with the fitted line. This doesn't tell me anything about the "quality" of this particular variant. It's entirely possible, even likely, that other variants are better, but I find this analysis interesting nonetheless (and I don't have access to any others).

As a follow-up study, I think it'd be interesting to see if/when the correlation changes. That is, how long does it take a track to switch from fast to slow or vice versa?
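
One way such a follow-up might look: compute the same correlation at lags of 1, 2, 3, ... track-days and see where it fades. This is a sketch under the assumption that each track's daily variants are already in date order; `variants_by_track` and the sample series are made up.

```python
def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def lag_correlations(variants_by_track, max_lag):
    """Correlation between a track-day's variant and the one k track-days earlier."""
    out = {}
    for k in range(1, max_lag + 1):
        xs, ys = [], []
        for series in variants_by_track.values():
            xs.extend(series[:-k])  # earlier track-days
            ys.extend(series[k:])   # the track-day k later
        out[k] = pearson(xs, ys)
    return out


# Made-up series that swings slow -> fast -> slow every four track-days.
demo = {"AQU": [0, 1, 0, -1, 0, 1, 0, -1]}
print(lag_correlations(demo, 4))
```

The lag at which the correlation drops toward zero would give a rough idea of how long a track's bias persists.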

[Attached image: PriorDayVariantvsCurrentDay.png, scatter plot of prior-day variant vs. current-day variant with fitted line]

Last edited by JerryBoyle; 04-21-2019 at 09:52 PM.
Old 04-21-2019, 10:14 PM   #39
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,877
Quote:
Hey Dave, I'm curious what you mean by this statement. Are you saying that variants change too much from day to day to be of any use? That is, they must be averaged over many days/weeks/seasons?
What I am saying is that every study I have ever done has proven that using an ADV approach (i.e. SR + TV) is far inferior to using no variant at all.

Surprisingly, it is just as true in the winter as in the summer and probably more so.

I know it sounds crazy but the ADV approach really does not work.
Old 04-21-2019, 10:18 PM   #40
JerryBoyle
Veteran
 
Join Date: Feb 2018
Posts: 845
Quote:
Originally Posted by Dave Schwartz View Post
What I am saying is that every study I have ever done has proven that using an ADV approach (i.e. SR + TV) is far inferior to using no variant at all.

Surprisingly, it is just as true in the winter as in the summer and probably more so.

I know it sounds crazy but the ADV approach really does not work.
Ahhh, got it, got it.
Old 04-21-2019, 10:59 PM   #41
deelo
Registered User
 
Join Date: Nov 2015
Location: LNN
Posts: 524
I have a question. I'm not much of a programmer and have only been playing horses for a few years, but I like to look at the statistics guys' stuff here and there to learn a little at a time, and maybe someday have the time to get into it more.

Anyways, I am curious as to if there's any logic in this and if not, I'd like to understand why.

So, the speed rating and track variant are both based on the final time of the entire race, correct?

Theory is that the most unstable part of the race is the beginning. You have different run-ups messing with the 2f call, you have extreme pace breaks, bad breaks, bumping, etc. The further you go, the more the race should "normalize", I would think. Breakouts settle a little, bad breaks recover a little, the field basically settles into their roles a little more. So in theory, as you progress through the race in increments (2f to 4f, 4f to 6f), these segments should be more "stable", perhaps.

What if you throw out the first 2f? Consider the time from 2f through the finish the actual time of the race. Based on that time, create speed ratings and track variants the same way you normally would. Would this change anything? Would this be more reliable?
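
The arithmetic behind that idea is simple enough to sketch. This assumes times are written in the usual fifths notation used earlier in the thread (1:12.2 meaning 1 minute, 12 and 2/5 seconds); the helper name and the sample fractions are made up.

```python
def to_fifths(clock):
    """Convert a time like '1:12.2' or '22.3' (fifths notation) to whole fifths."""
    minutes, _, rest = clock.rpartition(":")
    seconds, _, fifths = rest.partition(".")
    return (int(minutes or 0) * 60 + int(seconds)) * 5 + int(fifths or 0)


# Made-up 6f race: 2f call in 22.3, final time 1:12.2.
late_fifths = to_fifths("1:12.2") - to_fifths("22.3")
print(late_fifths / 5)  # 49.8 seconds from the 2f call to the finish
```

The 2f-to-finish time would then take the place of final time when building ratings and variants. Whether that is more reliable is exactly the open question here.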

Is this a worthless thought or something worth messing with? Thanks in advance.
__________________
They didn't take your money...You paid for lessons
Old 04-22-2019, 05:22 PM   #42
Tom
The Voice of Reason!
 
Join Date: Mar 2001
Location: Canandaigua, New York
Posts: 112,470
Quote:
What could you do with a model that accurately predicts final race time?
Sell it.

(Might be better than what some tracks are using to time races now!)
__________________
Who does the Racing Form Detective like in this one?
Old 04-22-2019, 10:12 PM   #43
thaskalos
Registered User
 
Join Date: Jan 2006
Posts: 28,390
Quote:
Originally Posted by Tom View Post
Sell it.

(Might be better than what some tracks are using to time races now!)


True enough.
__________________
Live to play another day.
Old 04-22-2019, 10:25 PM   #44
bob60566
Vancouver Island
 
Join Date: Dec 2010
Posts: 1,747
Quote:
Originally Posted by Tom View Post
Sell it.

(Might be better than what some tracks are using to time races now!)
Read statement in one of the racing books way back when.

Time is only for those behind bars

Last edited by bob60566; 04-22-2019 at 10:27 PM.
Old 04-22-2019, 11:23 PM   #45
thaskalos
Registered User
 
Join Date: Jan 2006
Posts: 28,390
Quote:
Originally Posted by bob60566 View Post
Read statement in one of the racing books way back when.

Time is only for those behind bars
The funny thing is...the guy who wrote that actually did 3 months in the can for passing bad checks.
__________________
Live to play another day.
All times are GMT -4.