Quote:
Originally Posted by sjk
The way I handle firsters (and horses who have not run in 120 days) is simple for those looking for such an approach.
Firsters as a group return -25%, which is the same as the return on a randomly chosen horse. I give in to the idea that I do not have enough information to hope to overcome that edge, so I never bet them.
In order to assign odds to all of the other competitors, I give the firsters the same chance to win as the public does. To do this you need to be willing to adjust your odds on the fly based on real-time tote information. So I take the tote odds and normalize all of the chances to 100% to see what remains for the others.
I don't play races where firsters and long layoff horses are more than 1/3 of the field. That cuts out a lot of maiden races.
Of course they beat me, and it is annoying when that happens, but it is just one of many ways to get beat, so I accept it.
sjk,
I can certainly appreciate the case for simplicity.
My reason for creating functions to identify and handle outliers and FTS was never that I was looking to bet FTS.
I created those functions because I wanted better prob estimates for the horses I primarily bet:
horses with hidden advantages that have run before.
This is what I have in my database for FTS from 01-01-2018 through Fri 01-18-2019 (the database is current through that date):
Code:
Data Window Settings:
Connected to: C:\JCapper\exe\JCapper2.mdb
999 Divisor Odds Cap: None
SQL UDM Plays Report: Hide
SQL: SELECT * FROM STARTERHISTORY
WHERE STARTSLIFETIME=0
AND [DATE] >= #01-01-2018#
AND [DATE] <= #01-18-2019#
ORDER BY [DATE], TRACK, RACE
Data Summary            Win        Place         Show
-----------------------------------------------------
Mutuel Totals      25008.10     23804.70     22748.00
Bet               -32746.00    -32746.00    -32746.00
-----------------------------------------------------
P/L                -7737.90     -8941.30     -9998.00
Wins                   1645         3247         4824
Plays                 16373        16373        16373
PCT                   .1005        .1983        .2946
ROI                  0.7637       0.7269       0.6947
Avg Mut               15.20         7.33         4.72
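The PCT, ROI, and Avg Mut rows follow directly from the totals. A minimal Python sketch for checking the win column by hand (an illustration only, not JCapper code; the variable names are made up, and it assumes flat $2 win bets, which is what 32746.00 bet over 16373 plays implies):

```python
# Win-column totals copied from the Data Summary above.
mutuel_total = 25008.10   # total returned by winning tickets
bet_total = 32746.00      # 16373 plays at an assumed flat $2 each
wins = 1645
plays = 16373

roi = mutuel_total / bet_total      # dollars returned per dollar bet
win_pct = wins / plays              # raw win rate for all FTS
avg_mutuel = mutuel_total / wins    # average payoff per winning ticket

print(f"PCT {win_pct:.4f}  ROI {roi:.4f}  Avg Mut {avg_mutuel:.2f}")
```

An ROI of about 0.76 means a flat bet on every FTS in the sample gave back roughly 76 cents per dollar.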
If I were to skip identifying and handling outliers altogether, and if I weren't scoring rider, trainer, post position, workouts, breeding, etc. for FTS:
I'd end up treating all FTS pretty much the same. The above sample would be close to my baseline for every FTS -- and my models would end up generating a prob estimate for every FTS somewhere close to 0.10.
That said, this is what the above FTS sample looks like broken out by rank for one of my global models:
Code:
By: SQL-F10 Rank
Rank        P/L       Bet      Roi   Wins  Plays     Pct   Impact  AvgMut
----------------------------------------------------------------------------------
   1    -127.70   3476.00   0.9633    368   1738   .2117   2.1075    9.10
   2    -378.50   4876.00   0.9224    355   2438   .1456   1.4493   12.67
   3   -1180.10   4964.00   0.7623    272   2482   .1096   1.0908   13.91
   4   -1252.50   4686.00   0.7327    201   2343   .0858   0.8539   17.08
   5   -1205.80   4084.00   0.7048    155   2042   .0759   0.7555   18.57
   6   -1108.20   3576.00   0.6901    127   1788   .0710   0.7070   19.43
   7    -619.80   2536.00   0.7556     71   1268   .0560   0.5573   26.99
   8    -543.70   1826.00   0.7022     47    913   .0515   0.5124   27.28
   9    -606.20   1252.00   0.5158     24    626   .0383   0.3816   26.91
  10    -451.80    858.00   0.4734     15    429   .0350   0.3480   27.08
  11    -120.70    392.00   0.6921      6    196   .0306   0.3047   45.22
  12    -140.10    198.00   0.2924      3     99   .0303   0.3016   19.30
  13       9.20     10.00   1.9200      1      5   .2000   1.9906   19.20
  14     -12.00     12.00   0.0000      0      6   .0000   0.0000    0.00
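The Impact column appears to be each rank's win rate divided by the win rate of the whole sample. That is an inference from the numbers (it reproduces the printed values), not something taken from JCapper documentation. A short Python sketch under that assumption, with hypothetical names:

```python
# (rank, wins, plays) copied from the By: SQL-F10 Rank table above.
rows = [
    (1, 368, 1738), (2, 355, 2438), (3, 272, 2482), (4, 201, 2343),
    (5, 155, 2042), (6, 127, 1788), (7, 71, 1268), (8, 47, 913),
    (9, 24, 626), (10, 15, 429), (11, 6, 196), (12, 3, 99),
    (13, 1, 5), (14, 0, 6),
]

total_wins = sum(w for _, w, _ in rows)
total_plays = sum(p for _, _, p in rows)
baseline = total_wins / total_plays   # win rate of all FTS in the sample


def impact(wins, plays):
    """Assumed definition: rank win rate relative to the overall win rate."""
    return (wins / plays) / baseline


for rank, wins, plays in rows:
    print(f"{rank:>4}  pct {wins / plays:.4f}  impact {impact(wins, plays):.4f}")
```

An impact above 1 means that rank wins more often than the average FTS; below 1 means less often.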
The data suggests that not all FTS are the same.
About 0.1062 of all FTS (rank 1) can reliably be predicted to win at a rate north of 0.21. About 0.1489 of all FTS (rank 2) can reliably be predicted to win at about 0.145.
At the other end of the spectrum, about 0.2163 of all FTS (ranks 7 through 14, ignoring the tiny 5-play sample at rank 13) can reliably be predicted to win at 0.056 or less.
If I use a weighted average that reflects the number of starters in each row, the FTS for ranks 3,4,5,6 -- or about 0.5286 of all FTS -- have an avg win rate of about 0.0872.
From this, if I weren't using functions to handle outliers and FTS:
My prob estimates for FTS would be in line with reality roughly 0.5286 of the time.
The other 0.4714 of the time, my prob estimates for FTS would be out of line with reality.
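The shares and the weighted average quoted above can be reproduced from the rank table. A minimal Python sketch (illustration only; `by_rank` and `group` are hypothetical names). Weighting each rank's win rate by its number of starters is the same as dividing the group's total wins by its total plays:

```python
TOTAL_PLAYS = 16373  # all FTS in the sample, from the Data Summary

# rank -> (wins, plays), copied from the rank table
by_rank = {
    1: (368, 1738), 2: (355, 2438), 3: (272, 2482), 4: (201, 2343),
    5: (155, 2042), 6: (127, 1788), 7: (71, 1268), 8: (47, 913),
    9: (24, 626), 10: (15, 429), 11: (6, 196), 12: (3, 99),
    13: (1, 5), 14: (0, 6),
}


def group(ranks):
    """Return (share of all FTS, starter-weighted win rate) for a set of ranks."""
    wins = sum(by_rank[r][0] for r in ranks)
    plays = sum(by_rank[r][1] for r in ranks)
    return plays / TOTAL_PLAYS, wins / plays


for label, ranks in [("rank 1", [1]), ("rank 2", [2]),
                     ("ranks 3-6", range(3, 7)), ("ranks 7-14", range(7, 15))]:
    share, win_rate = group(ranks)
    print(f"{label:<10} share {share:.4f}  win rate {win_rate:.4f}")
```

The four groups cover every starter, so the shares sum to 1; the middle group (ranks 3-6) is the only one whose win rate sits near the flat 0.10 baseline.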
Intuitively, that might seem ok -- especially if I don't plan on betting FTS.
But if I am using a model where the prob estimates for all horses in the race must sum to 1:
Prob estimates for outliers like FTS will impact horses that I primarily bet.
Suppose I am considering whether or not to bet a horse that has run before where I see hidden advantages (let's call him Horse_A) in a race with one FTS.
If I do nothing to handle outliers or FTS the race might look something like this:
Code:
Name     Prob  Odds   EV    Notes
-------  ----  -----  ----  --------------------------
Horse_A  0.26  3-1    1.04  EV = 0.26 x (3 + 1)
Horse_B  0.38  7-5    0.91
Horse_C  0.16  7-2    0.72
Horse_D  0.10  9-2    0.55  FTS
Horse_E  0.07  10-1   0.77
Horse_F  0.03  18-1   0.57
----------------------------
Total    1.00
Under this scenario:
- I'm probably betting Horse_A at EV = 1.04 with or without a rebate.
- I'm not betting the FTS at EV = 0.55
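The EV column is the prob estimate times the payoff per dollar at the posted odds. A small Python sketch of that arithmetic (the function names are hypothetical, and takeout, breakage, and rebates are ignored):

```python
def payoff_per_dollar(odds: str) -> float:
    """Convert fractional odds such as '7-5' to total return per $1 bet."""
    num, den = (int(x) for x in odds.split("-"))
    return num / den + 1.0   # profit per dollar plus the returned stake


def expected_value(prob: float, odds: str) -> float:
    """EV per $1 bet: win probability times payoff per dollar."""
    return prob * payoff_per_dollar(odds)


horse_a = expected_value(0.26, "3-1")   # the horse with hidden advantages
fts = expected_value(0.10, "9-2")       # the first time starter

print(f"Horse_A EV {horse_a:.2f}, FTS EV {fts:.2f}")
```

An EV above 1.00 is a positive-expectation bet before any rebate; below 1.00 it loses money on average.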
But what if I am handling outliers and FTS as described in my above posts?
What if the FTS in this race is one of the rare 0.1062 that deserves to be rated at 0.21?
The prob estimates change -- not just for the FTS -- but for every horse in the race.
Given the revised prob estimate of the FTS, the race might look something like this:
Code:
Name     Prob  Odds   EV    Notes
-------  ----  -----  ----  --------------------------
Horse_A  0.24  3-1    0.96  EV = 0.24 x (3 + 1)
Horse_B  0.37  7-5    0.89
Horse_C  0.13  7-2    0.59
Horse_D  0.21  9-2    1.16  FTS; EV = 0.21 x (4.5 + 1)
Horse_E  0.03  10-1   0.33
Horse_F  0.02  18-1   0.38
----------------------------
Total    1.00
Under this scenario:
- I'm probably not betting Horse_A at EV = 0.96 without a rebate.
- I actually should bet the FTS at EV = 1.16
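One simple way to see the knock-on effect is to raise the FTS estimate and shrink every other horse proportionally so the field still sums to 1. This is an assumption made for illustration: it is not how the hand-picked numbers in the tables were produced, so the results differ slightly. In the Python sketch below, horses are keyed by their tote odds (unique in this race) and the names `rescale` and `ev` are hypothetical:

```python
def rescale(probs, key, new_prob):
    """Set probs[key] to new_prob and scale the others so the total stays 1."""
    factor = (1.0 - new_prob) / (1.0 - probs[key])
    return {k: (new_prob if k == key else p * factor) for k, p in probs.items()}


def ev(prob, odds):
    """EV per $1 bet at fractional odds such as '9-2'."""
    num, den = (int(x) for x in odds.split("-"))
    return prob * (num / den + 1.0)


# Prob estimates from the first table; "9-2" is the FTS, "3-1" is Horse_A.
before = {"3-1": 0.26, "7-5": 0.38, "7-2": 0.16,
          "9-2": 0.10, "10-1": 0.07, "18-1": 0.03}
after = rescale(before, "9-2", 0.21)

for odds in before:
    print(f"{odds:>5}  prob {before[odds]:.3f} -> {after[odds]:.3f}"
          f"  EV {ev(before[odds], odds):.2f} -> {ev(after[odds], odds):.2f}")
```

Under proportional rescaling Horse_A drops from 0.26 to about 0.228, which takes its EV at 3-1 from 1.04 to about 0.91, while the FTS at 9-2 moves above 1.00. The direction matches the tables even though the exact figures do not.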
I realize the numbers in the above race tables were cherry-picked to generate specific EVs for Horse_A and the FTS.
But I hope the above example explains why I decided to create functions for identifying and handling outliers.
-jp