A/I & Computer Handicapping - Page 3 - Horse Racing Forum - PaceAdvantage.Com

HalvOnHorseracing · 01-20-2019, 12:36 PM

Good stuff in this thread.

I suppose I would also be a pencil and paper handicapper. I'm looking for horses that don't stick out for the public.

When it comes to two year old FTS these are the things I look for:

- breeding. Look at the average winning distance for the sire and mare, and the percentage of first time winners.

- trainer. Sometimes you can look at the trainer and stop handicapping. For years Todd Pletcher's first timers at Saratoga would be in the 40% win range, and occasionally one of them might go off at 6-1. Chad Brown is dynamite with young grass fillies. But there are also less well known trainers who do well with young horses. Identify them and get some nice payoffs. If you see a trainer with a 1 for 62 record with FTS, toss the horse.

- workout pattern. I look for horses that show good speed in one of their very early workouts, and then a steady pattern of workouts, mostly at 4-furlongs. This tells me the horse is healthy and has some talent. I shy away from horses that have been working 4-6 months with breaks in the workout pattern.

- gate jockey, like Paco Lopez

- no front wraps

- sale price. It's usually a positive when a horse sells for significantly more than the stud fee

- tote action. I like to see a couple of good hits on the tote board

PaceAdvantage · 01-20-2019, 02:04 PM

Another tremendous thread that proves it's worthwhile to keep doing this after almost 20 years...thank you!

Jeff P · 01-20-2019, 03:09 PM

Quote:

Originally Posted by sjk

The way I handle firsters (and horses who have not run in 120 days) is simple for those looking for such an approach.

Firsters as a group return -25%, which is the same as the return on a randomly chosen horse. I give in to the idea that I do not have enough information to hope to overcome that edge, so I never bet them.

In order to assign odds to all of the other competitors I give the firsters the same chance to win as the public. To do this you need to be willing to adjust your odds on the fly based on real time tote information. So I take the tote odds and normalize all of the chances to 100% to see what remains for the others.

I don't play races where firsters and long layoff horses are more than 1/3 of the field. That cuts out a lot of maiden races.

Of course they beat me, and it is annoying when that happens, but it is just one of many ways to get beat, so I accept it.
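A minimal sketch of the approach in the quote above, assuming win odds are quoted as X-to-1 and that the non-firsters are rated by some private model. The names `public_probs`, `odds_line`, `playable`, and the sample ratings are hypothetical, not anything sjk posted. The firster keeps the public's normalized share, and the remaining probability is split among the other runners in proportion to the model ratings.

```python
def public_probs(tote_odds):
    """Turn X-to-1 tote odds into win chances normalized to sum to 1."""
    raw = {horse: 1.0 / (odds + 1.0) for horse, odds in tote_odds.items()}
    total = sum(raw.values())  # exceeds 1 because of takeout and breakage
    return {horse: p / total for horse, p in raw.items()}


def odds_line(tote_odds, model_ratings, firsters):
    """Firsters get the public's chance; the rest share what remains."""
    public = public_probs(tote_odds)
    remaining = 1.0 - sum(public[h] for h in firsters)
    rating_total = sum(r for h, r in model_ratings.items() if h not in firsters)
    line = {h: public[h] for h in firsters}
    for horse, rating in model_ratings.items():
        if horse not in firsters:
            line[horse] = remaining * rating / rating_total
    return line


def playable(field_size, num_firsters_and_layoffs):
    """Pass the race when firsters/layoff horses exceed 1/3 of the field."""
    return 3 * num_firsters_and_layoffs <= field_size


# Illustrative numbers only (assumed, not from the thread).
tote_odds = {"A": 3.0, "B": 1.4, "C": 3.5, "D": 4.5, "E": 10.0, "F": 18.0}
model_ratings = {"A": 30.0, "B": 35.0, "C": 15.0, "E": 8.0, "F": 4.0}
firsters = {"D"}

line = odds_line(tote_odds, model_ratings, firsters)
```

Because the firster's share comes straight from the tote board, the line has to be recomputed as the odds move, which is the "adjust on the fly" requirement in the quote.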

sjk,

I can certainly appreciate the case for simplicity.

I never created functions to identify and handle outliers and FTS because I was looking to bet FTS.

I created those functions because I wanted better prob estimates for the horses I primarily bet --

Horses with hidden advantages that have run before.

This is what I have in my database for FTS from 01-01-2018 current through Fri 01-18-2019:

Code:

Data Window Settings:
Connected to: C:\JCapper\exe\JCapper2.mdb
999 Divisor  Odds Cap: None
SQL UDM Plays Report: Hide

SQL:  SELECT * FROM STARTERHISTORY
      WHERE STARTSLIFETIME=0 
      AND [DATE] >= #01-01-2018# 
      AND [DATE] <= #01-18-2019# 
      ORDER BY [DATE], TRACK, RACE


Data Summary          Win         Place          Show
-----------------------------------------------------
Mutuel Totals    25008.10      23804.70      22748.00
Bet             -32746.00     -32746.00     -32746.00
-----------------------------------------------------
P/L              -7737.90      -8941.30      -9998.00

Wins                 1645          3247          4824
Plays               16373         16373         16373
PCT                 .1005         .1983         .2946
ROI                0.7637        0.7269        0.6947
Avg Mut             15.20          7.33          4.72

If I were to skip identifying and handling outliers altogether, and if I weren't scoring rider, trainer, post position, workouts, breeding, etc. for FTS:

I'd end up treating all FTS pretty much the same. The above sample would be close to my baseline for every FTS -- and my models would end up generating a prob estimate for every FTS somewhere close to 0.10.
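For anyone checking the arithmetic, the Win column of the summary can be reproduced from three inputs. This is a sketch that assumes a flat $2 bet per play (consistent with Bet = 2 x Plays in the report); the function name `summarize` is mine, not a JCapper call.

```python
def summarize(mutuel_total, plays, wins, bet_per_play=2.0):
    """Recompute P/L, win pct, ROI and average mutuel for one pool."""
    bet = plays * bet_per_play
    return {
        "bet": bet,
        "pl": mutuel_total - bet,         # profit/loss in dollars
        "pct": wins / plays,              # hit rate
        "roi": mutuel_total / bet,        # return per dollar wagered
        "avg_mut": mutuel_total / wins,   # average payoff per winner
    }


# Win pool figures copied from the Data Summary above.
win = summarize(mutuel_total=25008.10, plays=16373, wins=1645)

for key, value in win.items():
    print(f"{key:8s} {value:12.4f}")
```

The ROI of roughly 0.76 is the "-25%" group return mentioned in sjk's quote, and the hit rate of roughly 0.10 is the flat baseline every FTS would get without further scoring.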

That said, this is what the above FTS sample looks like broken out by rank for one of my global models:

Code:

By: SQL-F10 Rank

Rank       P/L        Bet        Roi    Wins   Plays     Pct     Impact     AvgMut
----------------------------------------------------------------------------------
 1     -127.70    3476.00     0.9633     368    1738   .2117     2.1075       9.10  
 2     -378.50    4876.00     0.9224     355    2438   .1456     1.4493      12.67  
 3    -1180.10    4964.00     0.7623     272    2482   .1096     1.0908      13.91  
 4    -1252.50    4686.00     0.7327     201    2343   .0858     0.8539      17.08  
 5    -1205.80    4084.00     0.7048     155    2042   .0759     0.7555      18.57  
 6    -1108.20    3576.00     0.6901     127    1788   .0710     0.7070      19.43  
 7     -619.80    2536.00     0.7556      71    1268   .0560     0.5573      26.99  
 8     -543.70    1826.00     0.7022      47     913   .0515     0.5124      27.28  
 9     -606.20    1252.00     0.5158      24     626   .0383     0.3816      26.91  
10     -451.80     858.00     0.4734      15     429   .0350     0.3480      27.08  
11     -120.70     392.00     0.6921       6     196   .0306     0.3047      45.22  
12     -140.10     198.00     0.2924       3      99   .0303     0.3016      19.30  
13        9.20      10.00     1.9200       1       5   .2000     1.9906      19.20  
14      -12.00      12.00     0.0000       0       6   .0000     0.0000       0.00

The data suggests all FTS are not the same.

About 0.1062 of all FTS can reliably be predicted to win north of 0.21. About 0.1489 of all FTS can reliably be predicted to win at about 0.145.

At the other end of the spectrum, about 0.2163 of all FTS can reliably be predicted to win at 0.056 or less.

If I use a weighted average that reflects the number of starters in each row, the FTS for ranks 3,4,5,6 -- or about 0.5286 of all FTS -- have an avg win rate of about 0.0872.

From this, if I weren't using functions to handle outliers and FTS:

My prob estimates for FTS would be in line with reality roughly 0.5286 of the time.

The other 0.4714 of the time, my prob estimates for FTS would be out of line with reality.
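The shares and the weighted average quoted above can be recomputed from the Plays and Wins columns of the rank table. A short sketch follows; the helper name `group_stats` is hypothetical.

```python
# (plays, wins) per rank, copied from the SQL-F10 Rank table above.
by_rank = {
    1: (1738, 368), 2: (2438, 355), 3: (2482, 272), 4: (2343, 201),
    5: (2042, 155), 6: (1788, 127), 7: (1268, 71), 8: (913, 47),
    9: (626, 24), 10: (429, 15), 11: (196, 6), 12: (99, 3),
    13: (5, 1), 14: (6, 0),
}

total_plays = sum(plays for plays, _ in by_rank.values())


def group_stats(ranks):
    """Return (share of all FTS, starter-weighted win rate) for a set of ranks."""
    plays = sum(by_rank[r][0] for r in ranks)
    wins = sum(by_rank[r][1] for r in ranks)
    return plays / total_plays, wins / plays


top_share, top_rate = group_stats([1])
mid_share, mid_rate = group_stats([3, 4, 5, 6])
low_share, low_rate = group_stats(range(7, 15))

print(f"rank 1    : share {top_share:.4f}, win rate {top_rate:.4f}")
print(f"ranks 3-6 : share {mid_share:.4f}, win rate {mid_rate:.4f}")
print(f"ranks 7-14: share {low_share:.4f}, win rate {low_rate:.4f}")
```

Summing wins and plays across rows before dividing is what makes the average starter-weighted; averaging the four row percentages directly would give each rank equal weight regardless of sample size.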

Intuitively, that might seem ok -- especially if I don't plan on betting FTS.

But if I am using a model where the prob estimates for all horses in the race must sum to 1:

Prob estimates for outliers like FTS will impact horses that I primarily bet.

Suppose I am considering whether or not to bet a horse that has run before where I see hidden advantages (let's call him Horse_A) in a race with one FTS.

If I do nothing to handle outliers or FTS the race might look something like this:

Code:

Name       Prob   Odds  EV    Notes
-------    ----  -----  ----  --------------------------
Horse_A    0.26    3-1  1.04  EV = 0.26 x (3 + 1)
Horse_B    0.38    7-5  0.91  
Horse_C    0.16    7-2  0.72
Horse_D    0.10    9-2  0.55  FTS
Horse_E    0.07   10-1  0.77
Horse_F    0.03   18-1  0.57
----------------------------
Total      1.00

Under this scenario:

I'm probably betting Horse_A at EV = 1.04 with or without a rebate.
I'm not betting the FTS at EV = 0.55
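The EV column is just prob x (odds + 1), where the odds are expressed as X-to-1 (so 7-5 is 1.4 and 9-2 is 4.5). A small sketch that recomputes the table; the row labels and the `ev` helper are my own naming, and the 1.00 break-even threshold ignores rebates.

```python
def ev(prob, odds_to_one):
    """Expected return per $1 bet: win prob times total payoff per $1."""
    return prob * (odds_to_one + 1.0)


# (label, prob estimate, odds-to-1) taken from the table above.
race = [
    ("Horse_A", 0.26, 3.0),
    ("Horse_B", 0.38, 1.4),     # 7-5
    ("Horse_C", 0.16, 3.5),     # 7-2
    ("FTS", 0.10, 4.5),         # 9-2
    ("10-1 shot", 0.07, 10.0),
    ("18-1 shot", 0.03, 18.0),
]

evs = {name: ev(prob, odds) for name, prob, odds in race}
bets = [name for name, value in evs.items() if value > 1.0]

for name, value in evs.items():
    print(f"{name:10s} EV = {value:.2f}")
print("Bets:", bets)
```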

But what if I am handling outliers and FTS as described in my above posts?

What if the FTS in this race is one of the rare 0.1062 that deserves to be rated at 0.21?

The prob estimates change -- not just for the FTS -- but for every horse in the race.

Given the revised prob estimate of the FTS, the race might look something like this:

Code:

Name       Prob   Odds  EV    Notes
-------    ----  -----  ----  --------------------------
Horse_A    0.24    3-1  0.96  EV = 0.24 x (3 + 1)
Horse_B    0.37    7-5  0.89  
Horse_C    0.13    7-2  0.59
Horse_D    0.21    9-2  1.16  FTS  EV = 0.21 x (4.5 + 1)
Horse_E    0.03   10-1  0.33
Horse_F    0.02   18-1  0.38
----------------------------
Total      1.00

Under this scenario:

I'm probably not betting Horse_A at EV = 0.96 without a rebate.
I actually should bet the FTS at EV = 1.16
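One simple way to see the knock-on effect is to override the FTS estimate and rescale everyone else proportionally so the field still sums to 1. To be clear, proportional rescaling is an assumption made for this sketch; the revised table above was set by hand and is not proportional, so the numbers below differ slightly from it. The function name `override_prob` is hypothetical.

```python
def override_prob(probs, name, new_prob):
    """Set one horse's win prob and rescale the others so the total stays 1."""
    scale = (1.0 - new_prob) / (1.0 - probs[name])
    return {h: (new_prob if h == name else p * scale) for h, p in probs.items()}


def ev(prob, odds_to_one):
    """Expected return per $1 bet."""
    return prob * (odds_to_one + 1.0)


# Baseline estimates and odds from the first race table.
probs = {"Horse_A": 0.26, "Horse_B": 0.38, "Horse_C": 0.16,
         "FTS": 0.10, "10-1 shot": 0.07, "18-1 shot": 0.03}
odds = {"Horse_A": 3.0, "Horse_B": 1.4, "Horse_C": 3.5,
        "FTS": 4.5, "10-1 shot": 10.0, "18-1 shot": 18.0}

revised = override_prob(probs, "FTS", 0.21)
revised_ev = {h: ev(p, odds[h]) for h, p in revised.items()}

for h in probs:
    print(f"{h:10s} {probs[h]:.3f} -> {revised[h]:.3f}   EV {revised_ev[h]:.2f}")
```

Under this assumption Horse_A drops from 0.26 to about 0.23 and falls below break-even, while the FTS moves above it, which is the same qualitative flip shown in the two tables.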

I realize the numbers in the above race tables were cherry picked to generate specific EV's for Horse_A and the FTS.

But I hope the above example explains why I decided to create functions for identifying and handling outliers.

-jp

.

Jeff P · 01-20-2019, 03:39 PM

In my most recent post, I accidentally hit the submit button and ran out of edit time before I was finished.

Instead of:

Quote:

If I were to skip identifying and handling outliers altogether, and if I weren't scoring rider, trainer, post position, workouts, breeding, etc. for FTS:

I'd end up treating all FTS pretty much the same. The above sample would be close to my baseline for every FTS -- and my models would end up generating a prob estimate for every FTS somewhere close to 0.10.

I meant to post the following:

Quote:

If I were to skip identifying and handling outliers altogether, and if I weren't scoring rider, trainer, post position, workouts, breeding, etc. for FTS:

I'd end up treating all FTS pretty much the same. The above sample would be close to my baseline for every FTS -- and my models would end up generating a before-the-odds-are-known prob estimate for every FTS somewhere close to 0.10.

I hope that makes sense.

-jp

.

sjk · 01-21-2019, 08:16 AM

Jeff,

Looks like you have a fine measure to distinguish FTS.

If tote information is available it is going to give a wider scope to the adjustments you make to the other runners.

The difference between the .1 and the .21 is going to swing the odds on the others by around 10%. I require a much higher level of EV than the 1.16 so I am generally according a higher allowance for the public odds. A change of 10% in calculated odds is only going to affect plays at the margin.

Looks like FTS which go off at 3-1 or less win around 31% of the time and those which are odds on win around half of the time. (These are not exact stats; quick calculation). If I were up against one of these I would think it relevant to the odds line.

classhandicapper · 01-21-2019, 10:51 AM

My database is made up of DRF Formulator PP data and Text Results Chart files for all the major tracks since late 2014. It also includes the same data for any track not on that list that ran a Graded Stakes on that day in North America.

I don't use my database to generate odds lines or automatic plays.

I use it to do research, answer controversial handicapping questions, to develop, test, and optimize personal metrics, and to just generally make the handicapping process easier.

At the start of the handicapping day I get a series of reports that identify the races most likely to offer a play, horses that are likely to be better or worse than they look on paper, and personalized performance ratings I've demonstrated outperform speed figures. Then I focus my attention on where I think the best opportunities to find value will be and handicap manually after that.

IMO, the best part about having a database is the ability to do research and answer handicapping questions. I can test what I or other handicappers "think" and get past all the faulty ideas and theories out there very quickly.

classhandicapper · 01-21-2019, 10:54 AM

I don't try to make any projections on FTS in my reports or metrics. I just note the number of FTS. That allows me to lower the confidence level on anything related to the race as the number of FTS increases.

classhandicapper · 01-21-2019, 11:45 AM

You guys have given me a pretty good idea.

As described above, I don't change any of my numbers to reflect FTS. I identify the number of FTS and alter my thinking about the probability of my projections being right based on that # of FTS. I change the risk profile.

However, there's no reason I can't start calculating actual probabilities like that.

Sprint, Dirt, 8 horse field, 2 FTS - How often does one of the FTS get the lead or race within 1/2 length pressing?

Sprint, Dirt, 8 horse field, 3 FTS - How often does one of the FTS get the lead or race within 1/2 length pressing?

Sprint, Dirt, 9 horse field, 2 FTS - How often does one of the FTS get the lead or race within 1/2 length pressing?

etc...

I can create a table by Sprint/Route, Dirt/Turf, Field Size, and # of FTS to get the actual probabilities. Then I could even odds adjust it.
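A rough sketch of how that lookup table could be built from race records. The record layout (`distance_type`, `surface`, `field_size`, `num_fts`, `fts_led_or_pressed`) is a hypothetical schema, and the four sample rows are made-up placeholders just to show the mechanics, not real results.

```python
from collections import defaultdict


def fts_lead_table(races):
    """Map (Sprint/Route, Dirt/Turf, field size, # FTS) to the observed rate
    at which some FTS got the lead or pressed within 1/2 length."""
    counts = defaultdict(lambda: [0, 0])  # key -> [races seen, races with FTS on/near lead]
    for race in races:
        key = (race["distance_type"], race["surface"],
               race["field_size"], race["num_fts"])
        counts[key][0] += 1
        if race["fts_led_or_pressed"]:
            counts[key][1] += 1
    return {key: hits / seen for key, (seen, hits) in counts.items()}


# Placeholder rows only; a real run would pull these from the chart database.
sample_races = [
    {"distance_type": "Sprint", "surface": "Dirt", "field_size": 8,
     "num_fts": 2, "fts_led_or_pressed": True},
    {"distance_type": "Sprint", "surface": "Dirt", "field_size": 8,
     "num_fts": 2, "fts_led_or_pressed": False},
    {"distance_type": "Sprint", "surface": "Dirt", "field_size": 8,
     "num_fts": 3, "fts_led_or_pressed": True},
    {"distance_type": "Route", "surface": "Turf", "field_size": 9,
     "num_fts": 2, "fts_led_or_pressed": False},
]

table = fts_lead_table(sample_races)
for key, rate in sorted(table.items()):
    print(key, f"{rate:.2f}")
```

Cells with only a handful of races will be noisy, so in practice it probably makes sense to keep the race count alongside each rate and ignore thin cells.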

jay68802 · 01-21-2019, 12:11 PM

Got to give this thread a thumbs up. I use a computer to handicap. The one thing that I can't do is FTS's. The problem is if the ideas keep coming I am going to try and use them. And that means more writing and less handicapping.

formula_2002 · 01-21-2019, 08:23 PM

Quote:

Originally Posted by PIC6SIX

I am asking only as a point of interest since I am 73 years old and a dyed in the wool pen and pencil capper. I would like to know from where, what and how you computer guys gather/manipulate your info. Do you buy downloads from DRF then sort such data according to your program (your own hcp parameters). Maybe someone would like to go through their step by step handicapping process and how long it takes to hcp an 8 race card at one track. No handicapping secrets asked.

PIC6SIX,
I’m about 9 years your senior.
I’ve been working with computers since DOS first came out, and I studied exacta play well before that.
I have forgotten more than I can remember, but I still have my hand in the game, though not for betting of course.
That will change once I get one of these darn formulas right.
I use Excel now. At one time I used dBASE 5.
I do use a program that I wrote to download the TwinSpires tote board WPS and exacta data into my Excel file. I must have a few thousand of them by now. I depend on the public to leave a bit of meat on the table for me. I don’t use anything other than the tote board data.
Other than for me, it’s not a user-friendly program.
From there it’s “if, then, else, and, or”, etc.
That’s the short of it.
Good luck with your endeavors.

Buckeye · 01-21-2019, 08:43 PM

Well here's the story, nobody knows what's going to happen but they bet with that truth in mind even though it's false.

Who's your competition?

What do they do?

Buckeye · 01-21-2019, 08:49 PM

Basically, they are betting like they know what the truth will be and that's problematic.

headhawg · 01-21-2019, 11:57 PM

Quote:

Originally Posted by Buckeye

Basically, they are betting like they know what the truth will be and that's problematic.

So you use the dartboard method or pick horse names out of a hat then, yes?

Augenj · 01-22-2019, 12:46 AM

I've hesitated to reply to this post because of the intimidating nature of the replies. Some of the terms in the replies are over my head and others are hard to understand, even with my decades of both horse race handicapping and a career in Information Technology. Some background... I was there at the dawn of both mainframe and personal computers. My career spanned most jobs in IT from computer operator up through installing and maintaining IBM operating systems on mainframe computers, a job that left me as a burned out former shell of myself.

Let me explain how I see computers and horse racing from a simple person's point of view using a seat-of-the-pants type of regression analysis and forecasting. My personal computer system uses basic math with only square root as its most exotic function. The language is Microsoft VB.Net. It's a black box type of system where at the end of the process you push a Forecast button and a day's races are calculated in under a second. The approach I use isn't an end-all, be-all solution to computerized handicapping and it probably never will be. However, it continues to show promise and I intend to make it better. See THA Free Picks here at PA for an example.

PIC6SIX posted "Maybe someone would like to go through their step by step handicapping process and how long it takes to hcp an 8 race card at one track. No handicapping secrets asked." and here's my response.

Short answer is:
* Download CSV PP data files for the track.
* Select the parameter file for the track with a button.
* Select the data file for the track with a button.
* Click the Forecast button.
Total time is less than a minute.

Longer answer is:
* Download a full year's CSV PP data for a track.
* Build a data base with the downloaded data.
* Run a regression analysis on the data base to calculate and optimize parameters for jockey, trainer, speed, finish, earnings, workouts, lead tendency, and closer tendency.
* Combine these base factors into a single rating for the horse. If FTS or foreign factors have data missing, use the mean of the other horses' factors.
* Calculate several hundred conditional factors and their opposites to pair with highest rated horse.
* Select the best of the profitable factors for possible betting.
All of this is done programmatically except for the download.
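The missing-data step in the list above (use the mean of the other horses' factors for FTS or foreign horses) could look something like this. It is a Python sketch for illustration only, since the actual system is written in VB.Net; the factor names and the `fill_missing` helper are assumptions.

```python
def fill_missing(field, factors):
    """Replace a horse's missing factor with the mean of that factor
    over the horses in the same race that do have a value."""
    filled = {name: dict(values) for name, values in field.items()}  # copy, leave input intact
    for factor in factors:
        known = [v[factor] for v in field.values() if v.get(factor) is not None]
        if not known:
            continue  # nobody in the race has this factor; nothing to average
        mean = sum(known) / len(known)
        for values in filled.values():
            if values.get(factor) is None:
                values[factor] = mean
    return filled


# Illustrative field: two horses with past performances and one firster.
field = {
    "Vet_1": {"speed": 80.0, "finish": 3.0},
    "Vet_2": {"speed": 90.0, "finish": 1.0},
    "Firster": {"speed": None, "finish": None},
}

filled = fill_missing(field, ["speed", "finish"])
print(filled["Firster"])
```

Note that this treats every FTS as an average member of its field on the missing factors, which is close to the flat-baseline treatment discussed earlier in the thread.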

Do the short answer above as needed.

As they used to say in IT, this is the view from 50,000 feet, meaning that a lot of detail has not been mentioned.

My hat's off to those who drilled deeper in earlier replies.

jasperson · 01-22-2019, 11:42 AM

Quote:

Originally Posted by jay68802

Got to give this thread a thumbs up. I use a computer to handicap. The one thing that I can't do is FTS's. The problem is if the ideas keep coming I am going to try and use them. And that means more writing and less handicapping.

My method with FTS is to evaluate the field that he will be running against. When a friend of mine won with a FTS and was asked "How did you pick that long shot?", he answered that the horse hadn't proved he couldn't do it yet. He was up against a field of maidens that couldn't run close to par. This is the method I use also.
I know it is not very scientific, but for me it works. I never bet against a horse that ran par or better in his last race. The first race at GP Saturday is an example. One horse had run an 86 at 7f, and par for the race was 86. The bettors made the FTS the odds-on favorite and he finished out of the money. The horse that had run par paid a nice $10.40 to win. I love maiden races.