Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 09-03-2020, 12:32 PM   #1
hpollock
Apprentice
 
Join Date: Jul 2019
Posts: 10
How would you quantify being held up for clear running for inclusion in model?

How would you quantify or factor this into a model?
Old 09-03-2020, 12:43 PM   #2
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by hpollock View Post
How would you quantify or factor this into a model?
What does 'being held up for clear running' mean? The jockey holding back the horse?
Old 09-03-2020, 02:33 PM   #3
cj
@TimeformUSfigs
 
Join Date: Jan 2002
Location: Moore, OK
Posts: 46,828
Quote:
Originally Posted by CBYRacer View Post
What does 'being held up for clear running' mean? The jockey holding back the horse?
Sounds to me like stuck in traffic.
Old 09-03-2020, 02:42 PM   #4
thaskalos
Registered User
 
Join Date: Jan 2006
Posts: 28,549
Some people post as if these are telegrams where they charge you by the letter.
__________________
Live to play another day.
Old 09-03-2020, 02:43 PM   #5
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by cj View Post
Sounds to me like stuck in traffic.
Got it.

For model inclusion, I use Python to parse through the race comments of each horse. I have a lexicon (dictionary) of trip terms that I developed manually and classify these terms into buckets like 'Encountered traffic', 'Wide', 'Very Wide', etc. I then one-hot encode these buckets and feed those values into the model.
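A rough sketch of what that lexicon-and-buckets step could look like, using only the standard library. The phrases in LEXICON and the function names are made up for illustration; a real lexicon would be built by hand from the chart comments you actually see.

```python
import re

# Hypothetical lexicon: bucket name -> phrases that map to it.
# These phrases are illustrative assumptions, not a vetted list.
LEXICON = {
    "Encountered traffic": ["steadied", "checked", "blocked", "shuffled back", "in tight"],
    "Very Wide": ["5 wide", "6 wide", "very wide"],
    "Wide": ["3 wide", "4 wide", "wide"],
}
BUCKETS = list(LEXICON)


def classify(comment):
    """Return the set of trip buckets whose phrases appear in a chart comment."""
    # Lowercase and treat hyphens as spaces so '5-wide' matches '5 wide'.
    text = comment.lower().replace("-", " ")
    found = set()
    for bucket, phrases in LEXICON.items():
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases):
            found.add(bucket)
    # '5 wide' also contains 'wide', so keep only the stronger bucket.
    if "Very Wide" in found:
        found.discard("Wide")
    return found


def one_hot(comment):
    """Turn a comment into one 0/1 column per bucket, ready for a model."""
    found = classify(comment)
    return {bucket: int(bucket in found) for bucket in BUCKETS}


if __name__ == "__main__":
    print(one_hot("Steadied early, 5 wide into lane"))
```

Plain phrase matching like this will miss wording that varies from one chart caller to the next, so the lexicon probably needs review per circuit.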

Curious how others handle this as well?
Old 09-03-2020, 08:36 PM   #6
headhawg
crusty old guy
 
Join Date: Aug 2003
Location: Snarkytown USA
Posts: 3,917
IMO, if you are going to model anything that has to do with trips you better be watching the videos and taking your own notes. Parsing from charts will simply add noise and have no ROI advantage because everyone sees the same thing.
__________________
"Don't believe everything that you read on the Internet." -- Abraham Lincoln
Old 09-03-2020, 10:48 PM   #7
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by headhawg View Post
IMO, if you are going to model anything that has to do with trips you better be watching the videos and taking your own notes. Parsing from charts will simply add noise and have no ROI advantage because everyone sees the same thing.
No doubt that watching the race replay firsthand is optimal if you have the time. At the same time, unless you statistically test the parsed comments, you can't conclude that there will not be an ROI advantage. The public may systematically over- or underweight certain chart comments based on their own personal biases. They may THINK a certain comment means something but not KNOW that it does. Also, I inferred from headhawg's post that he wants to incorporate something into his computer model. Notes from watching replays, while useful, can't be tested in a model unless you code them somehow or use natural language processing on your notes. Again, this process would be extremely time-consuming unless you hired someone to do it. Are there ways to do this efficiently?
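As a minimal sketch of what "statistically test the parsed comments" could start with, here is a flat-bet ROI calculation for every horse carrying a given comment flag. The rows below are made-up toy values, not real results, and the field names ("traffic", "odds", "won") are assumptions. Odds are assumed to be odds-to-1.

```python
def flat_bet_roi(rows, flag, stake=2.0):
    """Flat-bet ROI on every horse whose `flag` column equals 1.

    Returns (roi, number_of_bets); roi is None when nothing qualifies.
    Assumes `odds` is quoted as odds-to-1, so a winner returns stake * (odds + 1).
    """
    bets = [r for r in rows if r[flag] == 1]
    if not bets:
        return None, 0
    staked = stake * len(bets)
    returned = sum(stake * (r["odds"] + 1.0) for r in bets if r["won"] == 1)
    return (returned - staked) / staked, len(bets)


# Toy rows for illustration only (invented values, not real race data).
TOY_ROWS = [
    {"traffic": 1, "odds": 2.5, "won": 1},
    {"traffic": 1, "odds": 13.4, "won": 0},
    {"traffic": 0, "odds": 4.5, "won": 0},
    {"traffic": 1, "odds": 6.0, "won": 0},
]

if __name__ == "__main__":
    roi, n = flat_bet_roi(TOY_ROWS, "traffic")
    print(f"bets={n} roi={roi:.3f}")
```

An ROI from a handful of bets is noise. The number only starts to mean something over a large sample, and ideally on races that were not used to pick the flag in the first place.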
Old 09-03-2020, 11:10 PM   #8
jay68802
Registered User
 
Join Date: May 2008
Location: Nebraska
Posts: 15,121
One of my best days at the track came about because of comments. The comments were:

Rank, stopped.
Rated, no response

After looking at the replays, I came to the conclusion that the comments should have read:

Again????
Why is he rating this horse, put him on the lead.

Horse went wire to wire at 38-1.

I rarely make any adjustments to pace and speed figures because of comments. Only make adjustments after watching replays.
Old 09-03-2020, 11:15 PM   #9
headhawg
crusty old guy
 
Join Date: Aug 2003
Location: Snarkytown USA
Posts: 3,917
Quote:
Originally Posted by CBYRacer View Post
No doubt that watching the race replay firsthand is optimal if you have the time. At the same time, unless you statistically test the parsed comments, you can't conclude that there will not be an ROI advantage. The public may systematically over- or underweight certain chart comments based on their own personal biases. They may THINK a certain comment means something but not KNOW that it does. Also, I inferred from headhawg's post that he wants to incorporate something into his computer model. Notes from watching replays, while useful, can't be tested in a model unless you code them somehow or use natural language processing on your notes. Again, this process would be extremely time-consuming unless you hired someone to do it. Are there ways to do this efficiently?
I have no desire to model chart comments as the pursuit doesn't seem to be worthwhile to me. Does the same person make all of the comments in all of the charts? Of course not. Then how do we know that all chart creators use the same words to describe identical (or near-identical) trips? We don't, so to me that seems akin to GIGO. Much of the data that handicappers use already has built-in inaccuracy. I think that trying to model too many things just keeps adding more error into the mix. Just my .02.
__________________
"Don't believe everything that you read on the Internet." -- Abraham Lincoln
Old 09-04-2020, 12:03 AM   #10
Jeff P
Registered User
 
Join Date: Dec 2001
Location: JCapper Platinum: Kind of like Deep Blue... but for horses.
Posts: 5,289
Assuming you are using something like conditional or multinomial logistic regression for your model --

Classify your trip types. To keep things simple, letter codes like TripA and TripB, etc. should work just fine.

Imo, it doesn't matter what your trip types are (at least not at first.) As you move forward with your model, the data will tell you which of your trip types (if any) are significant and which you can safely discard.

The important thing is to classify your trip types, give each a distinct letter code, compile the data for each of your trip types in a consistent manner, and include a column for your trip types in the history table you are using to accumulate data for purposes of building your model.

For example purposes, below is a simple history table that contains data for Remington Park R1 on 09-03-2020.

The Track, rDate, Race, Surf, Dist, and Odds columns should be self explanatory.

The Horse column contains the horse's position in the starting gate from the rail out.

The Speed column contains the horse's HDW final time speed fig from its most recent running line. (Imo, there's nothing magic about last race running line speed fig. Just using it here for example purposes.)

The TripA column contains a value of 1 for True in cases where the horse qualifies as a Trip Type A. Otherwise it contains a value of 0 for False. In this case Trip Type A describes a poor start last out.

The TripB column contains a value of 1 for True in cases where the horse qualifies as a Trip Type B. Otherwise it contains a value of 0 for False. In this case Trip Type B describes a horse that was making an outside closing move on the far turn (or trying to) last out.

The Wnr column is assigned a value of 1 to indicate True this horse won this race. All other horses are assigned a value of 0 for False.

The table structure with data looks something like this:

Code:
Track  rDate     Race  Surf  Dist  Horse  Speed  TripA  TripB   Odds  Wnr
-----  --------  ----  ----  ----  -----  -----  -----  -----  -----  ---
  RPX  9/3/2020     1     1  1210      1     67      0      0    4.5    0
  RPX  9/3/2020     1     1  1210      2     61      0      0   29.5    0
  RPX  9/3/2020     1     1  1210      3     55      1      0    2.5    1
  RPX  9/3/2020     1     1  1210      4     70      0      0    1.3    0
  RPX  9/3/2020     1     1  1210      5     63      1      0   13.4    0
  RPX  9/3/2020     1     1  1210      6     71      0      1    3.4    0
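
For anyone keeping the history as plain text, a small sketch of reading a table laid out like the one above into typed records with the standard library. The function name parse_history and the INT_COLS grouping are assumptions for illustration; the column names come straight from the table.

```python
# The table above, pasted as-is.
TABLE = """\
Track  rDate     Race  Surf  Dist  Horse  Speed  TripA  TripB   Odds  Wnr
-----  --------  ----  ----  ----  -----  -----  -----  -----  -----  ---
  RPX  9/3/2020     1     1  1210      1     67      0      0    4.5    0
  RPX  9/3/2020     1     1  1210      2     61      0      0   29.5    0
  RPX  9/3/2020     1     1  1210      3     55      1      0    2.5    1
  RPX  9/3/2020     1     1  1210      4     70      0      0    1.3    0
  RPX  9/3/2020     1     1  1210      5     63      1      0   13.4    0
  RPX  9/3/2020     1     1  1210      6     71      0      1    3.4    0
"""

# Columns stored as whole numbers; Odds is the only decimal column.
INT_COLS = ("Race", "Surf", "Dist", "Horse", "Speed", "TripA", "TripB", "Wnr")


def parse_history(text):
    """Parse a whitespace-aligned history table into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[2:]:  # skip the header row and the dashed rule
        rec = dict(zip(header, ln.split()))
        for col in INT_COLS:
            rec[col] = int(rec[col])
        rec["Odds"] = float(rec["Odds"])
        rows.append(rec)
    return rows


if __name__ == "__main__":
    for rec in parse_history(TABLE):
        print(rec["Horse"], rec["Speed"], rec["TripA"], rec["TripB"], rec["Odds"], rec["Wnr"])
```

Splitting on whitespace only works because no field in this table contains a space; a real history file with horse names would need a delimiter such as commas or tabs.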

After your history contains data for a few thousand races, and if you've done a good job of compiling your trip type data in a consistent manner, run the data through a third party stat tool such as SPSS, Stata, or one of the logistic regression packages in R. The tool should be able to display significance for your trip types.

From there you should be in a position to make an informed decision whether or not to include your trip types in your model.
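To make the conditional logit idea a little more concrete, here is a bare-bones sketch of the quantity such a model works with: each horse gets a score from its columns, the scores are turned into within-race win probabilities, and the race contributes the log of the winner's probability. The six rows are the Speed, TripA, TripB, and Wnr values from the table above. The coefficient values are placeholders picked for illustration, not fitted numbers, and the function names are assumptions.

```python
import math

# (Speed, TripA, TripB, Wnr) for the six starters in the table above.
RACE = [
    (67, 0, 0, 0),
    (61, 0, 0, 0),
    (55, 1, 0, 1),
    (70, 0, 0, 0),
    (63, 1, 0, 0),
    (71, 0, 1, 0),
]


def win_probs(race, beta):
    """Within-race win probabilities: softmax of each horse's linear score."""
    scores = [sum(b * x for b, x in zip(beta, row[:3])) for row in race]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def race_log_likelihood(race, beta):
    """Log of the probability the model assigned to the actual winner."""
    probs = win_probs(race, beta)
    return sum(math.log(p) for p, row in zip(probs, race) if row[3] == 1)


if __name__ == "__main__":
    beta = (0.05, 0.30, 0.20)  # placeholder coefficients, NOT fitted values
    print([round(p, 3) for p in win_probs(RACE, beta)])
    print(round(race_log_likelihood(RACE, beta), 3))
```

A stat package does the real work: it searches for the coefficients that maximize the sum of these log-likelihoods over every race in the history, and it reports a standard error and significance level for each coefficient. One race says nothing on its own.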

Hope I managed to type most of that out in a way that makes sense,


-jp

.
__________________
Team JCapper: 2011 PAIHL Regular Season ROI Leader after 15 weeks
www.JCapper.com

Last edited by Jeff P; 09-04-2020 at 12:04 AM.
Old 09-04-2020, 12:12 AM   #11
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by Jeff P View Post
Assuming you are using something like conditional or multinomial logistic regression for your model --

Classify your trip types. To keep things simple, letter codes like TripA and TripB, etc. should work just fine.

Imo, it doesn't matter what your trip types are (at least not at first.) As you move forward with your model, the data will tell you which of your trip types (if any) are significant and which you can safely discard.

The important thing is to classify your trip types, give each a distinct letter code, compile the data for each of your trip types in a consistent manner, and include a column for your trip types in the history table you are using to accumulate data for purposes of building your model.

For example purposes, below is a simple history table that contains data for Remington Park R1 on 09-03-2020.

The Track, rDate, Race, Surf, Dist, and Odds columns should be self explanatory.

The Horse column contains the horse's position in the starting gate from the rail out.

The Speed column contains the horse's HDW final time speed fig from its most recent running line. (Imo, there's nothing magic about last race running line speed fig. Just using it here for example purposes.)

The TripA column contains a value of 1 for True in cases where the horse qualifies as a Trip Type A. Otherwise it contains a value of 0 for False. In this case Trip Type A describes a poor start last out.

The TripB column contains a value of 1 for True in cases where the horse qualifies as a Trip Type B. Otherwise it contains a value of 0 for False. In this case Trip Type B describes a horse that was making an outside closing move on the far turn (or trying to) last out.

The Wnr column is assigned a value of 1 to indicate True this horse won this race. All other horses are assigned a value of 0 for False.

The table structure with data looks something like this:

Code:
Track  rDate     Race  Surf  Dist  Horse  Speed  TripA  TripB   Odds  Wnr
-----  --------  ----  ----  ----  -----  -----  -----  -----  -----  ---
  RPX  9/3/2020     1     1  1210      1     67      0      0    4.5    0
  RPX  9/3/2020     1     1  1210      2     61      0      0   29.5    0
  RPX  9/3/2020     1     1  1210      3     55      1      0    2.5    1
  RPX  9/3/2020     1     1  1210      4     70      0      0    1.3    0
  RPX  9/3/2020     1     1  1210      5     63      1      0   13.4    0
  RPX  9/3/2020     1     1  1210      6     71      0      1    3.4    0

After your history contains data for a few thousand races, and if you've done a good job of compiling your trip type data in a consistent manner, run the data through a third party stat tool such as SPSS, Stata, or one of the logistic regression packages in R. The tool should be able to display significance for your trip types.

From there you should be in a position to make an informed decision whether or not to include your trip types in your model.

Hope I managed to type most of that out in a way that makes sense,


-jp

.
This is exactly what I was referring to. Jeff, are your Trip A, Trip B, etc. from watching race replays or parsing chart comments? If the former, how long did it take you to accumulate enough data (i.e., watch that many replays) for your model?
Old 09-04-2020, 06:40 AM   #12
sjk
Registered User
 
Join Date: Feb 2003
Posts: 2,105
At one time I thought about parsing the comments but it looked like they were so different from one circuit to the next it would be difficult to classify by machine and I had no interest in looking at races one by one by one.

As I recall it looked like you could do pretty well betting horses that lost their jockey last out.
Old 09-04-2020, 08:52 AM   #13
Robert Fischer
clean money
 
Join Date: Sep 2006
Location: Maryland
Posts: 23,558
I think it's worth it to pay a competent 'trip guy'.

Even if it's just a simple 'neutral' -1 or +1 ...

at least you then have a model and can see if there is a potential synergy (agrees) or tradeoff (disagrees) with a horse that is a potential play.


You get horses that were best and were race-ridden and shuffled back and trapped

you got horses who had a drive or rally that was 'muted'

then you got horses who in reality 'saved' a bunch of energy and got a dream trip while it superficially looks like trouble
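
A tiny sketch of that -1 / neutral / +1 idea, assuming the trip guy's grade is stored as -1, 0, or +1 per horse and that every horse passed in is already a potential play from the model. The names trip_check and the horse labels are hypothetical.

```python
def trip_check(trip_grade):
    """Label how a trip grade lines up with a horse the model already likes.

    trip_grade: +1 (trip says upgrade), 0 (neutral), -1 (trip says downgrade).
    """
    if trip_grade not in (-1, 0, 1):
        raise ValueError("trip_grade must be -1, 0, or +1")
    if trip_grade == 0:
        return "neutral"
    return "synergy" if trip_grade > 0 else "tradeoff"


def review_plays(plays):
    """plays: dict of horse label -> trip grade for the model's potential plays."""
    return {horse: trip_check(grade) for horse, grade in plays.items()}


if __name__ == "__main__":
    # Hypothetical horses and grades, for illustration only.
    print(review_plays({"Horse A": 1, "Horse B": -1, "Horse C": 0}))
```

Stored this way, the grade can also go straight into the history table as its own column and be tested like any other factor.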
__________________
Preparation. Discipline. Patience. Decisiveness.
Old 09-04-2020, 09:17 AM   #14
classhandicapper
Registered User
 
Join Date: Mar 2005
Location: Queens, NY
Posts: 20,610
Jeff,

That's a nice approach.

I've been able to test computer generated race flow and bias notes, but nothing beyond that. The rest has been trial and error experience. I've concluded T & E is a risky way to learn. A short flurry of random successes or failures can cause you to come to an incorrect conclusion that lasts for years.
__________________
"Unlearning is the highest form of learning"
Old 09-04-2020, 10:43 AM   #15
CBYRacer
Registered User
 
Join Date: Jun 2020
Posts: 178
Quote:
Originally Posted by Robert Fischer View Post
I think it's worth it to pay a competent 'trip guy'.

Even if it's just a simple 'neutral' -1 or +1 ...

at least you then have a model and can see if there is a potential synergy (agrees) or tradeoff (disagrees) with a horse that is a potential play.


You get horses that were best and were race-ridden and shuffled back and trapped

you got horses who had a drive or rally that was 'muted'

then you got horses who in reality 'saved' a bunch of energy and got a dream trip while it superficially looks like trouble
I like this approach, Robert. Have you done this before? If so, any suggestions on where to find the person?