Does this type of stat exist? [Archive] - Horse Racing Forum - PaceAdvantage.Com

SG4

01-08-2018, 03:13 PM

I was wondering if any kind of stat or program exists to give jockeys something of a performance expectation rating. What I mean by this is say a jockey is on the 4th betting choice in a race & they finish 2nd, I'd give them a +2 score that race, or if you're on a favorite and finish 5th they'd get a -4 for that race. Wonder if an average score would help identify riders who may be over/underestimated by the betting public, or could possibly help identify upcoming apprentice jockeys as ones to watch.
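For anyone who wants to try the basic idea, here is a minimal Python sketch of the scoring described above: odds rank minus finish position, averaged per jockey. The sample rows, jockey names, and function names are made up for illustration and do not come from any real database.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (jockey, odds_rank, finish_position).
rides = [
    ("Jockey A", 4, 2),  # 4th betting choice finishes 2nd -> +2
    ("Jockey A", 1, 5),  # favorite finishes 5th           -> -4
    ("Jockey B", 6, 3),
    ("Jockey B", 2, 2),
]

def expectation_score(odds_rank, finish_position):
    """Positive when the horse finished better than its betting rank."""
    return odds_rank - finish_position

def average_scores(rows):
    """Average the per-race score for each jockey."""
    by_jockey = defaultdict(list)
    for jockey, odds_rank, finish in rows:
        by_jockey[jockey].append(expectation_score(odds_rank, finish))
    return {jockey: mean(scores) for jockey, scores in by_jockey.items()}

scores = average_scores(rides)
for jockey, avg in sorted(scores.items()):
    print(jockey, avg)
```

A minimum-mounts filter would probably be needed before comparing riders, since a handful of races can swing the average a lot.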

cj

01-08-2018, 04:09 PM

I love doing this kind of stuff. You need a database to do it, but once you have that it isn't too tough. I've never seen this one publicly but I've done similar myself. It is well worth the time and effort.

green80

01-08-2018, 04:16 PM

I think Equibase lists average odds per winner in its jockey stats. I think average odds per mount would be a better indicator. At every track there are a few top jockeys who are usually on the favorites, so of course they have the highest win percentages. The top jock at a track with an 18% win percentage is not nearly as impressive as some bug boy riding all 30/1 shots with a 10% win percentage. The stats don't tell all; it's what kind of horses the jockey gets to ride.

classhandicapper

01-08-2018, 04:27 PM

I like this idea a lot.

If I have some extra time, I'll write the query and do the calculations for NYRA for 2017 and post the results later in the week.

cj

01-08-2018, 04:41 PM

Recommend anything over 5 counts as 5, odds rank and finish...gets rid of a lot of the noise. ;)

thaskalos

01-08-2018, 05:03 PM

And this stat could also be created for the TRAINER....or the jockey/trainer combination.

Charli125

01-08-2018, 05:59 PM

Recommend anything over 5 counts as 5, odds rank and finish...gets rid of a lot of the noise. ;)

I just ran this on my DB using the above, limiting it to jockeys with at least 50 starts (should probably be 100). I'm not ready to share, as the totals seem out of line, so there might be a small issue, but it really highlights the overbet and underbet jockeys. I'm sure it would do the same with trainers or any other connections.

JustRalph

01-08-2018, 09:02 PM

It sounds great......then along comes Kent Desormeaux

cj

01-08-2018, 10:36 PM

I have to think back to how I programmed this. Riders that are almost always favored or second choice are going to have a lower rating most of the time...they have nowhere to go but down.

TheOracle

01-08-2018, 11:38 PM

Hey SG4

This sounds like a good idea: the public's 4th choice finishing 2nd is a +2, but the public's 1st choice finishing 5th is a -4.

My jockey data isn't where it needs to be, but I can do this for trainers at NYRA. I would use the morning lines, though, since the odds are subject to change while the race is actually running, as we've all seen.

http://www.paceadvantage.com/forum/showthread.php?t=142423

Also, I would imagine a public 1st choice finishing 1st = 0 which would indicate that the outcome was expected so they wouldn’t earn a score for that situation

They would only earn a positive score if they finished better than what was expected and I guess you can add up the scores to see if they are performing above or below expectations

I will look into that. I'm all for anything that moves away from using earnings as a criterion to rate trainers.

BreadandButter

01-09-2018, 01:10 AM

Sounds much like the Wong Index. Named after jockey Tommy Wong. Fonner published them at the end of the 2017 meet.

classhandicapper

01-09-2018, 09:28 AM

I have to think back to how I programmed this. Riders that are almost always favored or second choice are going to have a lower rating most of the time...they have nowhere to go but down.

It should be possible to tweak the model a little to correct for that.

I'm not saying this is the solution, but if you looked at all favorites and determined their average finish, then any finish better than that would be a positive and any finish worse than that would be a negative.

You could do the same for 2nd choices, 3rd choices etc..

That way, even if every horse you rode was the favorite, you wouldn't have to win to avoid a negative rating. You'd just have to do average. And you could still get a positive rating if you won more than your fair share.

If you wanted to get crazier you could make adjustments for field size and actual odds.

If you are riding a 2/5 favorite in a 4 horse field that's different from riding a 3-1 favorite in a 12 horse field.

The one thing I recall about this kind of thing is that it gets a little tricky to do ranking in Access, which is where I do most of my work. I might have to export to Excel. I think it's a little easier in Excel.
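A rough Python sketch of the baseline idea above, for anyone working outside Access or Excel: compute the average finish for each odds rank across all riders, then score each mount as baseline minus actual finish, so that doing "average" on a favorite scores zero. The rows and names below are hypothetical, and the field size and actual odds adjustments are left out.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (jockey, odds_rank, finish_position).
rides = [
    ("A", 1, 1), ("A", 1, 3), ("B", 1, 2), ("B", 1, 6),
    ("B", 2, 2), ("A", 2, 4),
]

def average_finish_by_rank(rows):
    """Average finish position of all horses at each odds rank."""
    finishes = defaultdict(list)
    for _, odds_rank, finish in rows:
        finishes[odds_rank].append(finish)
    return {rank: mean(values) for rank, values in finishes.items()}

def jockey_vs_baseline(rows):
    """Average of (baseline finish - actual finish) per jockey.

    Positive means the rider's mounts finished better than the
    typical horse at the same odds rank.
    """
    baseline = average_finish_by_rank(rows)
    diffs = defaultdict(list)
    for jockey, odds_rank, finish in rows:
        diffs[jockey].append(baseline[odds_rank] - finish)
    return {jockey: mean(values) for jockey, values in diffs.items()}

baseline = average_finish_by_rank(rides)
ratings = jockey_vs_baseline(rides)
```

With this approach a rider who is always on the favorite is no longer stuck with "nowhere to go but down".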

cj

01-09-2018, 10:40 AM

Yeah, there are a lot of tweaks you can do. For simplicity, I did something like this.

=MIN(OddsRank+1,5)-MIN(FinishPosition,5)

That makes it so you get a positive for a win.
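The same spreadsheet formula translated into Python, as a small sketch. The function name and the `cap` parameter are my own naming; only the arithmetic comes from the formula above.

```python
def capped_score(odds_rank, finish_position, cap=5):
    """Python version of =MIN(OddsRank+1,5)-MIN(FinishPosition,5)."""
    return min(odds_rank + 1, cap) - min(finish_position, cap)

# A winning favorite now earns a positive score instead of zero.
print(capped_score(1, 1))   # favorite wins
print(capped_score(1, 5))   # favorite finishes 5th
print(capped_score(8, 9))   # longshot finishes up the track
```

Because both sides are capped at 5, a longshot that runs 9th scores the same as one that runs 5th, which is where the noise reduction comes from.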

Jeff P

01-09-2018, 12:25 PM

You can tweak something like this a lot of different ways.

In the end, no matter what tweaks are made -- hopefully, what you end up with is a number that measures performance vs. expected performance.

In his book Precision (https://www.amazon.com/Precision-Statistical-Mathematical-Methods-Racing/dp/1432768522), on page 80, CX Wong wrote about a stat called Cumulative Probability or F(x) and presented a formula for calculating it.

It takes a little programming to calculate it, but I've found that Cumulative Probability or F(x) does a decent job of measuring performance vs. expected performance in a statistically valid way. (All of the values generated end up being between 0 and 1 and any value >= 0.50 suggests the rider, trainer, sire, what have you, etc. outperformed expectations.)

Imo, the way to get bang for your buck or separation between yourself and the other players who you are competing against in the pools with a stat like this doesn't necessarily come from the tweaks, although doing that can provide a degree of separation -- but rather from applying a stat like this in a conceptually unique way vs. the way the other players you are competing against in the pools are applying it.

For example, in the DRF podcast for his new book Betting with an Edge (http://www.paceadvantage.com/forum/showthread.php?t=141838), Mike Maloney talked about the value of knowing whether or not a rider likes to be inside or outside.

I'm not saying that knowing an inside rider from an outside rider is the be all end all of things (it's not.) But the database testing I've done suggests that knowing that and measuring performance produces better results than ignoring that and measuring performance for rider alone.

Imo, if you can give a stat like this some thought, and get creative about what is or isn't reflected in the odds:

And from there measure performance for rider, trainer, sire, what have you, etc, vs. expected performance given the situation...

I've found at that point you are more likely to actually HAVE something.

I hope I managed to type most of that out in a way that makes sense.

-jp

classhandicapper

01-09-2018, 12:32 PM

Here's a quick sample of the kind of thing you can do in 5 minutes.

This is Jockey performance for riders that had a minimum of 25 mounts on favorites at AQU, BEL and SAR. This covers about 2 1/2 years. Doing it by rank will take a little longer. Something like this could also obviously be broken up by track, distance, surface, running style etc...

The average for all NYRA jockeys was 2.83.

So anything lower than 2.83 would be positive and anything higher would be negative. You can also see the average odds of the favorites each jockey rode, to make an adjustment there.

formula_2002

01-09-2018, 12:34 PM

As in all odds-related plays, it would be best to determine a jockey's efficiency by comparing his actual wins to his expected wins: the forever A/E ratio :) using the final true dollar odds.
These comparisons should be further broken down by odds increments: the smaller the increment, the more accurate the result, but the more data required.
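A minimal Python sketch of an A/E calculation. It assumes expected wins are the sum of win probabilities implied by the final odds, using 1 / (odds + 1) with odds expressed to 1 and no correction for takeout; that conversion, the sample mounts, and the function names are assumptions for illustration.

```python
def implied_probability(odds_to_one):
    """Win probability implied by odds-to-1, ignoring takeout (assumption)."""
    return 1.0 / (odds_to_one + 1.0)

def actual_vs_expected(mounts):
    """mounts: list of (final_odds_to_one, won) tuples."""
    actual = sum(1 for _, won in mounts if won)
    expected = sum(implied_probability(odds) for odds, _ in mounts)
    if expected == 0:
        raise ValueError("no mounts to evaluate")
    return actual / expected

# Hypothetical mounts: even money winner, 3-1 loser, 4-1 loser, 9-1 winner.
mounts = [(1.0, True), (3.0, False), (4.0, False), (9.0, True)]
ratio = actual_vs_expected(mounts)
print(round(ratio, 3))
```

A ratio above 1.0 means more wins than the odds implied. Breaking it down by odds range would just mean filtering `mounts` before calling the function.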

classhandicapper

01-09-2018, 01:50 PM

One of the problems with all these stats is that jockeys take money independently of the horses they ride.

We have to make sure what we want our metric to measure.

Do we want to measure betting value or do we want to measure riding skill?

A great rider could wind up on a lot of underlays because he's so popular. So the fact that his horses under-perform their odds may say nothing about his skill.

classhandicapper

01-09-2018, 01:55 PM

I'm not saying that knowing an inside rider from an outside rider is the be all end all of things (it's not.) But the database testing I've done suggests that knowing that and measuring performance produces better results than ignoring that and measuring performance for rider alone.

Something like that would be especially useful on an inside or outside biased track.

So would knowledge of which riders pick up on biases quickly and ride to their advantage and which don't. That also gets into connections. Some owners and trainers are on top of that and give instructions on bias to the rider. Others do not. Same with pace setup.

Jeff P

01-09-2018, 03:00 PM

Here's a quick sample of the kind of thing you can do in 5 minutes.

This is Jockey performance for riders that had a minimum of 25 mounts on favorites at AQU, BEL and SAR. This covers about 2 1/2 years. Doing it by rank will take a little longer. Something like this could also obviously be broken up by track, distance, surface, running style etc...

The average for all NYRA jockeys was 2.83.

So anything lower than 2.83 would be positive and anything higher would be negative. You can also see the average odds of the favorites each jockey rode, to make an adjustment there.

Yup.

That's a good illustration of using the type of analysis being discussed in this thread to come up with a rider rating.

I wanted to add to my previous post and provide some data.

The sample below shows all morning line favorites that have raced at what I consider to be A and B tracks over (roughly) the past three weeks:
query start: 1/9/2018 10:47:34 AM
query end: 1/9/2018 10:47:35 AM
elapsed time: 1 seconds

Data Window Settings:
Connected to: C:\JCapper\exe\JCapper2.mdb
999 Divisor Odds Cap: None
SQL UDM Plays Report: Hide

SQL: SELECT * FROM STARTERHISTORY
WHERE RANKMLINE=1
AND INSTR('AQU-GGX-GPX-HAW-LRL-PHA-TAM-SAX', TRACK) > 0
AND [DATE] >= #12-17-2017#
AND [DATE] <= #01-08-2018#
ORDER BY [DATE], TRACK, RACE

Data Summary Win Place Show
-----------------------------------------------------
Mutuel Totals 1043.50 1094.70 1080.40
Bet -1258.00 -1258.00 -1258.00
-----------------------------------------------------
P/L -214.50 -163.30 -177.60

Wins 196 333 411
Plays 629 629 629
PCT .3116 .5294 .6534

ROI 0.8295 0.8702 0.8588
Avg Mut 5.32 3.29 2.63

Nothing Earth shattering there. (The above results are about what you'd expect from morning line favorites.)
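For readers who want to check how the summary figures relate to each other, here is a short Python sketch that rebuilds the Win column from its totals. It assumes a flat $2 bet per play (1258.00 / 629 plays); the function name and dictionary keys are mine, not JCapper's.

```python
def summarize(mutuel_total, plays, wins, bet_per_play=2.0):
    """Rebuild P/L, win pct, ROI and average mutuel from raw totals."""
    bet = plays * bet_per_play
    return {
        "pl": mutuel_total - bet,          # profit or loss
        "pct": wins / plays,               # hit rate
        "roi": mutuel_total / bet,         # return per dollar wagered
        "avg_mut": mutuel_total / wins,    # average payoff per winner
    }

# Win column from the report above.
win = summarize(mutuel_total=1043.50, plays=629, wins=196)
print(round(win["pl"], 2), round(win["pct"], 4),
      round(win["roi"], 4), round(win["avg_mut"], 2))
```

The Place and Show columns work the same way with their own mutuel totals and hit counts.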

I recently wrote an algorithm (involving a little bit of AI) that calculates Cumulative Probability or F(x) for each rider given the situation he or she finds himself or herself in.

Admittedly, classifying individual rider situations is something subjective on my part.

That said, the algorithm is programmed to analyze the attributes for each mount, and from there classify that mount as belonging to a basic category such as inside speed, inside closer, middle post speed, middle post closer, outside speed, or outside closer.

From there the algorithm is programmed to pull each rider's like mounts from the database and calculate F(x) -- with the resulting F(x) representing actual performance vs. expected performance over all the times the rider was asked to perform that specific task... for example, ride an inside closer in a sprint race on the dirt at today's track.

All of that said, here is the above sample broken out by rank for Rider (Fx) as described above:
By: SQL-F01 Rank -- F(x) for Rider, given the situation

Rank P/L Bet Roi Wins Plays Pct Impact AvgMut
----------------------------------------------------------------------------------
1 48.60 190.00 1.2558 43 95 .4526 1.4526 5.55
2 -56.80 152.00 0.6263 19 76 .2500 0.8023 5.01
3 -4.50 126.00 0.9643 22 63 .3492 1.1207 5.52
4 -56.20 130.00 0.5677 15 65 .2308 0.7406 4.92
5 -46.60 170.00 0.7259 23 85 .2706 0.8684 5.37
6 -7.00 142.00 0.9507 26 71 .3662 1.1752 5.19
7 -35.80 122.00 0.7066 15 61 .2459 0.7891 5.75
8 -35.00 104.00 0.6635 13 52 .2500 0.8023 5.31
9 -32.00 68.00 0.5294 8 34 .2353 0.7551 4.50
10 20.20 32.00 1.6313 10 16 .6250 2.0057 5.22
11 -9.80 14.00 0.3000 1 7 .1429 0.4585 4.20
12 0.40 8.00 1.0500 1 4 .2500 0.8023 8.40

And here is the above sample broken out by numeric value for Rider (Fx) as described above:
By: SQL-F01 Numeric Value -- F(x) for Rider, given the situation
>=Min < Max P/L Bet Roi Wins Plays Pct Impact
--------------------------------------------------------------------------------------
-99.0000 0.0000 0.00 0.00 0.0000 0 0 .0000 0.0000
0.0000 0.0500 -3.80 160.00 0.9763 32 80 .4000 1.2837
0.0500 0.1000 -15.70 128.00 0.8773 22 64 .3438 1.1032
0.1000 0.1500 -76.10 170.00 0.5524 18 85 .2118 0.6796
0.1500 0.2000 -14.60 82.00 0.8220 11 41 .2683 0.8610
0.2000 0.2500 -26.60 64.00 0.5844 7 32 .2188 0.7020
0.2500 0.3000 -8.70 76.00 0.8855 14 38 .3684 1.1823
0.3000 0.3500 9.20 74.00 1.1243 12 37 .3243 1.0408
0.3500 0.4000 -15.80 36.00 0.5611 4 18 .2222 0.7132
0.4000 0.4500 -17.20 32.00 0.4625 2 16 .1250 0.4011
0.4500 0.5000 -12.40 42.00 0.7048 6 21 .2857 0.9169
0.5000 0.5500 -16.80 96.00 0.8250 17 48 .3542 1.1366
0.5500 0.6000 -4.00 30.00 0.8667 5 15 .3333 1.0697
0.6000 0.6500 -24.60 40.00 0.3850 3 20 .1500 0.4814
0.6500 0.7000 -7.60 20.00 0.6200 2 10 .2000 0.6418
0.7000 0.7500 -6.80 14.00 0.5143 2 7 .2857 0.9169
0.7500 0.8000 -20.20 24.00 0.1583 1 12 .0833 0.2674
0.8000 0.8500 -11.80 22.00 0.4636 2 11 .1818 0.5835
0.8500 0.9000 1.30 20.00 1.0650 4 10 .4000 1.2837
0.9000 9999.0000 57.70 128.00 1.4508 32 64 .5000 1.6046

Note the outperformance by the rank=1 mounts for Rider F(x).
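Judging from the numbers, the Impact column in both breakdowns looks like each row's win percentage divided by the win percentage of the whole sample (196 for 629). That is an inference from the figures, not something stated in the report. A small Python sketch of that reading:

```python
def impact_value(group_wins, group_plays, total_wins, total_plays):
    """Row win rate relative to the win rate of the full sample."""
    group_pct = group_wins / group_plays
    overall_pct = total_wins / total_plays
    return group_pct / overall_pct

# Rank 1 row: 43 wins from 95 plays, against 196 wins from 629 plays overall.
rank1 = impact_value(43, 95, 196, 629)
# Rank 10 row: 10 wins from 16 plays.
rank10 = impact_value(10, 16, 196, 629)
print(round(rank1, 4), round(rank10, 4))
```

An impact above 1.0 means that group won more often than the sample as a whole.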

Also note the outperformance at the extreme edges of the Rider F(x) numeric value distribution. In theory, any F(x) value over 0.50 represents outperformance.

Yet, in this sample, the strongest outperformance occurred when F(x) was greater than or equal to 0.85.

I'm guessing outperformance only when F(x) is greater than or equal to 0.85 may turn out to be the result of small sample noise. (My gut tells me a larger sample is needed.)

That said, this is the first sample I've generated using this technique and the results are promising (at least so far.)

I also wanted to touch on outperformance at the other edge of the sample -- specifically when F(x) is equal to zero.

A closer look at the data reveals this part of the sample is populated by riders who have a small number of mounts in the situation they are being queried for.

For example: If a rider only has three mounts as a closer with a far outside post in dirt routes at today's track -- and the query results come back as 0 for 3 with an F(x) of 0.00... That 0.00 is probably not a true representation of the rider's ability in that situation.

-jp

SG4

01-09-2018, 09:34 PM

Thanks for all the feedback! Glad this has made for an interesting discussion, and of course shows all the varied ways to approach gathering a worthwhile stat on this end.

I think my interest was more in what jockeys move longshots up, so the query of jockeys on just favored mounts may lend insight in one way, but in my expectations wouldn't be as valuable. Not to mention where were those 25 times Fernando Jara rode a favorite on the NYRA circuit in the last 3 years, I must've been on vacation all those days lol.

The most recent thought that inspired this idea was seeing some of Irad Ortiz's results lately, and it seemed like whenever he was on horses at decent odds they always outperformed expectations, which made me think he really is proving his worth as a top rider. But of course with the simplistic calculation I proposed initially of 0 points at best if you're riding a favorite, a jockey like him who is often on chalk will probably be under-rated in this method. The Wong #'s seemed to be on the right track far as something in a useful vein.

Where the best value lies far as real world application goes I think would be applying this stat to apprentices in their very nascent stages, I feel like if your eyes are open you'll start to see some of those initial longshots running well aren't just flukes & could be the markings of a solid new rider, and you can usually jump on this for a period before word is fully out among the betting public.

gm10

01-13-2018, 08:42 AM

Lots of good points made here ... just wanted to add two things

1) Another way of measuring how good a jockey is is to express how much better than expected he does with his mounts. In Hong Kong, UK, Ireland, etc. you can just compare his or her speed/performance ratings with the official ratings. You could then conclude that (hypothetical example) Ryan Moore is worth an extra 2lbs.

2) In my experience it is crucial to calculate jockey ability over different surfaces/distances. For example .... some jocks are fine on the dirt but completely useless on the turf. Some otherwise moderate jocks seem to be much better over 5F than their overall stats would suggest.

TheOracle

02-22-2018, 11:42 PM

Hey SG4

I am looking to revisit this using Davis at Aqueduct but using the Morning Line instead of actual odds

Also, instead of the finish position I will just use 4 for off the board so if he's 6th choice in the Morning Line and he runs 4th it will be 4 - 4 = 0 and not 6 - 4 = +2

I am not giving him a +2 for finishing off the board it doesn't feel right to do that

However, I will keep 1st choice in the Morning Line finishing 6th as a -5 so 1 - 6 = -5

Here is D Davis record at Aqueduct overall as of yesterday

http://www.insidethenumbers.net/images/comments/ddavisaquone.png

http://www.insidethenumbers.net/images/comments/ddavisaqutwo.png

Surprisingly enough, his record on a Fast Surface gives a much better return per $2 win wager than on a wet Surface!!!

I will keep an eye on his mounts from now on at Aqueduct!!!

I am curious to see what his +/- is overall and in certain race types (i.e. Maidens, Claimers, etc.).

If he is giving a huge minus in certain situations maybe you stay away and do the opposite for huge plus values

Let me know if this is similar to what you had in mind

I will also do this for the Trainers at Aqueduct!!!

davew

02-23-2018, 02:00 AM

Thanks for all the feedback! Glad this has made for an interesting discussion, and of course shows all the varied ways to approach gathering a worthwhile stat on this end.

I think my interest was more in what jockeys move longshots up, so the query of jockeys on just favored mounts may lend insight in one way, but in my expectations wouldn't be as valuable. Not to mention where were those 25 times Fernando Jara rode a favorite on the NYRA circuit in the last 3 years, I must've been on vacation all those days lol.

The most recent thought that inspired this idea was seeing some of Irad Ortiz's results lately, and it seemed like whenever he was on horses at decent odds they always outperformed expectations, which made me think he really is proving his worth as a top rider. But of course with the simplistic calculation I proposed initially of 0 points at best if you're riding a favorite, a jockey like him who is often on chalk will probably be under-rated in this method. The Wong #'s seemed to be on the right track far as something in a useful vein.

Where the best value lies far as real world application goes I think would be applying this stat to apprentices in their very nascent stages, I feel like if your eyes are open you'll start to see some of those initial longshots running well aren't just flukes & could be the markings of a solid new rider, and you can usually jump on this for a period before word is fully out among the betting public.

both brothers now at GP - larger fields, different distances
Irad has been doing well, and gets more longer priced horses with the full fields

Dave Schwartz

02-23-2018, 10:42 AM

We've had this in our software for years.

It is based upon what we call PIV - "Pool Impact Value." - When I say "Years," I mean "Decades."

PIV is like the formula for IVs. That is, total wins divided by expected wins...

IV= TotWins/ExWins
ExWins are the sum of 1/FieldSize.

Thus, in a 10-horse field, each horse is "expected" to win 0.10 races.

In a 5-horse field, each horse is "expected" to win 0.20 races.

PIV=TotWins/ExWins

The only difference is that ExWins are determined by the pct of pool wagered on each horse.

Thus, a horse with 40% of the pool is expected to win 0.40 races.
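A minimal Python sketch of the two formulas as described above: IV uses 1/FieldSize as the expected wins for each start, while PIV uses the share of the pool wagered on the horse. The sample starts and function names are hypothetical and have nothing to do with how the actual software is written.

```python
# Hypothetical starts: (field_size, pool_share, won).
starts = [
    (10, 0.40, True),
    (5, 0.20, False),
    (8, 0.10, False),
    (10, 0.05, True),
]

def impact_value(rows):
    """IV = total wins / expected wins, expected = sum of 1/field size."""
    wins = sum(1 for _, _, won in rows if won)
    expected = sum(1.0 / field_size for field_size, _, _ in rows)
    return wins / expected

def pool_impact_value(rows):
    """PIV = total wins / expected wins, expected = sum of pool shares."""
    wins = sum(1 for _, _, won in rows if won)
    expected = sum(pool_share for _, pool_share, _ in rows)
    return wins / expected

iv = impact_value(starts)
piv = pool_impact_value(starts)
print(round(iv, 3), round(piv, 3))
```

The gap between the two numbers shows how much of a rider's apparent edge is already accounted for by the money bet on his mounts.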

JerryBoyle

02-24-2018, 10:20 PM

Been lurking around the forum for a while without contributing, and this post got me thinking. I like the idea, but wanted something that penalized/rewarded runners relative to the rest of the race. Inspired by BIC, and using the public odds, I've come up with:

(ln(p) / sum(ln(pj))) - (ln(1-p) / sum(ln(1-pk))) where p is the runner we're scoring, pj are the runners we've beaten, and pk are the runners to which we've lost.

Looking Race 1 at GP yesterday (2-23-18), the scores would have been:

Program # Finish Public Prob Score
3 1 .047 10.73
8 2 .244 -1.88
2 3 .065 4.98
1 4 .057 3.67
4 5 .047 2.61
5 6 .244 -20.04
7 7 .011 2.83
6 8 .296 -60.29

I plan to use it in a regression to try and determine significance, but so far it seems to capture the idea that the OP is after. Feel free to refine/improve
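A literal Python reading of the formula above, offered only as a sketch. The post does not say what happens when a sum is empty (the winner has lost to nobody, the last-place horse has beaten nobody), so treating that term as zero is my assumption, as are the function and argument names. This literal reading produces values on a different scale from the Score column, so the posted scores presumably include steps not spelled out in the formula.

```python
import math

def relative_score(p, beaten, lost_to):
    """(ln(p) / sum(ln(pj))) - (ln(1-p) / sum(ln(1-pk))).

    p        public win probability of the runner being scored
    beaten   public probabilities of the runners it finished ahead of
    lost_to  public probabilities of the runners that finished ahead of it
    An empty list makes its term 0.0 (assumption, not from the post).
    """
    reward = (math.log(p) / sum(math.log(q) for q in beaten)
              if beaten else 0.0)
    penalty = (math.log(1 - p) / sum(math.log(1 - q) for q in lost_to)
               if lost_to else 0.0)
    return reward - penalty

# The winner of the race above: .047 public probability, beat seven others.
winner = relative_score(
    0.047, [0.244, 0.065, 0.057, 0.047, 0.244, 0.011, 0.296], [])
print(winner)
```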

SG4

02-25-2018, 10:15 PM

Thanks for coming out of lurking to suggest this - seems to be onto the right path for what I was thinking.

Thanks also to others who followed up. Some of the other ideas which only used winners I don't think fully capture what I was after, I think all finish positions are useful in a review of this nature.

DeltaLover

02-25-2018, 11:30 PM

As a bettor you do not care a lot about the ability of jockey.

If your interest in horse racing is limited to its gambling alone, the skill of the jockey must be completely transparent to your approach and at no point it should become a central topic in your research and handicapping.

When it comes to jockeys (as in any other handicapping factor) what really matters is not their absolute skill or their classification as better and worse but how these are perceived by the betting public and how this perception is reflected in the betting pools.

Take as an example the following data set, which comprises all the jockeys who have more than 100 starters at more than 8-1 since 2017 in 'AQU', 'BEL', 'SAR', 'GP', 'SA', 'DMR', 'LRC':

https://gist.github.com/deltalover/4645651a5df6881b95ee96ed8c6f71b6

Using this table it becomes obvious that Irad Ortiz (ROI 0.79) is certainly inferior to, let's say, Aby Medina (ROI 1.08) for betting purposes. My objective as a bettor is not to decide which of the two might be more talented, smart or fit; it is enough to know that the betting crowd is committing a huge error when estimating the chances of these two, meaning that it is overestimating the ability of Ortiz while simultaneously underestimating Medina's.

NikeUnlimited

02-26-2018, 02:23 AM

Maybe a dumb question. Is there any place where I can download the day's results and put them into my own database?

steveb

02-26-2018, 02:41 AM

As a bettor you do not care a lot about the ability of jockey.

If your interest in horse racing is limited to its gambling alone, the skill of the jockey must be completely transparent to your approach and at no point it should become a central topic in your research and handicapping.

When it comes to jockeys (as in any other handicapping factor) what really matters is not their absolute skill or their classification as better and worse but how these are perceived by the betting public and how this perception is reflected in the betting pools.

Take as an example the following data set, which comprises all the jockeys who have more than 100 starters at more than 8-1 since 2017 in 'AQU', 'BEL', 'SAR', 'GP', 'SA', 'DMR', 'LRC':

https://gist.github.com/deltalover/4645651a5df6881b95ee96ed8c6f71b6

Using this table it becomes obvious that Irad Ortiz (ROI 0.79) is certainly inferior to, let's say, Aby Medina (ROI 1.08) for betting purposes. My objective as a bettor is not to decide which of the two might be more talented, smart or fit; it is enough to know that the betting crowd is committing a huge error when estimating the chances of these two, meaning that it is overestimating the ability of Ortiz while simultaneously underestimating Medina's.

while essentially that may be true, there is still two things out there in the race......a horse AND a rider.
i have always been of the opinion the rider is the more important of the two.
and your chart could be influenced by one or two long priced winners?
not to mention it is very fluid, so it probably looks very different month to month.
or you could have riders that are profitable......depending on how you filter, but you have less than 1.
there is seven zillion ways to analyse it.
you could have a rider with negative expectation, but in the circumstances could be a great bet.
rider IS VERY very important, but it's just one factor.

personally i would never bother analysing a rider the way you have done. it does not work that way.
if all your factors as one give positive expectation then what does it matter if the rider in this case, is a losing proposition according to his/her past history.
your job is to predict the future not the past even if the past may help to an extent.

DeltaLover

02-26-2018, 02:51 AM

while essentially that may be true, there is still two things out there in the race......a horse AND a rider.
i have always been of the opinion the rider is the more important of the two.
and your chart could be influenced by one or two long priced winners?
not to mention it is very fluid, so it probably looks very different month to month.
or you could have riders that are profitable......depending on how you filter, but you have less than 1.
there is seven zillion ways to analyse it.
you could have a rider with negative expectation, but in the circumstances could be a great bet.
rider IS VERY very important, but it's just one factor.

personally i would never bother analysing a rider the way you have done. it does not work that way.
if all your factors as one give positive expectation then what does it matter if the rider in this case, is a losing proposition according to his/her past history.
your job is to predict the future not the past even if the past may help to an extent.

You are right in the sense that a flat ROI analysis like the one I present here leaves a lot of room for improvement; removing outliers and applying a moving window based on time are two of the ways to do so. More than this, though, your job is to predict the error in the crowd's estimate and not the most probable outcome of the race; this is the point I am trying to make here.
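A rough sketch of those two refinements, under stated assumptions: instead of dropping outliers outright, this version caps each payoff (a simpler stand-in for outlier removal), and it only counts starts inside a trailing window of days. The cap of $42, the 180-day window and the function name are all hypothetical choices, not values from the data above.

```python
from datetime import date, timedelta

def windowed_capped_roi(starts, as_of, window_days=180, payoff_cap=42.0, stake=2.0):
    """ROI over a trailing time window with large payoffs capped.

    starts: iterable of (race_date, payoff) for one jockey, where payoff is
    the return on a `stake` win bet (0.0 for a loss).
    Capping limits the influence of one or two long-priced winners; it is a
    swapped-in technique, not necessarily what the original table would use.
    Returns None when no starts fall inside the window.
    """
    cutoff = as_of - timedelta(days=window_days)
    recent = [min(payoff, payoff_cap)
              for race_date, payoff in starts
              if cutoff < race_date <= as_of]
    if not recent:
        return None
    return sum(recent) / (stake * len(recent))
```

Running it for a series of `as_of` dates gives the month-to-month picture steveb was asking about.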

upthecreek

02-26-2018, 09:37 AM

Jockey intent has to be figured into the equation somehow, but I don't know how. I'll try to express my point.
Yesterday 1st race @ SA #7 Royal Opera House ridden by Kent D
Now the horse was claimed off of Bob Hess, who Kent regularly rides for. The horse was claimed by Alfredo Marquez (who?), and Kent, and I guess his agent, asked to ride back. The horse won and paid $11.80, an underlay from ML, pointed out by Kurt Hoover (who picked the horse for some of these reasons), and an overlay on a KD ridden horse, probably because of the no name trainer. The horse finished 2nd last out, same condition, @ 3-1, with KD.

GMB@BP

02-26-2018, 02:08 PM

Maybe a dumb question. Is there any place where I can download the day's results and put them into my own database?

I believe both BRIS and DRF sell downloadable data for data base work.

JerryBoyle

02-26-2018, 02:43 PM

I believe both BRIS and DRF sell downloadable data for data base work.

Both sell csv files of historical data, which makes it easy to do the db work. Having gone with DRF myself, I'd go with BRIS if I were starting over. BRIS provides much more data, including workouts which is huge. DRF csv files don't have workout info and purchasing separately is very pricey. I think BRIS yearly plan is actually cheaper as well...

cj

02-27-2018, 04:25 PM

You are right in the sense that a flat ROI analysis like the one I present here leaves a lot of room for improvement; removing outliers and applying a moving window based on time are two of the ways to do so. More than this, though, your job is to predict the error in the crowd's estimate and not the most probable outcome of the race; this is the point I am trying to make here.

I think the ability of the jockey should be factored into the overall rating for each horse, those horses converted to an odds line, and the odds line determining betting. At no point do I consider how much the jockey gets bet by the public. It will be factored into my betting line already.

Simple example, let's say you use two ratings, horse and jockey, and count the jockey for 20%. Horse A is rated 100 and Horse B is rated 100. Horse A has a jockey that rates 100 on ability and horse B has a jockey that rates a 50. Overall ratings are Horse A is 100 and Horse B is 90. Convert those to an odds line and see how it plays out. I know the public will bet Horse A more because of the jockey but I also know that it is warranted. You need a premium to bet on Horse B in my opinion.

My point is that at no point do I consider if the jockeys are overbet or underbet normally by the crowd. I think that will all come out in the wash with a good odds line and comparing it to the public odds. But I definitely factor in the ability of the jockey.

One thing I would caution is that you need to factor in the jockey in past races as well. If Horses A and B have been continually ridden by the same jockey in previous races, make sure those 100 ratings reflect that as well.
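The arithmetic in the example above can be sketched as follows. The 80/20 blend reproduces the 100 and 90 overall ratings from the post. The conversion to an odds line is purely an assumption for illustration: it makes win probability proportional to rating raised to a hypothetical `sharpness` exponent, which is only one of many ways to do it and is not described in the thread.

```python
def overall_rating(horse_rating, jockey_rating, jockey_weight=0.20):
    """Blend horse and jockey ratings, counting the jockey for 20% by default."""
    return (1.0 - jockey_weight) * horse_rating + jockey_weight * jockey_rating

def odds_line(ratings, sharpness=1.0):
    """Turn overall ratings into fair odds-to-1.

    Assumption: win probability is proportional to rating ** sharpness.
    `sharpness` is a made-up tuning knob; a real line would be calibrated
    against results.
    """
    powered = {name: rating ** sharpness for name, rating in ratings.items()}
    total = sum(powered.values())
    # fair odds-to-1 = (1 - p) / p, with p = powered / total
    return {name: (total - value) / value for name, value in powered.items()}

ratings = {
    "Horse A": overall_rating(100, 100),  # top jockey
    "Horse B": overall_rating(100, 50),   # weaker jockey
}
line = odds_line(ratings)
```

Comparing `line` with the tote board then shows whether the public's extra money on Horse A goes beyond what the ratings justify.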

steveb

02-27-2018, 06:03 PM

I think the ability of the jockey should be factored into the overall rating for each horse, those horses converted to an odds line, and the odds line determining betting. At no point do I consider how much the jockey gets bet by the public. It will be factored into my betting line already.

that's what i was trying to say!
you just said it better than i could have.

blackandtanstable

02-27-2018, 11:56 PM

A buddy of mine contacted me and thought I was posting in this thread. Several years ago, I developed a jockey rating system based on similar methodology as discussed here. I feel the ratings are good, but there are excellent ideas here and a better system underway. Years ago, Turfday.com had terrific jockey ratings that I loved to use. (It had excellent sire and trainer ratings too.) After Turfday went out of business, I searched everywhere for jockey ratings that could add some betting value to my game and finally came up with my own. I'm looking forward to seeing what's created here -- it would be great to have an automated option with more tracks available and less manual work needed.

classhandicapper

02-28-2018, 12:10 PM

One thing I would caution is that you need to factor in the jockey in past races as well. If Horses A and B have been continually ridden by the same jockey in previous races, make sure those 100 ratings reflect that as well.

When I first started playing this game I used to pay close attention to jockey switches, jockey trainer combos, and red hot riders getting a lot of live mounts. Then somewhere along the line I started paying very little attention to jockey at all. I guess I didn't feel like it was adding much value to my opinions.

As an owner with a piece of a few horses, I'm paying more attention again. I've had the opportunity to observe a few jockeys. I know who has listened to instructions, who is riding hard for minor awards, who is being more or less aggressive than I want etc... Now I find myself forming opinions again. After a race I'll immediately start mumbling to myself how my horse would do better with so and so as a rider.

DeltaLover

02-28-2018, 03:11 PM

I think the ability of the jockey should be factored into the overall rating for each horse, those horses converted to an odds line, and the odds line determining betting. At no point do I consider how much the jockey gets bet by the public. It will be factored into my betting line already.

Simple example, let's say you use two ratings, horse and jockey, and count the jockey for 20%. Horse A is rated 100 and Horse B is rated 100. Horse A has a jockey that rates 100 on ability and horse B has a jockey that rates a 50. Overall ratings are Horse A is 100 and Horse B is 90. Convert those to an odds line and see how it plays out. I know the public will bet Horse A more because of the jockey but I also know that it is warranted. You need a premium to bet on Horse B in my opinion.

My point is that at no point do I consider if the jockeys are overbet or underbet normally by the crowd. I think that will all come out in the wash with a good odds line and comparing it to the public odds. But I definitely factor in the ability of the jockey.

One thing I would caution is that you need to factor in the jockey in past races as well. If Horses A and B have been continually ridden by the same jockey in previous races, make sure those 100 ratings reflect that as well.

I do not imply that the jockey should not be factored into the rating of each horse; instead, what I am trying to say here is that this factoring should depend not on his absolute ability but on how that ability is perceived by the betting crowd.

I am also saying that exactly the same principle applies to any other applicable handicapping factor that affects the betting patterns of the crowd.

Take as an example a horse that is offered at 4-1 odds; this means that the crowd believes that its winning chances are 0.2 or 20%. Whether this horse represents an overlay, underlay or a neutral betting proposition depends on how its handicapping factors are perceived by the betting public and not on its (always unknown) absolute winning probability. The jockey’s ability is already part of the 4-1 odds since the crowd is well aware of its significance in the outcome of the race. The real challenge is not to quantify the ability of the jockey but to detect potential estimation errors committed by the crowd.

To keep our example relevant to the topic of the thread, let’s assume that this starter is ridden by the absolute best jockey that can be found on the circuit. The ability of the jockey alone is not enough to derive any kind of useful handicapping opinion. It can very well be the case that the top rider represents a very valid reason to bet against him; this happens when his superiority is so obvious that the crowd is misled to the point of overbetting him while simultaneously creating value in some of the other starters. The reverse situation can very well occur in the case of the worst jockey, who might be tremendously underbet, converting his mount to an overlay.

Exactly the same behaviour applies to any other handicapping factor; think of females against males or a claimer who is trying stakes company for the first time. In either case, it is well known that the winning chances of such a starter are significantly diminished, but this does not necessarily imply a bad bet; in many cases quite the opposite is true! It all depends on how the crowd will perceive each situation and to what degree the bets will be optimally distributed or not.

Following this way of thinking, it is easy to conclude that what matters when it comes to the gambling aspect of the game, is not the raw ability of the jockey but the degree of the potential crowd mistake in its evaluation.
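To make the 4-1 example concrete, here is a small sketch that converts odds to the crowd's implied probability and then measures, for one jockey, how his actual wins compare with the wins the odds implied. Function names and the input layout are assumptions for illustration. Note that it ignores takeout and breakage, so implied probabilities across a real field add up to more than 1 and the ratio will sit somewhat below 1.0 for the average rider.

```python
def implied_probability(odds_to_one):
    """Crowd's implied win probability from odds-to-1 (4-1 gives 1 / 5 = 0.2).
    Takeout and breakage are ignored here."""
    return 1.0 / (odds_to_one + 1.0)

def crowd_error_ratio(mounts):
    """Actual wins divided by the wins the odds implied, for one jockey.

    mounts: list of (odds_to_one, won) pairs, where won is True or False.
    Above the typical ratio: the crowd has been underestimating the rider.
    Below it: the rider has been overbet. Returns None for an empty list.
    """
    expected_wins = sum(implied_probability(odds) for odds, _ in mounts)
    actual_wins = sum(1 for _, won in mounts if won)
    if expected_wins == 0:
        return None
    return actual_wins / expected_wins
```

This measures the crowd's error directly and says nothing about which rider is more talented, which is the distinction being argued above.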

classhandicapper

02-28-2018, 03:51 PM

There are 2 common approaches to finding value.

1. Figure out how important each factor is in predicting the winner and then combine them into a model for making an odds line to determine your value plays.

2. Try to find factors/angles that the public often under/over estimates and use filters and combinations of them to find value plays

If you use approach #1, then you are probably interested in a rating that measures the skills of the jockey that you will combine with similar data on the horse, trainer, etc...

If you use approach #2, then you are interested in which jockeys are under/over bet in certain situations.

blackandtanstable

02-28-2018, 06:26 PM

I do not imply that the jockey should not be factored into the rating of each horse; instead, what I am trying to say here is that this factoring should depend not on his absolute ability but on how that ability is perceived by the betting crowd.

I am also saying that exactly the same principle applies to any other applicable handicapping factor that affects the betting patterns of the crowd.

Take as an example a horse that is offered at 4-1 odds; this means that the crowd believes that its winning chances are 0.2 or 20%. Whether this horse represents an overlay, underlay or a neutral betting proposition depends on how its handicapping factors are perceived by the betting public and not on its (always unknown) absolute winning probability. The jockey’s ability is already part of the 4-1 odds since the crowd is well aware of its significance in the outcome of the race. The real challenge is not to quantify the ability of the jockey but to detect potential estimation errors committed by the crowd.

To keep our example relevant to the topic of the thread, let’s assume that this starter is ridden by the absolute best jockey that can be found on the circuit. The ability of the jockey alone is not enough to derive any kind of useful handicapping opinion. It can very well be the case that the top rider represents a very valid reason to bet against him; this happens when his superiority is so obvious that the crowd is misled to the point of overbetting him while simultaneously creating value in some of the other starters. The reverse situation can very well occur in the case of the worst jockey, who might be tremendously underbet, converting his mount to an overlay.

Exactly the same behaviour applies to any other handicapping factor; think of females against males or a claimer who is trying stakes company for the first time. In either case, it is well known that the winning chances of such a starter are significantly diminished, but this does not necessarily imply a bad bet; in many cases quite the opposite is true! It all depends on how the crowd will perceive each situation and to what degree the bets will be optimally distributed or not.

Following this way of thinking, it is easy to conclude that what matters when it comes to the gambling aspect of the game, is not the raw ability of the jockey but the degree of the potential crowd mistake in its evaluation.

this is a great post. it's like kitten's joy offspring. they like turf but nowhere near as much as the betting public thinks they do.

Robert Fischer

02-28-2018, 06:42 PM

Have to say 'Thank You' for doing the work, and sharing it

Surprised to see Jose Cruz looking great, and Saez looking terrible.

Here's a quick sample of the kind of thing you can do in 5 minutes.

This is Jockey performance for riders that had a minimum of 25 mounts on favorites at AQU, BEL and SAR. This covers about 2 1/2 years. Doing it by rank will take a little longer. Something like this could also obviously be broken up by track, distance, surface, running style etc...

The average for all NYRA jockeys was 2.83.

So anything lower than 2.83 would be positive and anything higher would be negative. You can also see the average odds of the favorites each jockey rides to make an adjustment there.
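A query like this can be approximated in a few lines once results are in a database. The sketch below assumes the 2.83 figure is an average finish position on favorites, which the post does not state outright, and the function name and input layout are hypothetical. It computes the baseline from whatever rides are passed in rather than hard-coding 2.83.

```python
from collections import defaultdict

def favorite_finish_report(rides, min_mounts=25):
    """Average finish position on favorites, per jockey, versus the baseline.

    rides: iterable of (jockey, finish_position) for mounts on favorites.
    Returns (baseline, report), where report maps each jockey with at least
    `min_mounts` rides to (average_finish, average_finish - baseline).
    A negative difference means the rider finishes better than the average.
    """
    by_jockey = defaultdict(list)
    for jockey, finish in rides:
        by_jockey[jockey].append(finish)
    all_finishes = [f for finishes in by_jockey.values() for f in finishes]
    if not all_finishes:
        return None, {}
    baseline = sum(all_finishes) / len(all_finishes)
    report = {}
    for jockey, finishes in by_jockey.items():
        if len(finishes) >= min_mounts:
            average = sum(finishes) / len(finishes)
            report[jockey] = (average, average - baseline)
    return baseline, report
```

Splitting by track, distance or surface is just a matter of filtering `rides` before the call.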