Does this type of stat exist?


SG4
01-08-2018, 04:13 PM
I was wondering if any kind of stat or program exists to give jockeys something of a performance expectation rating. What I mean by this is say a jockey is on the 4th betting choice in a race & they finish 2nd, I'd give them a +2 score that race, or if you're on a favorite and finish 5th they'd get a -4 for that race. Wonder if an average score would help identify riders who may be over/underestimated by the betting public, or could possibly help identify upcoming apprentice jockeys as ones to watch.
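The scoring described above can be sketched in a few lines of Python. The function name and arguments here are illustrative only, not taken from any existing program; an odds rank of 1 means the betting favorite.

```python
def expectation_score(odds_rank: int, finish_position: int) -> int:
    """Betting rank minus finish position: positive means the horse outran its odds."""
    return odds_rank - finish_position


# The two cases from the post above.
print(expectation_score(4, 2))  # 4th choice finishes 2nd -> 2
print(expectation_score(1, 5))  # favorite finishes 5th -> -4
```

Averaging this score over all of a rider's mounts would give the per-jockey rating being asked about.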

cj
01-08-2018, 05:09 PM
I love doing this kind of stuff. You need a database to do it, but once you have that it isn't too tough. I've never seen this one publicly but I've done similar myself. It is well worth the time and effort.

green80
01-08-2018, 05:16 PM
I think Equibase lists an average odds per winner in their jockey stats. I think average odds per mount would be a better indicator. At all tracks there are the top few jockeys that are usually on the favorites. Of course they have the highest win percentages. The top jock at a track with an 18% win percentage is not nearly as impressive as some bug boy riding all 30/1 shots with a 10% win percentage. The stats don't tell all; it's what kind of horses the jockey gets to ride.

classhandicapper
01-08-2018, 05:27 PM
I like this idea a lot.

If I have some extra time, I'll write the query and do the calculations for NYRA for 2017 and post the results later in the week.

cj
01-08-2018, 05:41 PM
Recommend anything over 5 counts as 5, odds rank and finish...gets rid of a lot of the noise. ;)

thaskalos
01-08-2018, 06:03 PM
And this stat could also be created for the TRAINER....or the jockey/trainer combination.

Charli125
01-08-2018, 06:59 PM
Recommend anything over 5 counts as 5, odds rank and finish...gets rid of a lot of the noise. ;)

I just ran this on my DB using the above and limiting to jockeys with at least 50 starts(should probably be 100). Not ready to share as the totals seem out of line so there might be a small issue but it really highlights the overbet and underbet jockeys. I'm sure it would do the same with trainers or any other connections.
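For anyone who wants to try the same thing, here is a rough Python sketch of that kind of run: cap both odds rank and finish at 5, average the score per jockey, and drop riders below a minimum number of starts. The data layout (tuples of jockey, odds rank, finish position) and the function names are assumptions for illustration, not how any particular database is organized.

```python
from collections import defaultdict


def capped_score(odds_rank, finish_position, cap=5):
    """Odds rank minus finish position, with anything over `cap` counted as `cap`."""
    return min(odds_rank, cap) - min(finish_position, cap)


def jockey_averages(mounts, min_starts=50):
    """mounts: iterable of (jockey, odds_rank, finish_position) tuples (assumed layout)."""
    totals = defaultdict(lambda: [0, 0])  # jockey -> [score sum, number of starts]
    for jockey, odds_rank, finish_position in mounts:
        totals[jockey][0] += capped_score(odds_rank, finish_position)
        totals[jockey][1] += 1
    # Keep only riders with enough starts for the average to mean something.
    return {j: total / n for j, (total, n) in totals.items() if n >= min_starts}
```

Raising `min_starts` to 100 just means passing a different argument.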

JustRalph
01-08-2018, 10:02 PM
It sounds great......then along comes Kent Desormeaux

cj
01-08-2018, 11:36 PM
I have to think back to how I programmed this. Riders that are almost always favored or second choice are going to have a lower rating most of the time...they have nowhere to go but down.

TheOracle
01-09-2018, 12:38 AM
Hey SG4

This sounds like a good idea: a public 4th choice finishing 2nd is a +2, but a public 1st choice finishing 5th is a -4.

My jockey data isn't where it needs to be, but I can do this for trainers at NYRA. I would use the morning lines, though, since the odds are subject to change while the race is actually running, as we've all seen.

http://www.paceadvantage.com/forum/showthread.php?t=142423

Also, I would imagine a public 1st choice finishing 1st = 0, which would indicate that the outcome was expected, so they wouldn't earn a score for that situation.

They would only earn a positive score if they finished better than what was expected, and I guess you can add up the scores to see if they are performing above or below expectations.

I will look into that. I'm all for anything that gets away from using earnings as a criterion to rate trainers.

BreadandButter
01-09-2018, 02:10 AM
Sounds much like the Wong Index. Named after jockey Tommy Wong. Fonner published them at the end of the 2017 meet.

classhandicapper
01-09-2018, 10:28 AM
I have to think back to how I programmed this. Riders that are almost always favored or second choice are going to have a lower rating most of the time...they have nowhere to go but down.

It should be possible to tweak the model a little to correct for that.

I'm not saying this is the solution, but if you look at all favorites and determine their average finish, then any finish above that would be a positive and any finish below that would be a negative.

You could do the same for 2nd choices, 3rd choices etc..

That way, even if every horse you rode was the favorite, you wouldn't have to win to avoid a negative rating. You'd just have to do average. And you could still get a positive rating if you won more than your fair share.
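A rough Python sketch of that adjustment: first work out the average finish for every odds rank across all starters, then score each mount against the average for its own rank. The input layout and function names are assumptions made for the example.

```python
from collections import defaultdict


def average_finish_by_rank(starts):
    """starts: iterable of (odds_rank, finish_position) for ALL starters in the sample."""
    sums = defaultdict(lambda: [0, 0])  # odds rank -> [finish sum, count]
    for odds_rank, finish_position in starts:
        sums[odds_rank][0] += finish_position
        sums[odds_rank][1] += 1
    return {rank: total / n for rank, (total, n) in sums.items()}


def score_vs_baseline(odds_rank, finish_position, baseline):
    """Positive when the horse finished better (a lower number) than average for its odds rank."""
    return baseline[odds_rank] - finish_position
```

With this version a rider who is always on the favorite scores zero for an average result instead of being penalized for anything short of a win.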

If you wanted to get crazier you could make adjustments for field size and actual odds.

If you are riding a 2/5 favorite in a 4 horse field that's different from riding a 3-1 favorite in a 12 horse field.

The one thing I recall about this kind of thing is that it gets a little tricky to do ranking in Access, which is where I do most of my work. I might have to export to Excel. I think it's a little easier in Excel.

cj
01-09-2018, 11:40 AM
Yeah, there are a lot of tweaks you can do. For simplicity, I did something like this.

=MIN(OddsRank+1,5)-MIN(FinishPosition,5)

That makes it so you get a positive for a win.
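For anyone not working in a spreadsheet, the same formula translates directly to Python. The function name is made up for this example; the arithmetic is the formula above.

```python
def shifted_capped_score(odds_rank: int, finish_position: int, cap: int = 5) -> int:
    """Same arithmetic as =MIN(OddsRank+1,5)-MIN(FinishPosition,5)."""
    return min(odds_rank + 1, cap) - min(finish_position, cap)


print(shifted_capped_score(1, 1))  # winning favorite -> 1
print(shifted_capped_score(1, 5))  # favorite finishes 5th -> -3
print(shifted_capped_score(4, 2))  # 4th choice finishes 2nd -> 3
```

The +1 on the odds rank is what lets a winning favorite score above zero, and the cap keeps a 10th choice finishing 11th from counting any differently than a 5th choice finishing 5th.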

Jeff P
01-09-2018, 01:25 PM
You can tweak something like this a lot of different ways.

In the end, no matter what tweaks are made -- hopefully, what you end up with is a number that measures performance vs. expected performance.

In his book Precision (https://www.amazon.com/Precision-Statistical-Mathematical-Methods-Racing/dp/1432768522), on page 80, CX Wong wrote about a stat called Cumulative Probability or F(x) and presented a formula for calculating it.

It takes a little programming to calculate it, but I've found that Cumulative Probability or F(x) does a decent job of measuring performance vs. expected performance in a statistically valid way. (All of the values generated end up being between 0 and 1 and any value >= 0.50 suggests the rider, trainer, sire, what have you, etc. outperformed expectations.)
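The formula from the book is not reproduced here, so the following is NOT Wong's formula. It is only a sketch of one way a "cumulative probability" style number between 0 and 1 can be built: treat each mount as an independent trial with an assumed win probability (for example, one implied by the odds), and ask how likely it would be to see fewer wins than the rider actually got. The function names and the choice of a strict "fewer than" are assumptions.

```python
def win_count_distribution(win_probs):
    """dist[k] = probability of exactly k wins, given independent per-mount win probabilities."""
    dist = [1.0]
    for p in win_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this mount loses
            nxt[k + 1] += mass * p      # this mount wins
        dist = nxt
    return dist


def prob_fewer_wins(win_probs, actual_wins):
    """P(wins < actual_wins) under the assumed probabilities; higher suggests outperformance."""
    return sum(win_count_distribution(win_probs)[:actual_wins])
```

Under this reading, a rider with zero wins always comes out at 0.0 regardless of sample size, so very small samples need to be treated with caution.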

Imo, with a stat like this, the way to get bang for your buck (meaning separation between yourself and the other players you are competing against in the pools) doesn't necessarily come from the tweaks, although they can provide a degree of separation. It comes from applying the stat in a conceptually unique way compared to how those other players are applying it.

For example, in the DRF podcast for his new book Betting with an Edge (http://www.paceadvantage.com/forum/showthread.php?t=141838), Mike Maloney talked about the value of knowing whether or not a rider likes to be inside or outside.

I'm not saying that knowing an inside rider from an outside rider is the be all end all of things (it's not.) But the database testing I've done suggests that knowing that and measuring performance produces better results than ignoring that and measuring performance for rider alone.

Imo, if you can give a stat like this some thought, and get creative about what is or isn't reflected in the odds:

And from there measure performance for rider, trainer, sire, what have you, etc, vs. expected performance given the situation...

I've found at that point you are more likely to actually HAVE something.

I hope I managed to type most of that out in a way that makes sense.


-jp

classhandicapper
01-09-2018, 01:32 PM
Here's a quick sample of the kind of thing you can do in 5 minutes.

This is Jockey performance for riders that had a minimum of 25 mounts on favorites at AQU, BEL and SAR. This covers about 2 1/2 years. Doing it by rank will take a little longer. Something like this could also obviously be broken up by track, distance, surface, running style etc...

The average finish position on favorites for all NYRA jockeys was 2.83.

So anything lower than 2.83 would be positive and anything higher would be negative. You can also see the average odds of the favorites each jockey rides to make an adjustment there.

formula_2002
01-09-2018, 01:34 PM
As in all odds-related plays, it would be best to determine a jockey's efficiency by comparing his actual wins to his expected wins (the forever a/e ratio :) ), using the final true dollar odds.
These comparisons should be further broken down by incremental odds ranges. The smaller the increment, the more accurate the comparison, but the more data it requires.
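A bare-bones Python sketch of an actual/expected ratio follows. It assumes each mount is recorded as (final odds to one, won or not) and converts odds to a raw probability of 1 / (odds + 1). That conversion ignores takeout and breakage, so it is a simplification of "true dollar odds", and the names are illustrative.

```python
def implied_win_prob(odds_to_one: float) -> float:
    """Raw win probability from final odds (3.0 means 3-1); takeout is ignored here."""
    return 1.0 / (odds_to_one + 1.0)


def actual_vs_expected(mounts):
    """mounts: iterable of (odds_to_one, won) pairs. Returns actual wins / expected wins."""
    mounts = list(mounts)
    actual = sum(1 for _, won in mounts if won)
    expected = sum(implied_win_prob(odds) for odds, _ in mounts)
    return actual / expected  # raises ZeroDivisionError if there are no mounts
```

A ratio above 1.0 means more wins than the odds implied. Breaking it down by odds range is just a matter of filtering `mounts` before calling the function.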

classhandicapper
01-09-2018, 02:50 PM
One of the problems with all these stats is that jockeys take money independently of the horses they ride.

We have to make sure what we want our metric to measure.

Do we want to measure betting value or do we want to measure riding skill?

A great rider could wind up on a lot of underlays because he's so popular. So the fact that his horses under-perform their odds may say nothing about his skill.

classhandicapper
01-09-2018, 02:55 PM
I'm not saying that knowing an inside rider from an outside rider is the be all end all of things (it's not.) But the database testing I've done suggests that knowing that and measuring performance produces better results than ignoring that and measuring performance for rider alone.


Something like that would be especially useful on an inside or outside biased track.

So would knowledge of which riders pick up on biases quickly and ride to their advantage and which don't. That also gets into connections. Some owners and trainers are on top of that and give instructions on bias to the rider. Others do not. Same with pace setup.

Jeff P
01-09-2018, 04:00 PM
Here's a quick sample of the kind of thing you can do in 5 minutes.

This is Jockey performance for riders that had a minimum of 25 mounts on favorites at AQU, BEL and SAR. This covers about 2 1/2 years. Doing it by rank will take a little longer. Something like this could also obviously be broken up by track, distance, surface, running style etc...

The average finish position on favorites for all NYRA jockeys was 2.83.

So anything lower than 2.83 would be positive and anything higher would be negative. You can also see the average odds of the favorites each jockey rides to make an adjustment there.

Yup.

That's a good illustration of using the type of analysis being discussed in this thread to come up with a rider rating.

I wanted to add to my previous post and provide some data.

The sample below shows all morning line favorites that have raced at what I consider to be A and B tracks over (roughly) the past three weeks:

query start: 1/9/2018 10:47:34 AM
query end: 1/9/2018 10:47:35 AM
elapsed time: 1 seconds

Data Window Settings:
Connected to: C:\JCapper\exe\JCapper2.mdb
999 Divisor Odds Cap: None
SQL UDM Plays Report: Hide

SQL: SELECT * FROM STARTERHISTORY
WHERE RANKMLINE=1
AND INSTR('AQU-GGX-GPX-HAW-LRL-PHA-TAM-SAX', TRACK) > 0
AND [DATE] >= #12-17-2017#
AND [DATE] <= #01-08-2018#
ORDER BY [DATE], TRACK, RACE


Data Summary            Win      Place       Show
-------------------------------------------------
Mutuel Totals       1043.50    1094.70    1080.40
Bet                -1258.00   -1258.00   -1258.00
-------------------------------------------------
P/L                 -214.50    -163.30    -177.60

Wins                    196        333        411
Plays                   629        629        629
PCT                   .3116      .5294      .6534

ROI                  0.8295     0.8702     0.8588
Avg Mut                5.32       3.29       2.63


Nothing earth-shattering there. (The above results are about what you'd expect from morning line favorites.)


I recently wrote an algorithm (involving a little bit of AI) that calculates Cumulative Probability or F(x) for each rider given the situation he or she finds himself or herself in.

Admittedly, classifying individual rider situations is something subjective on my part.

That said, the algorithm is programmed to analyze the attributes for each mount, and from there classify that mount as belonging to a basic category such as inside speed, inside closer, middle post speed, middle post closer, outside speed, or outside closer.

From there the algorithm is programmed to pull each rider's like mounts from the database and calculate F(x) -- with the resulting F(x) representing actual performance vs. expected performance over all the times the rider was asked to perform that specific task... for example, ride an inside closer in a sprint race on the dirt at today's track.
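To make the classify-then-group idea concrete, here is a small Python sketch. It is not the actual algorithm: the split of the gate into thirds, the field names, and the assumption that each mount already carries a "speed" or "closer" label are all invented for the example.

```python
from collections import defaultdict


def post_zone(post: int, field_size: int) -> str:
    """Hypothetical rule: divide the starting gate into thirds."""
    third = field_size / 3
    if post <= third:
        return "inside"
    if post > 2 * third:
        return "outside"
    return "middle post"


def classify_mount(post: int, field_size: int, run_style: str) -> str:
    """run_style is assumed to be 'speed' or 'closer' already."""
    return f"{post_zone(post, field_size)} {run_style}"


def group_like_mounts(mounts):
    """mounts: iterable of dicts with rider, post, field_size, run_style keys (assumed layout)."""
    groups = defaultdict(list)
    for m in mounts:
        category = classify_mount(m["post"], m["field_size"], m["run_style"])
        groups[(m["rider"], category)].append(m)
    return groups
```

Each (rider, category) group could then be scored separately, and further split by surface, distance, and track as described above.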

All of that said, here is the above sample broken out by rank for Rider (Fx) as described above:

By: SQL-F01 Rank -- F(x) for Rider, given the situation

Rank      P/L      Bet     Roi  Wins  Plays    Pct   Impact  AvgMut
-------------------------------------------------------------------
   1    48.60   190.00  1.2558    43     95  .4526   1.4526    5.55
   2   -56.80   152.00  0.6263    19     76  .2500   0.8023    5.01
   3    -4.50   126.00  0.9643    22     63  .3492   1.1207    5.52
   4   -56.20   130.00  0.5677    15     65  .2308   0.7406    4.92
   5   -46.60   170.00  0.7259    23     85  .2706   0.8684    5.37
   6    -7.00   142.00  0.9507    26     71  .3662   1.1752    5.19
   7   -35.80   122.00  0.7066    15     61  .2459   0.7891    5.75
   8   -35.00   104.00  0.6635    13     52  .2500   0.8023    5.31
   9   -32.00    68.00  0.5294     8     34  .2353   0.7551    4.50
  10    20.20    32.00  1.6313    10     16  .6250   2.0057    5.22
  11    -9.80    14.00  0.3000     1      7  .1429   0.4585    4.20
  12     0.40     8.00  1.0500     1      4  .2500   0.8023    8.40



And here is the above sample broken out by numeric value for Rider (Fx) as described above:

By: SQL-F01 Numeric Value -- F(x) for Rider, given the situation

    >=Min      < Max      P/L      Bet     Roi  Wins  Plays    Pct   Impact
---------------------------------------------------------------------------
 -99.0000     0.0000     0.00     0.00  0.0000     0      0  .0000   0.0000
   0.0000     0.0500    -3.80   160.00  0.9763    32     80  .4000   1.2837
   0.0500     0.1000   -15.70   128.00  0.8773    22     64  .3438   1.1032
   0.1000     0.1500   -76.10   170.00  0.5524    18     85  .2118   0.6796
   0.1500     0.2000   -14.60    82.00  0.8220    11     41  .2683   0.8610
   0.2000     0.2500   -26.60    64.00  0.5844     7     32  .2188   0.7020
   0.2500     0.3000    -8.70    76.00  0.8855    14     38  .3684   1.1823
   0.3000     0.3500     9.20    74.00  1.1243    12     37  .3243   1.0408
   0.3500     0.4000   -15.80    36.00  0.5611     4     18  .2222   0.7132
   0.4000     0.4500   -17.20    32.00  0.4625     2     16  .1250   0.4011
   0.4500     0.5000   -12.40    42.00  0.7048     6     21  .2857   0.9169
   0.5000     0.5500   -16.80    96.00  0.8250    17     48  .3542   1.1366
   0.5500     0.6000    -4.00    30.00  0.8667     5     15  .3333   1.0697
   0.6000     0.6500   -24.60    40.00  0.3850     3     20  .1500   0.4814
   0.6500     0.7000    -7.60    20.00  0.6200     2     10  .2000   0.6418
   0.7000     0.7500    -6.80    14.00  0.5143     2      7  .2857   0.9169
   0.7500     0.8000   -20.20    24.00  0.1583     1     12  .0833   0.2674
   0.8000     0.8500   -11.80    22.00  0.4636     2     11  .1818   0.5835
   0.8500     0.9000     1.30    20.00  1.0650     4     10  .4000   1.2837
   0.9000  9999.0000    57.70   128.00  1.4508    32     64  .5000   1.6046


Note the outperformance by the rank=1 mounts for Rider F(x).
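For readers unfamiliar with the report columns, the Roi and Impact figures in the tables above appear to follow from the other columns like this. This is inferred from the numbers themselves rather than from the software's documentation, so treat it as an assumption; the function names are made up.

```python
def roi(profit_loss: float, bet: float) -> float:
    """Return per dollar bet: (amount bet + profit or loss) / amount bet."""
    return (bet + profit_loss) / bet


def impact(group_wins: int, group_plays: int, all_wins: int, all_plays: int) -> float:
    """Group win rate divided by the win rate of the whole sample."""
    return (group_wins / group_plays) / (all_wins / all_plays)


# Rank 1 row checked against the overall sample (196 wins from 629 plays).
print(round(roi(48.60, 190.00), 4))         # 1.2558
print(round(impact(43, 95, 196, 629), 4))   # 1.4526
```

An impact value above 1.0 means the group won more often than the sample as a whole.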

Also note the outperformance at the extreme edges of the Rider F(x) numeric value distribution. In theory, any F(x) value over 0.50 represents outperformance.

Yet, in this sample, the strongest outperformance occurred when F(x) was greater than or equal to 0.85.

I'm guessing outperformance only when F(x) is greater than or equal to 0.85 may turn out to be the result of small sample noise. (My gut tells me a larger sample is needed.)

That said, this is the first sample I've generated using this technique and the results are promising (at least so far.)

I also wanted to touch on outperformance at the other edge of the sample -- specifically when F(x) is equal to zero.

A closer look at the data reveals this part of the sample is populated by riders who have a small number of mounts in the situation they are being queried for.

For example: If a rider only has three mounts as a closer with a far outside post in dirt routes at today's track -- and the query results come back as 0 for 3 with an F(x) of 0.00... That 0.00 is probably not a true representation of the rider's ability in that situation.


-jp

SG4
01-09-2018, 10:34 PM
Thanks for all the feedback! Glad this has made for an interesting discussion, and of course shows all the varied ways to approach gathering a worthwhile stat on this end.

I think my interest was more in what jockeys move longshots up, so the query of jockeys on just favored mounts may lend insight in one way, but in my expectations wouldn't be as valuable. Not to mention where were those 25 times Fernando Jara rode a favorite on the NYRA circuit in the last 3 years, I must've been on vacation all those days lol.

The most recent thought that inspired this idea was seeing some of Irad Ortiz's results lately, and it seemed like whenever he was on horses at decent odds they always outperformed expectations, which made me think he really is proving his worth as a top rider. But of course with the simplistic calculation I proposed initially of 0 points at best if you're riding a favorite, a jockey like him who is often on chalk will probably be under-rated in this method. The Wong #'s seemed to be on the right track far as something in a useful vein.

Where the best value lies far as real world application goes I think would be applying this stat to apprentices in their very nascent stages, I feel like if your eyes are open you'll start to see some of those initial longshots running well aren't just flukes & could be the markings of a solid new rider, and you can usually jump on this for a period before word is fully out among the betting public.

gm10
01-13-2018, 09:42 AM
Lots of good points made here ... just wanted to add two things

1) Another way of measuring how good a jockey is would be to express how much better than expected he does with his mounts. In Hong Kong, UK, Ireland, etc. you can just compare his or her speed/performance ratings with the official ratings. You could then conclude that (hypothetical example) Ryan Moore is worth an extra 2lbs.

2) In my experience it is crucial to calculate jockey ability over different surfaces/distances. For example .... some jocks are fine on the dirt but completely useless on the turf. Some otherwise moderate jocks seem to be much better over 5F than their overall stats would suggest.