Probability percentage from a rating - Horse Racing Forum - PaceAdvantage.Com

raybo · 05-22-2016, 07:55 PM

Ok you stats gurus, if I have a rating derived from multiple factors, each weighted according to significance, how do I get from that final rating to a projected win probability/percentage?

Let's start with the following final ratings for a race:

#1 -- 72.65
#2 -- 101.06
#3 -- 104.32
#4 -- 87.72
#5 -- 89.90
#6 -- 10.20
#7 -- 75.89
#8 -- 79.96
#9 - 105.25
#10 - 77.95
#11 - 73.84
#12 - 69.82

How do I get from those ratings to a calculated/projected win probability/percentage? I have always thought that you just divide each rating by the sum of all the ratings, but can't seem to find anything related to this type of calculation on the web, and I haven't tried to create a line in a long time, so I'm a bit rusty.

Dave Schwartz · 05-22-2016, 08:03 PM

Normalize it to 100%.

That is, add up the numbers and divide each score by the sum of the scores in the race.
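For anyone following along, here is a minimal Python sketch of that normalization, using the ratings from the first post. The function name `normalize` and the dictionary layout are just illustrative choices, not anything from a particular package.

```python
# Ratings from the first post, keyed by program number.
ratings = {1: 72.65, 2: 101.06, 3: 104.32, 4: 87.72, 5: 89.90, 6: 10.20,
           7: 75.89, 8: 79.96, 9: 105.25, 10: 77.95, 11: 73.84, 12: 69.82}


def normalize(scores):
    """Divide each score by the field total so the results sum to 1.0."""
    total = sum(scores.values())
    return {horse: score / total for horse, score in scores.items()}


probs = normalize(ratings)
for horse, p in probs.items():
    print(f"#{horse:<2} {ratings[horse]:>7.2f}  {p:.4f}")
```

Because this is a straight linear rescale, the probabilities keep exactly the same proportions as the ratings themselves.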

raybo · 05-22-2016, 09:22 PM

Quote:

Originally Posted by Dave Schwartz

Normalize it to 100%.

That is, add up the numbers and divide each score by the sum of the scores in the race.

So, I was correct that you divide each final rating by the sum of all the ratings in the race? If so, then these are the probabilities for each horse:

#1 -- 72.65 ---- 0.076589914
#2 -- 101.06 --- 0.106536888
#3 -- 104.32 --- 0.109981423
#4 -- 87.72 ---- 0.09247253
#5 -- 89.90 ---- 0.094780224
#6 -- 10.20 ---- 0.01074802
#7 -- 75.89 ---- 0.080008831
#8 -- 79.96 ---- 0.084296919
#9 - 105.25 --- 0.110960109
#10 - 77.95 --- 0.082172241
#11 - 73.84 --- 0.077847444
#12 - 69.82 --- 0.073605457

But, the odds those probabilities result in are not logical. They range from 8/1 to 13/1, except for the #6 horse who gets 92/1 odds.
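The odds quoted above follow from the usual fair-odds conversion, odds-to-1 = (1 - p) / p, with no takeout applied. A small sketch, with the helper name `fair_odds_to_one` made up for illustration:

```python
def fair_odds_to_one(p):
    """Fair odds-to-1 for win probability p, with no takeout: (1 - p) / p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return (1.0 - p) / p


# Probabilities copied from the table above: the top horse (#9),
# the lowest of the main group (#12), and the outlier (#6).
for horse, p in [(9, 0.110960109), (12, 0.073605457), (6, 0.01074802)]:
    print(f"#{horse}: {fair_odds_to_one(p):.1f}/1")
```

Rounded to whole numbers these come out at 8/1, 13/1 and 92/1, the same range described above.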

headhawg · 05-22-2016, 09:56 PM

raybo,

Have you ever read the Four Quarters of Horse Investing by Steve Fierro? I think that trying to assign a "fair" oddsline to more than four or five horses is just folly. If you read his book you will understand what I mean. I'm not trying to diminish what you're attempting, but I think that fair odds becomes nearly impossible to do with any accuracy once the projected win probabilities fall below a certain percentage.

Good luck

raybo · 05-22-2016, 10:12 PM

Quote:

Originally Posted by headhawg

raybo,

Have you ever read the Four Quarters of Horse Investing by Steve Fierro? I think that trying to assign a "fair" oddsline to more than four or five horses is just folly. If you read his book you will understand what I mean. I'm not trying to diminish what you're attempting, but I think that fair odds becomes nearly impossible to do with any accuracy once the projected win probabilities fall below a certain percentage.

Good luck

I have not read Fierro's book, but I've heard that one should only consider "actual" win contenders. While that's probably true, it doesn't address my problem, that being assigning win probabilities and their associated decimal odds to all horses in the race. There must be a way to scale the ratings in order to obtain more realistic probabilities and odds.

Sure, it would be easy to consider only those horses with at least a 10% probability, regarding assigning odds, however, there can be horses below that 10% that are decent bets, especially when you consider the price you're likely to get. Lots of people have automated methods of assigning realistic odds to every horse in a race. I guess that is my real question, how do they do it, without ending up with something unrealistic like this example?

I mean, really, in my example, there is a horse with a 105 rating, another with a 104, and another with a 101, with the next one way down at 89. And yet, all 4 of those top horses are only a couple of percentage points apart. The spread should be larger than that, IMO.

whodoyoulike · 05-22-2016, 11:27 PM

Isn't this similar to creating a M/L which means you'd need to consider the takeout % which differs by track if you're creating the odds e.g., some are 0.15 and others are 0.18 etc.?

How accurate are your ratings since they are very similar to each other which I would think the probabilities would then be close to each other?
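On the takeout point, a rough sketch of how a takeout rate would shorten the odds attached to a given probability. It assumes a simple pari-mutuel pool bet in proportion to the probabilities and ignores breakage; the function name `tote_style_odds` is hypothetical.

```python
def tote_style_odds(p, takeout):
    """Odds-to-1 if the win pool were bet in proportion to p, after takeout.

    The payout per unit staked (stake included) is (1 - takeout) / p,
    so the odds-to-1 are that value minus 1. Breakage is ignored.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return (1.0 - takeout) / p - 1.0


# A 25% horse at the two takeout rates mentioned above.
for takeout in (0.15, 0.18):
    print(f"takeout {takeout:.2f}: {tote_style_odds(0.25, takeout):.2f}/1")
```

Takeout shifts every horse's odds by the same factor, so it changes the level of the line but not how far apart the horses are from one another.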

raybo · 05-22-2016, 11:43 PM

Quote:

Originally Posted by whodoyoulike

Isn't this similar to creating a M/L which means you'd need to consider the takeout % if you're creating the odds?

How accurate are your ratings because they are very similar which I would think the probabilities would then be close to each other?

I have not considered takeout, yet, but I would suspect that the relationships between ratings would not change much, if any. The final ratings would still be too close to one another, for too many horses.

Perhaps the individual factor ratings are too "normalized", and that is causing the final ratings' probabilities to be too tight. I normalized all of the individual factor ratings to the same point range, 0 to 100. That might be the underlying problem. My thinking was that all of the factors should be relational, regarding scale, so that the weightings would not be skewed, higher or lower, simply due to the existence of different factor ranges. I've not much experience with weighted factors, so I'm learning as I go.

Dave Schwartz · 05-23-2016, 12:03 AM

Quote:

But, the odds those probabilities result in are not logical. They range from 8/1 to 13/1, except for the #6 horse who gets 92/1 odds.

They are "logical."

The issue is that your numbers do not scale as the tote board does. They are "flat."

As an example, suppose you are building probabilities based upon speed ratings where every horse is in the range of 90-100. The difference from top to bottom is only 10 points. When you graph the horses you get a lightly sloping line or curve.

If you think about it, you don't actually want the speed ratings. You want something that represents how the speed ratings translate into win percentages or impact values.

In other words, you want to ask (and answer) such questions as "How does a horse 3 points below the top horse in the field perform?" (Or 5 points or 8 points, or 17 points.) There are more creative ways to express this but I was trying to keep it simple.
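One way to start answering that kind of question is simply to tabulate wins by how far each horse sat below the top rating. The sketch below shows only the bookkeeping; the `history` rows are HYPOTHETICAL, made up for illustration, and a real study would need thousands of races.

```python
from collections import defaultdict

# HYPOTHETICAL records: (points below the top-rated horse in that race, won?).
# These rows are invented purely to show the tabulation.
history = [(0, True), (0, False), (0, False), (3, False), (3, True),
           (3, False), (3, False), (5, False), (5, False), (8, False)]


def win_rate_by_gap(records):
    """Return {gap: wins / starts} for each points-below-top value."""
    starts = defaultdict(int)
    wins = defaultdict(int)
    for gap, won in records:
        starts[gap] += 1
        wins[gap] += int(won)
    return {gap: wins[gap] / starts[gap] for gap in sorted(starts)}


rates = win_rate_by_gap(history)
for gap, rate in rates.items():
    print(f"{gap:>2} points below top: {rate:.1%} winners")
```

The resulting win rates (or impact values built from them) are what would replace the raw rating when building the line.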

Tagging factors with weighted values is not an easy task.

You need to go back and question your original weighting system.

As a long-standing member on PA, I respect your many contributions. If you'd care to contact me, I'd be willing to spend some time with you in an online meeting and work through some of this stuff and help any way I can.

Just email me.
Dave

raybo · 05-23-2016, 12:17 AM

Quote:

Originally Posted by Dave Schwartz

Tagging factors with weighted values is not an easy task.

You need to go back and question your original weighting system.

Dave

I appreciate the response Dave!

Are you saying that, by normalizing all of the individual factor ratings to the same scale, the factor weightings/multipliers will not have the desired effect?

The fact that some factor ratings have a very tight range, from highest to lowest, while others have a wider range, was the very reason I decided to normalize all the factor ratings to the same scale. Obviously, that doesn't work well, as the wider factor ranges get tightened up, while the tighter factor ranges get widened.

whodoyoulike · 05-23-2016, 01:30 AM

I agree with Dave's response which is the reason I asked "how accurate are your ratings". Maybe instead of calculating probabilities based on the ratings, approach it as Dave suggested.

Quote:

Originally Posted by Dave Schwartz

... If you think about it, you don't actually want the speed ratings. You want something that represents how the speed ratings translate into win percentages or impact values.

In other words, you want to ask (and answer) such questions as "How does a horse 3 points below the top horse in the field perform?" (Or 5 points or 8 points, or 17 points.) There are more creative ways to express this but I was trying to keep it simple. ...

Dave Schwartz · 05-23-2016, 01:52 AM

Quote:

Are you saying that, by normalizing all of the individual factor ratings to the same scale, the factor weightings/multipliers will not have the desired effect?

The fact that some factor ratings have a very tight range, from highest to lowest, while others have a wider range, was the very reason I decided to normalize all the factor ratings to the same scale. Obviously, that doesn't work well, as the wider factor ranges get tightened up, while the tighter factor ranges get widened.

Yes. All factors must scale similarly.

Example: In my system, all factors scale from 1 to 250.

Thus, Quirin Points go from 0 to 8, having 9 actual choices. Therefore, the value of a point is 27 points each and look like this:

0=1-27
1=28-54
2=55-81
3=82-108
4=109-135
5=136-162
6=163-189
7=190-216
8=217-243

The value of Quirin Points = 5 is returned as 149, the center point between high and low.

Speed ratings are scaled from the top of the field down to 25 points below the top horse. This allows for 25 numbers. Specifically, if the top horse is a 97, he becomes 100 and the lowest horse is graded at 75. Any horse that comes in below 75 gets a 75.

Now that we know there will always be 25 numbers, we know that each number is worth 250/25 = 10 points.
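A rough Python sketch of the two scalings described above. The function names are made up, and the last step (mapping the 75 floor to 0 and the top horse to 250) is an ASSUMPTION, since the post only says each point is worth 10.

```python
def quirin_band(q, width=27):
    """Return (low, high, center) of the band for Quirin Points q (0 to 8)."""
    if not 0 <= q <= 8:
        raise ValueError("Quirin Points run from 0 to 8")
    low = q * width + 1
    high = (q + 1) * width
    return low, high, (low + high) // 2


def rescale_speed(rating, top, floor_gap=25):
    """Shift the field so the top horse is 100; floor at 100 - floor_gap."""
    return max(100 - (top - rating), 100 - floor_gap)


def speed_points(rating, top, floor_gap=25, scale_max=250):
    """ASSUMPTION: the floor (75) maps to 0 and the top horse (100) to 250."""
    per_point = scale_max / floor_gap  # 250 / 25 = 10 points per rating point
    floor = 100 - floor_gap
    return (rescale_speed(rating, top, floor_gap) - floor) * per_point


print(quirin_band(5))         # band and center for 5 Quirin Points
print(rescale_speed(90, 97))  # a 90 in a field topped by a 97
print(speed_points(90, 97))   # the same horse on the common scale
```

The point of both functions is the same: every factor ends up on one common range before any weights are applied.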

maddog42 · 05-23-2016, 08:06 AM

Quote:

Originally Posted by raybo

I have not read Fierro's book, but I've heard that one should only consider "actual" win contenders. While that's probably true, it doesn't address my problem, that being assigning win probabilities and their associated decimal odds to all horses in the race. There must be a way to scale the ratings in order to obtain more realistic probabilities and odds.

Sure, it would be easy to consider only those horses with at least a 10% probability, regarding assigning odds, however, there can be horses below that 10% that are decent bets, especially when you consider the price you're likely to get. Lots of people have automated methods of assigning realistic odds to every horse in a race. I guess that is my real question, how do they do it, without ending up with something unrealistic like this example?

I mean, really, in my example, there is a horse with a 105 rating, another with a 104, and another with a 101, with the next one way down at 89. And yet, all 4 of those top horses are only a couple of percentage points apart. The spread should be larger than that, IMO.

As a proponent of Fierro's book, I must admit that your dilemma is a common one for me. About 90 percent of the time I am able to get a field down to 4 or 5 contenders. That other 10 percent is as difficult as peeling an onion to find the differences. The onion gets sliced super thin and is almost transparent. This is something that you already know. There is value in the 6th and 7th contender sometimes. Hard to prove, but I know it is there.
Reality is the only thing that matters. Let your records show you how to manipulate and assign odds. This is complicated by the different distances/categories. At some distances/tracks my line is a wreck and recognizing this is a way to make money at this game.

davew · 05-23-2016, 10:26 AM

Quote:

Originally Posted by raybo

Ok you stats gurus, if I have a rating derived from multiple factors, each weighted according to significance, how do I get from that final rating to a projected win probability/percentage?

Let's start with the following final ratings for a race:

#1 -- 72.65
#2 -- 101.06
#3 -- 104.32
#4 -- 87.72
#5 -- 89.90
#6 -- 10.20
#7 -- 75.89
#8 -- 79.96
#9 - 105.25
#10 - 77.95
#11 - 73.84
#12 - 69.82

How do I get from those ratings to a calculated/projected win probability/percentage? I have always thought that you just divide each rating by the sum of all the ratings, but can't seem to find anything related to this type of calculation on the web, and I haven't tried to create a line in a long time, so I'm a bit rusty.

I have been trying to devise a method using standard deviations, but still have some snags. If you did probabilities by hand (from your numbers) what would you get? I guesstimated a probability line, but would have to see a few thousand races (with results) to see correlation with your numbers and results -> what percent of time would 2,3, or 9 win? Being able to go to the tenths would really help in lower odds horses...

#1 -- 72.65 -- 2
#2 -- 101.06 - 16
#3 -- 104.32 - 20
#4 -- 87.72 -- 9
#5 -- 89.90 -- 10
#6 -- 10.20 -- 1
#7 -- 75.89 -- 4
#8 -- 79.96 -- 6
#9 - 105.25 -- 22
#10 - 77.95 -- 5
#11 - 73.84 -- 3
#12 - 69.82 -- 2
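One possible standard-deviation approach, sketched in Python: convert each rating to a z-score within the field, exponentiate, and normalize. This is a softmax transform swapped in for illustration, not the method behind the hand-made line above, and the `sharpness` parameter is an assumption that would have to be fitted against real results.

```python
import math
import statistics

# Ratings from the first post, keyed by program number.
ratings = {1: 72.65, 2: 101.06, 3: 104.32, 4: 87.72, 5: 89.90, 6: 10.20,
           7: 75.89, 8: 79.96, 9: 105.25, 10: 77.95, 11: 73.84, 12: 69.82}


def softmax_line(scores, sharpness=1.0):
    """Turn ratings into probabilities via z-scores and an exponential.

    sharpness is a free tuning parameter (an assumption): larger values
    push more of the probability toward the top-rated horses.
    """
    values = list(scores.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # Every horse has the same rating, so split the probability evenly.
        return {horse: 1.0 / len(scores) for horse in scores}
    weights = {horse: math.exp(sharpness * (score - mean) / sd)
               for horse, score in scores.items()}
    total = sum(weights.values())
    return {horse: weight / total for horse, weight in weights.items()}


line = softmax_line(ratings)
for horse, p in sorted(line.items(), key=lambda item: -item[1]):
    print(f"#{horse:<2} {p:.1%}")
```

Note that an outlier like the #6 horse inflates the field's standard deviation, which compresses the z-scores of everyone else, so the choice of `sharpness` and the handling of outliers both matter.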

raybo · 05-23-2016, 11:51 AM

Quote:

Originally Posted by davew

I have been trying to devise a method using standard deviations, but still have some snags. If you did probabilities by hand (from your numbers) what would you get? I guesstimated a probability line, but would have to see a few thousand races (with results) to see correlation with your numbers and results -> what percent of time would 2,3, or 9 win? Being able to go to the tenths would really help in lower odds horses...

#1 -- 72.65 -- 2
#2 -- 101.06 - 16
#3 -- 104.32 - 20
#4 -- 87.72 -- 9
#5 -- 89.90 -- 10
#6 -- 10.20 -- 1
#7 -- 75.89 -- 4
#8 -- 79.96 -- 6
#9 - 105.25 -- 22
#10 - 77.95 -- 5
#11 - 73.84 -- 3
#12 - 69.82 -- 2

Obviously, at least to me, if the ratings are fairly accurate, then the top 3 horses should win about 60% of the time, over time. Your line has them at 58%, that's very close. Based on that, how did you differentiate between those 3 horses to get their portion of the 58% total probability? It appears that the relationships are not linear, but I haven't a clue as to how to come up with that non-linear scale/slope.

This set of numbers seems promising, regarding the separation/relationship between the 3 ratings. I found the average of the 3 ratings and then subtracted that average from each of them.

#2 _ -2.483333333
#3 _ 0.776666667
#9 _ 1.706666667

Adding those differences to an equal one-third share of the 58% (19.33 each) gives these probability percentages:

16.85
20.11
21.04
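The three percentages are consistent with adding each horse's deviation from the group average to an equal one-third share of the 58%. That reading is inferred from the arithmetic rather than spelled out in the post; a short sketch to check it:

```python
# Ratings of the three top horses, keyed by program number.
top3 = {2: 101.06, 3: 104.32, 9: 105.25}
group_share = 58.0  # combined percentage given to #2, #3 and #9 in the quoted line

mean_rating = sum(top3.values()) / len(top3)
equal_share = group_share / len(top3)

# Each horse gets an equal share plus its rating's deviation from the mean.
line = {horse: equal_share + (rating - mean_rating)
        for horse, rating in top3.items()}

for horse, pct in line.items():
    print(f"#{horse}: {pct:.2f}%")
```

This treats one rating point as one percentage point, which happens to look reasonable here but is an arbitrary scale; the deviations sum to zero, so the three shares still total 58.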

lansdale · 05-23-2016, 05:23 PM

Quote:

Originally Posted by raybo

Ok you stats gurus, if I have a rating derived from multiple factors, each weighted according to significance, how do I get from that final rating to a projected win probability/percentage?

Let's start with the following final ratings for a race:

#1 -- 72.65
#2 -- 101.06
#3 -- 104.32
#4 -- 87.72
#5 -- 89.90
#6 -- 10.20
#7 -- 75.89
#8 -- 79.96
#9 - 105.25
#10 - 77.95
#11 - 73.84
#12 - 69.82

How do I get from those ratings to a calculated/projected win probability/percentage? I have always thought that you just divide each rating by the sum of all the ratings, but can't seem to find anything related to this type of calculation on the web, and I haven't tried to create a line in a long time, so I'm a bit rusty.

Hi Raybo,

I'm a little confused from what you've said in this thread whether this is a list of variable weights or a list of horses whose output is projected according to such weights. I'm guessing the latter.

If so, what the range of this data seems to resemble to me, since you've mentioned that your method is in the black, is the $net of a given field based on a few simple factors, which might explain the clustering. Also, since you've mentioned 'top 3' ranking as a part of your method, possibly you're penalizing horses who fall out of this grouping, which would be consistent with this result. Since your description of your method implies that this is what you have sought to maximize, it would seem to make sense.

If it's not possible that this is what you've done, you already know this. But if it is, I would suggest just moving the decimal point two figures to the left and testing this against a database (you mentioned you're a client of J. Platt), and see how it stands up against a reasonably large sample. BTW, since the mean of even this small sample is 79, which would mean a return of .79 vs. all horses, which is quite close to what I believe is the mean return of all horses by the betting public, this may be quite accurate.

Cheers,

lansdale