Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 05-22-2016, 07:55 PM   #1
raybo
EXCEL with SUPERFECTAS
 
 
Join Date: Mar 2004
Posts: 10,206
Probability percentage from a rating

Ok you stats gurus, if I have a rating derived from multiple factors, each weighted according to significance, how do I get from that final rating to a projected win probability/percentage?

Let's start with the following final ratings for a race:

#1 -- 72.65
#2 -- 101.06
#3 -- 104.32
#4 -- 87.72
#5 -- 89.90
#6 -- 10.20
#7 -- 75.89
#8 -- 79.96
#9 - 105.25
#10 - 77.95
#11 - 73.84
#12 - 69.82

How do I get from those ratings to a calculated/projected win probability/percentage? I have always thought that you just divide each rating by the sum of all the ratings, but can't seem to find anything related to this type of calculation on the web, and I haven't tried to create a line in a long time, so I'm a bit rusty.
__________________
Ray
Horseracing's like the stock market except you don't have to wait as long to go broke.

Excel Spreadsheet Handicapping Forum

Charter Member: Horseplayers Association of North America
Old 05-22-2016, 08:03 PM   #2
Dave Schwartz
 
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,909
Normalize it to 100%.

That is, add up the numbers and divide each score by the sum of the scores in the race.

Last edited by Dave Schwartz; 05-22-2016 at 08:05 PM.
Old 05-22-2016, 09:22 PM   #3
raybo
EXCEL with SUPERFECTAS
 
 
Join Date: Mar 2004
Posts: 10,206
Quote:
Originally Posted by Dave Schwartz
Normalize it to 100%.

That is, add up the numbers and divide each score by the sum of the scores in the race.
So, I was correct that you divide each final rating by the sum of all the ratings in the race? If so, then these are the probabilities for each horse:

#1 -- 72.65 ---- 0.076589914
#2 -- 101.06 --- 0.106536888
#3 -- 104.32 --- 0.109981423
#4 -- 87.72 ---- 0.09247253
#5 -- 89.90 ---- 0.094780224
#6 -- 10.20 ---- 0.01074802
#7 -- 75.89 ---- 0.080008831
#8 -- 79.96 ---- 0.084296919
#9 - 105.25 --- 0.110960109
#10 - 77.95 --- 0.082172241
#11 - 73.84 --- 0.077847444
#12 - 69.82 --- 0.073605457

But, the odds those probabilities result in are not logical. They range from 8/1 to 13/1, except for the #6 horse who gets 92/1 odds.
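The arithmetic in this post can be reproduced with a short sketch. The function names `normalize` and `fair_odds_to_one` are made up for illustration, and the ratings are the two-decimal values as posted, so the last digits may differ slightly from the figures above.

```python
# Ratings as posted in the thread (program number -> final rating).
ratings = {
    1: 72.65, 2: 101.06, 3: 104.32, 4: 87.72, 5: 89.90, 6: 10.20,
    7: 75.89, 8: 79.96, 9: 105.25, 10: 77.95, 11: 73.84, 12: 69.82,
}

def normalize(ratings):
    """Divide each rating by the sum of all ratings in the race."""
    total = sum(ratings.values())
    return {horse: r / total for horse, r in ratings.items()}

def fair_odds_to_one(p):
    """Convert a win probability to fair odds expressed as X/1 (no takeout)."""
    return 1.0 / p - 1.0

probs = normalize(ratings)
for horse, p in probs.items():
    print(f"#{horse:<2} {ratings[horse]:7.2f}  p={p:.4f}  odds={fair_odds_to_one(p):5.1f}/1")
```

Because every rating except #6 sits between roughly 70 and 105, each share of the total lands between about 7% and 11%, which is why the line comes out so flat.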
Old 05-22-2016, 09:56 PM   #4
headhawg
crusty old guy
 
 
Join Date: Aug 2003
Location: Snarkytown USA
Posts: 3,917
raybo,

Have you ever read the Four Quarters of Horse Investing by Steve Fierro? I think that trying to assign a "fair" oddsline to more than four or five horses is just folly. If you read his book you will understand what I mean. I'm not trying to diminish what you're attempting, but I think that fair odds becomes nearly impossible to do with any accuracy once the projected win probabilities fall below a certain percentage.

Good luck
Old 05-22-2016, 10:12 PM   #5
raybo
EXCEL with SUPERFECTAS
 
 
Join Date: Mar 2004
Posts: 10,206
Quote:
Originally Posted by headhawg
raybo,

Have you ever read the Four Quarters of Horse Investing by Steve Fierro? I think that trying to assign a "fair" oddsline to more than four or five horses is just folly. If you read his book you will understand what I mean. I'm not trying to diminish what you're attempting, but I think that fair odds becomes nearly impossible to do with any accuracy once the projected win probabilities fall below a certain percentage.

Good luck
I have not read Fierro's book, but I've heard that one should only consider "actual" win contenders. While that's probably true, it doesn't address my problem, that being assigning win probabilities and their associated decimal odds to all horses in the race. There must be a way to scale the ratings in order to obtain more realistic probabilities and odds.

Sure, it would be easy to consider only those horses with at least a 10% probability, regarding assigning odds, however, there can be horses below that 10% that are decent bets, especially when you consider the price you're likely to get. Lots of people have automated methods of assigning realistic odds to every horse in a race. I guess that is my real question, how do they do it, without ending up with something unrealistic like this example?

I mean, really, in my example, there is a horse with a 105 rating, another with a 104, and another with a 101, with the next one way down at 89. And yet, all 4 of those top horses are only a couple of percentage points apart. The spread should be larger than that, IMO.

Last edited by raybo; 05-22-2016 at 10:19 PM.
Old 05-22-2016, 11:27 PM   #6
whodoyoulike
Veteran
 
Join Date: Aug 2005
Posts: 3,428
Isn't this similar to creating a M/L which means you'd need to consider the takeout % which differs by track if you're creating the odds e.g., some are 0.15 and others are 0.18 etc.?

How accurate are your ratings since they are very similar to each other which I would think the probabilities would then be close to each other?
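As a rough illustration of the takeout point: in a parimutuel win pool, a horse holding a given share of the pool pays about (1 - takeout) / share per dollar, ignoring breakage. The sketch below uses the 0.15 and 0.18 takeout figures mentioned above; the function name and the 20% pool share are assumptions chosen only for the example.

```python
def tote_odds_to_one(pool_share, takeout):
    """Approximate tote odds (X/1) for a horse holding pool_share of the win pool.

    Ignores breakage; pool_share is the fraction of the win pool bet on the horse.
    """
    return (1.0 - takeout) / pool_share - 1.0

# Same 20% pool share under the two takeout rates mentioned in the post.
for takeout in (0.15, 0.18):
    print(f"takeout={takeout:.2f}  odds={tote_odds_to_one(0.20, takeout):.2f}/1")
```

Takeout shifts every price down by the same factor, so it would not by itself change how far apart the horses are from one another.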

Last edited by whodoyoulike; 05-22-2016 at 11:31 PM.
Old 05-22-2016, 11:43 PM   #7
raybo
EXCEL with SUPERFECTAS
 
 
Join Date: Mar 2004
Posts: 10,206
Quote:
Originally Posted by whodoyoulike
Isn't this similar to creating a M/L which means you'd need to consider the takeout % if you're creating the odds?

How accurate are your ratings because they are very similar which I would think the probabilities would then be close to each other?
I have not considered takeout, yet, but I would suspect that the relationships between ratings would not change much, if any. The final ratings would still be too close to one another, for too many horses.

Perhaps the individual factor ratings are too "normalized", and that is causing the final ratings' probabilities to be too tight. I normalized all of the individual factor ratings to the same point range, 0 to 100. That might be the underlying problem. My thinking was that all of the factors should be relational, regarding scale, so that the weightings would not be skewed, higher or lower, simply due to the existence of different factor ranges. I've not much experience with weighted factors, so I'm learning as I go.
Old 05-23-2016, 12:03 AM   #8
Dave Schwartz
 
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,909
Quote:
But, the odds those probabilities result in are not logical. They range from 8/1 to 13/1, except for the #6 horse who gets 92/1 odds.
They are "logical."

The issue is that your numbers do not scale as the tote board does. They are "flat."

As an example, suppose you are building probabilities based upon speed ratings where every horse is in the range of 90-100. The difference from top to bottom is only 10 points. When you graph the horses you get a lightly sloping line or curve.

If you think about it, you don't actually want the speed ratings. You want something that represents how the speed ratings translate into win percentages or impact values.

In other words, you want to ask (and answer) such questions as "How does a horse 3 points below the top horse in the field perform?" (Or 5 points or 8 points, or 17 points.) There are more creative ways to express this but I was trying to keep it simple.

Tagging factors with weighted values is not an easy task.

You need to go back and question your original weighting system.
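One way to answer a question like "how does a horse 3 points below the top horse perform?" is to tabulate past results by the gap to the top rating in each field. The sketch below is only an illustration: the four races are invented data, and `win_rate_by_gap` and its 3-point bucket width are hypothetical choices, not anything described in this thread.

```python
from collections import defaultdict

# Invented history: one list per race of (rating, won) pairs.
races = [
    [(100, True), (97, False), (92, False)],
    [(95, False), (92, True), (80, False)],
    [(98, True), (95, False), (90, False)],
    [(90, False), (87, False), (82, True)],
]

def win_rate_by_gap(races, bucket=3):
    """Win rate grouped by how many points a horse sits below the field's top rating."""
    starts = defaultdict(int)
    wins = defaultdict(int)
    for field in races:
        top = max(rating for rating, _ in field)
        for rating, won in field:
            # Round the gap down to the start of its bucket (0, 3, 6, ...).
            gap = int((top - rating) // bucket) * bucket
            starts[gap] += 1
            wins[gap] += int(won)
    return {gap: wins[gap] / starts[gap] for gap in sorted(starts)}

print(win_rate_by_gap(races))
```

With a real database of a few thousand races, the resulting win rates per gap could stand in for the raw ratings before normalizing to 100%.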


As a long-standing member on PA, I respect your many contributions. If you'd care to contact me, I'd be willing to spend some time with you in an online meeting and work through some of this stuff and help any way I can.

Just email me.
Dave
Old 05-23-2016, 12:17 AM   #9
raybo
EXCEL with SUPERFECTAS
 
 
Join Date: Mar 2004
Posts: 10,206
Quote:
Originally Posted by Dave Schwartz

Tagging factors with weighted values is not an easy task.

You need to go back and question your original weighting system.



Dave
I appreciate the response Dave!

Are you saying that, by normalizing all of the individual factor ratings to the same scale, the factor weightings/multipliers will not have the desired effect?

The fact that some factor ratings have a very tight range, from highest to lowest, while others have a wider range, was the very reason I decided to normalize all the factor ratings to the same scale. Obviously, that doesn't work well, as the wider factor ranges get tightened up, while the tighter factor ranges get widened.
Old 05-23-2016, 01:30 AM   #10
whodoyoulike
Veteran
 
Join Date: Aug 2005
Posts: 3,428
I agree with Dave's response which is the reason I asked "how accurate are your ratings". Maybe instead of calculating probabilities based on the ratings, approach it as Dave suggested.

Quote:
Originally Posted by Dave Schwartz
... If you think about it, you don't actually want the speed ratings. You want something that represents how the speed ratings translate into win percentages or impact values.

In other words, you want to ask (and answer) such questions as "How does a horse 3 points below the top horse in the field perform?" (Or 5 points or 8 points, or 17 points.) There are more creative ways to express this but I was trying to keep it simple. ...
Old 05-23-2016, 01:52 AM   #11
Dave Schwartz
 
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,909
Quote:
Are you saying that, by normalizing all of the individual factor ratings to the same scale, the factor weightings/multipliers will not have the desired effect?

The fact that some factor ratings have a very tight range, from highest to lowest, while others have a wider range, was the very reason I decided to normalize all the factor ratings to the same scale. Obviously, that doesn't work well, as the wider factor ranges get tightened up, while the tighter factor ranges get widened.
Yes. All factors must scale similarly.

Example: In my system, all factors scale from 1 to 250.

Thus, Quirin Points go from 0 to 8, having 9 actual choices. Therefore, each Quirin point is worth 27 scaled points, and the bands look like this:

0=1-27
1=28-54
2=55-81
3=82-108
4=109-135
5=136-162
6=163-189
7=190-216
8=217-243

The value of Quirin Points = 5 is returned as 149, the center point between the band's low (136) and high (162).


Speed ratings are scaled from the top of the field down to minus 25 below the top horse. This allows for 25 numbers. Specifically, if the top horse is a 97, he becomes 100 and the lowest horse is graded at 75. Any horse that comes in below 75 gets a 75.

Now that we know there will always be 25 numbers, we know that each number is worth 250/25 = 10 points.
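The two scalings described in this post can be sketched as follows. The band table and the 149 midpoint come straight from the numbers above (149 is the center of the 136-162 band); the final mapping of speed ratings onto scaled points is an assumption, since the post only says each number is worth 10 points. All function names are hypothetical.

```python
BAND = 27  # width of each Quirin band, as quoted in the post

def quirin_band(q):
    """Return the (low, high) scaled range for a Quirin value from 0 to 8."""
    if not 0 <= q <= 8:
        raise ValueError("Quirin points run from 0 to 8")
    return q * BAND + 1, (q + 1) * BAND

def quirin_scaled(q):
    """Return the center point of the band for a Quirin value."""
    low, high = quirin_band(q)
    return (low + high) // 2

def speed_adjusted(rating, top):
    """Shift the field so the top horse is 100, with a floor of 75."""
    return max(75, 100 - (top - rating))

def speed_scaled(rating, top):
    # Assumption: 10 scaled points per rating point above the 75 floor.
    return (speed_adjusted(rating, top) - 75) * 10

print(quirin_band(5), quirin_scaled(5))
print(speed_adjusted(97, 97), speed_adjusted(60, 97), speed_scaled(92, 97))
```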
Old 05-23-2016, 08:06 AM   #12
maddog42
Registered User
 
Join Date: Jan 2011
Posts: 2,357
Quote:
Originally Posted by raybo
I have not read Fierro's book, but I've heard that one should only consider "actual" win contenders. While that's probably true, it doesn't address my problem, that being assigning win probabilities and their associated decimal odds to all horses in the race. There must be a way to scale the ratings in order to obtain more realistic probabilities and odds.

Sure, it would be easy to consider only those horses with at least a 10% probability, regarding assigning odds, however, there can be horses below that 10% that are decent bets, especially when you consider the price you're likely to get. Lots of people have automated methods of assigning realistic odds to every horse in a race. I guess that is my real question, how do they do it, without ending up with something unrealistic like this example?

I mean, really, in my example, there is a horse with a 105 rating, another with a 104, and another with a 101, with the next one way down at 89. And yet, all 4 of those top horses are only a couple of percentage points apart. The spread should be larger than that, IMO.
As a proponent of Fierro's book, I must admit that your dilemma is a common one for me. About 90 percent of the time I am able to get a field down to 4 or 5 contenders. That other 10 percent is as difficult as peeling an onion to find the differences. The onion gets sliced super thin and is almost transparent. This is something that you already know. There is value in the 6th and 7th contender sometimes. Hard to prove, but I know it is there.
Reality is the only thing that matters. Let your records show you how to manipulate and assign odds. This is complicated by the different distances/categories. At some distances/tracks my line is a wreck, and recognizing this is a way to make money at this game.
__________________
There are more things in Heaven and Earth Horatio, than are dreamed of in your philosophy.
Old 05-23-2016, 10:26 AM   #13
davew
Registered User
 
Join Date: May 2011
Posts: 22,638
Quote:
Originally Posted by raybo
Ok you stats gurus, if I have a rating derived from multiple factors, each weighted according to significance, how do I get from that final rating to a projected win probability/percentage?

Let's start with the following final ratings for a race:

#1 -- 72.65
#2 -- 101.06
#3 -- 104.32
#4 -- 87.72
#5 -- 89.90
#6 -- 10.20
#7 -- 75.89
#8 -- 79.96
#9 - 105.25
#10 - 77.95
#11 - 73.84
#12 - 69.82

How do I get from those ratings to a calculated/projected win probability/percentage? I have always thought that you just divide each rating by the sum of all the ratings, but can't seem to find anything related to this type of calculation on the web, and I haven't tried to create a line in a long time, so I'm a bit rusty.
I have been trying to devise a method using standard deviations, but still have some snags. If you did probabilities by hand (from your numbers) what would you get? I guesstimated a probability line, but would have to see a few thousand races (with results) to see correlation with your numbers and results -> what percent of time would 2,3, or 9 win? Being able to go to the tenths would really help in lower odds horses...

#1 -- 72.65 -- 2
#2 -- 101.06 - 16
#3 -- 104.32 - 20
#4 -- 87.72 -- 9
#5 -- 89.90 -- 10
#6 -- 10.20 -- 1
#7 -- 75.89 -- 4
#8 -- 79.96 -- 6
#9 - 105.25 -- 22
#10 - 77.95 -- 5
#11 - 73.84 -- 3
#12 - 69.82 -- 2
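The standard-deviation method mentioned here is not spelled out, so the following is not that method. It is a sketch of one common non-linear alternative: convert each rating to a z-score, exponentiate, and normalize (a softmax). The steepness parameter `k` is a hypothetical knob that would have to be fitted against a large sample of results.

```python
import math
from statistics import mean, pstdev

# Ratings as posted in the thread (program number -> final rating).
ratings = {
    1: 72.65, 2: 101.06, 3: 104.32, 4: 87.72, 5: 89.90, 6: 10.20,
    7: 75.89, 8: 79.96, 9: 105.25, 10: 77.95, 11: 73.84, 12: 69.82,
}

def softmax_line(ratings, k=1.0):
    """Turn ratings into probabilities via exp(k * z-score), normalized to 1."""
    mu = mean(ratings.values())
    sigma = pstdev(ratings.values())
    weights = {h: math.exp(k * (r - mu) / sigma) for h, r in ratings.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

line = softmax_line(ratings)
for horse, p in sorted(line.items(), key=lambda item: -item[1]):
    print(f"#{horse:<2} {p:.3f}")
```

A larger `k` stretches the gap between the top horses and the rest; a smaller one flattens the line back toward the plain divide-by-sum result. Note that the outlier rating for #6 inflates the standard deviation, which compresses the z-scores of the other eleven horses.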
Old 05-23-2016, 11:51 AM   #14
raybo
EXCEL with SUPERFECTAS
 
 
Join Date: Mar 2004
Posts: 10,206
Quote:
Originally Posted by davew
I have been trying to devise a method using standard deviations, but still have some snags. If you did probabilities by hand (from your numbers) what would you get? I guesstimated a probability line, but would have to see a few thousand races (with results) to see correlation with your numbers and results -> what percent of time would 2,3, or 9 win? Being able to go to the tenths would really help in lower odds horses...

#1 -- 72.65 -- 2
#2 -- 101.06 - 16
#3 -- 104.32 - 20
#4 -- 87.72 -- 9
#5 -- 89.90 -- 10
#6 -- 10.20 -- 1
#7 -- 75.89 -- 4
#8 -- 79.96 -- 6
#9 - 105.25 -- 22
#10 - 77.95 -- 5
#11 - 73.84 -- 3
#12 - 69.82 -- 2
Obviously, at least to me, if the ratings are fairly accurate, then the top 3 horses should win about 60% of the time, over time. Your line has them at 58%, that's very close. Based on that, how did you differentiate between those 3 horses to get their portion of the 58% total probability? It appears that the relationships are not linear, but I haven't a clue as to how to come up with that non-linear scale/slope.

This set of numbers seems promising, regarding the separation/relationship between the 3 ratings. I found the average of the 3 ratings and then subtracted that average from each rating.

#2 _ -2.483333333
#3 _ 0.776666667
#9 _ 1.706666667

Based on those numbers, these would be the probability percentages:

#2 _ 16.85
#3 _ 20.11
#9 _ 21.04
21.04

Last edited by raybo; 05-23-2016 at 12:00 PM.
Old 05-23-2016, 05:23 PM   #15
lansdale
Registered User
 
Join Date: Jan 2006
Posts: 1,506
?

Quote:
Originally Posted by raybo
Ok you stats gurus, if I have a rating derived from multiple factors, each weighted according to significance, how do I get from that final rating to a projected win probability/percentage?

Let's start with the following final ratings for a race:

#1 -- 72.65
#2 -- 101.06
#3 -- 104.32
#4 -- 87.72
#5 -- 89.90
#6 -- 10.20
#7 -- 75.89
#8 -- 79.96
#9 - 105.25
#10 - 77.95
#11 - 73.84
#12 - 69.82

How do I get from those ratings to a calculated/projected win probability/percentage? I have always thought that you just divide each rating by the sum of all the ratings, but can't seem to find anything related to this type of calculation on the web, and I haven't tried to create a line in a long time, so I'm a bit rusty.
Hi Raybo,

I'm a little confused from what you've said in this thread whether this is a list of variable weights or a list of horses whose output is projected according to such weights. I'm guessing the latter.

If so, what the range of this data seems to resemble, since you've mentioned that your method is in the black, is the $net of a given field based on a few simple factors, which might explain the clustering. Also, since you've mentioned 'top 3' ranking as a part of your method, possibly you're penalizing horses who fall out of this grouping, which would be consistent with this result. Since your description of your method implies that this is what you have sought to maximize, it would seem to make sense.

If it's not possible that this is what you've done, you already know this. But if it is, I would suggest just moving the decimal point two figures to the left and testing this against a database (you mentioned you're a client of J. Platt), and see how it stands up against a reasonably large sample. BTW, since the mean of even this small sample is 79, which would mean a return of .79 vs. all horses, which is quite close to what I believe is the mean return of all horses by the betting public, this may be quite accurate.
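The "move the decimal point" reading can be checked quickly. The sketch below treats each rating divided by 100 as a return per dollar, which is only the interpretation suggested in this post, not something confirmed about the ratings.

```python
# Ratings as posted in the thread (program number -> final rating).
ratings = {
    1: 72.65, 2: 101.06, 3: 104.32, 4: 87.72, 5: 89.90, 6: 10.20,
    7: 75.89, 8: 79.96, 9: 105.25, 10: 77.95, 11: 73.84, 12: 69.82,
}

# Shift the decimal two places left: 79.96 is read as a 0.7996 return per $1.
returns = {horse: r / 100.0 for horse, r in ratings.items()}
mean_return = sum(returns.values()) / len(returns)

print(f"mean return per $1: {mean_return:.2f}")
```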

Cheers,

lansdale