Math Question [Archive] - Horse Racing Forum - PaceAdvantage.Com

JJMartin

11-16-2013, 02:57 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

OTM Al

11-16-2013, 03:01 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

How about square the number of wins and divide by starts

Ocala Mike

11-16-2013, 05:45 PM

I don't see it as a math problem at all. Without quantifying the "value" of the wins, how do we know horse B has actually "accomplished" more than horse A?

Even if all the wins are of equal "value," it's still a judgment call, i.e., is a horse with 7 starts and 4 wins more "accomplished" than a horse with 3 starts and 2 wins or not?

Some things can't be reduced to math; IMHO this is one of them.

JJMartin

11-16-2013, 06:35 PM

I don't see it as a math problem at all. Without quantifying the "value" of the wins, how do we know horse B has actually "accomplished" more than horse A?

Even if all the wins are of equal "value," it's still a judgment call, i.e., is a horse with 7 starts and 4 wins more "accomplished" than a horse with 3 starts and 2 wins or not?

Some things can't be reduced to math; IMHO this is one of them.

I came up with something that I believe is suitable:

(x wins/x starts)*(x wins/100)

To pose this to the earlier example:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

(4/4)*(4/100)= 0.04
(9/10)*(9/100)=0.081

In this case Horse B (the 9 of 10 horse) has the higher rating.

Another example:

Horse C 7/10
Horse D 5/5

(7/10)*(7/100)=0.049
(5/5)*(5/100)=.05

Now the rating is slightly higher on the 5/5 horse, the opposite of the first example.
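For anyone who wants to try the rating outside of Excel, here is a minimal Python sketch of the formula above. The function name `win_rating` and the `records` table are made up for illustration; only the formula and the four records come from the post. Algebraically the rating is just wins squared divided by starts, scaled by 1/100, so it ranks horses the same way as the "square the wins and divide by starts" suggestion earlier in the thread.

```python
def win_rating(wins, starts):
    """Rating from the post: (wins / starts) * (wins / 100)."""
    if starts <= 0:
        raise ValueError("starts must be positive")
    return (wins / starts) * (wins / 100)


# Lifetime records (wins, starts) for the four example horses.
records = {"A": (4, 4), "B": (9, 10), "C": (7, 10), "D": (5, 5)}

ratings = {name: win_rating(w, s) for name, (w, s) in records.items()}

# Print the horses from highest rating to lowest.
for name, rating in sorted(ratings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"Horse {name}: {rating:.3f}")
```

Note that the rating grows with the raw number of wins, so a long career is rewarded even when the win percentage is lower.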

Some_One

11-16-2013, 06:38 PM

So if horse C was getting its record in group races while horse d was doing it in 10k claimers at the Mountain, you're going to bet horse d? well at least you'll get great odds.

JJMartin

11-16-2013, 06:41 PM

So if horse C was getting its record in group races while horse d was doing it in 10k claimers at the Mountain, you're going to bet horse d? well at least you'll get great odds.

Not at all, this is for horses that have to meet other criteria first but good observation.

Ocala Mike

11-16-2013, 06:51 PM

Handicapping is more art than science, for me. Obviously, you are working on some grand "unified theory" to use as a selection method. People have been grappling with this for centuries, but it's not for me.

Reducing every entrant to a figure has never been my cup of tea, but I guess that's what makes horse racing. Good luck in your pursuit.

JJMartin

11-16-2013, 06:55 PM

Handicapping is more art than science, for me. Obviously, you are working on some grand "unified theory" to use as a selection method. People have been grappling with this for centuries, but it's not for me.

Reducing every entrant to a figure has never been my cup of tea, but I guess that's what makes horse racing. Good luck in your pursuit.

Thanks, I love reducing horses to figures :)

Clocker

11-16-2013, 08:19 PM

Now the rating is slightly higher on the 5/5 horse, the opposite of the first example.

But does that weighting reflect anything? If your weights were based on claiming price or purse value or some other criteria, that could provide some additional information.

In your example of A vs B and C vs D, I don't see that the results tell you anything.

JJMartin

11-16-2013, 08:41 PM

But does that weighting reflect anything? If your weights were based on claiming price or purse value or some other criteria, that could provide some additional information.

In your example of A vs B and C vs D, I don't see that the results tell you anything.

There are additional criteria in place, I use Excel to process races with my own formulas. It tells me quite a bit actually.

Ocala Mike

11-16-2013, 08:56 PM

In your example of A vs B and C vs D, I don't see that the results tell you anything.

What I was trying to say; well put!

misscashalot

11-16-2013, 11:21 PM

compare amount of average $ banked per start for each runner
it's an old capping factor

Some_One

11-16-2013, 11:25 PM

Not at all, this is for horses that have to meet other criteria first but good observation.

The more criteria you add, the more the result will align with public consensus.

JJMartin

11-16-2013, 11:28 PM

The more criteria you add, the more the result will align with public consensus.

Except when it doesn't.

JustRalph

11-16-2013, 11:31 PM

The more criteria you add, the more the result will align with public consensus.

2nd the motion

JustRalph

11-16-2013, 11:32 PM

Except when it doesn't.

And how often do you think that will happen?

Just a ball park?

JJMartin

11-17-2013, 12:28 AM

I have my own algorithm, so my criteria are not necessarily the common public ones.

shouldacoulda

11-17-2013, 07:42 AM

The only type of race I found this angle useful in is non optional claiming races. Particularly the cheaper ones. It's a guide, not an end all be all. If you use this approach on allowance races you will get eaten alive. Ask me how I know.

classhandicapper

11-17-2013, 12:49 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

You could set a "minimum number of starts".

Let's call it 5.

4 of 4 would then become 4 of 5.

9 of 10 would be better than 4 of 5.

But if the horse was 5 of 5 that would be better than 9 of 10.

3 of 3 vs. 7 of 10 would become 3 of 5 vs. 7 of 10.

You'd have to tinker a little with various scenarios to see what minimum seems to produce the results you are most comfortable with. You could even make it a variable amount. (and test it if you have the data)

Bill Quirin used something like that in one of his consistency methods.
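A rough Python sketch of the "minimum number of starts" idea, in case it helps with the tinkering. The name `padded_win_rate` and the default of 5 are only illustrative; the minimum is the knob to experiment with, and this is not claimed to be Quirin's actual method.

```python
def padded_win_rate(wins, starts, min_starts=5):
    """Win rate where short careers are treated as having min_starts starts.

    A horse with fewer than min_starts races is charged with the
    missing starts as losses, so 4 of 4 becomes 4 of 5.
    """
    if starts <= 0:
        raise ValueError("starts must be positive")
    return wins / max(starts, min_starts)


# The scenarios from the post, with a minimum of 5 starts.
for wins, starts in [(4, 4), (9, 10), (5, 5), (3, 3), (7, 10)]:
    print(f"{wins} of {starts}: {padded_win_rate(wins, starts):.2f}")
```

Making `min_starts` a parameter lets you rerun the same records with different minimums and see where the rankings flip.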

Capper Al

11-17-2013, 01:16 PM

I'd factor either Average Purse Value (APV) or Earnings per Start (EPS) into this somehow. Definitely, all other things being equal, Horse B has more experience and extended its winning streak far beyond breaking its maiden. Horse B in most cases should be preferred. Look at Dave's Percentages and Probabilities 2012 under Money Box for more ideas.

Some_One

11-17-2013, 06:01 PM

I'd factor either Average Purse Value (APV) or Earnings per Start (EPS) into this somehow. Definitely, all other things being equal, Horse B has more experience and extended its winning streak far beyond breaking its maiden. Horse B in most cases should be preferred. Look at Dave's Percentages and Probabilities 2012 under Money Box for more ideas.

I agree, but the public knows this for the most part, won't give you an edge.

Dark Horse

11-18-2013, 02:22 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

You would use a Z-score: (wins-losses)/SQRT(sample size). This will give you the number of standard deviations by which the record beats an even split of wins and losses.

Horse A: 4/SQRT4 = 2.00
Horse B: 8/SQRT10 = 2.53

Horse B, as you intuitively knew, comes out with the better number. Naturally, you would have to factor in the odds as well, but that wasn't part of the question.
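Here is the same calculation as a small Python sketch, for checking other records quickly. The function name `z_score` is just for illustration. The baseline assumed here is a coin-flip record (wins equal to losses), which is what the formula above implies.

```python
import math


def z_score(wins, starts):
    """(wins - losses) / sqrt(starts), measured against an even win/loss split."""
    if starts <= 0:
        raise ValueError("starts must be positive")
    losses = starts - wins
    return (wins - losses) / math.sqrt(starts)


print(f"Horse A: {z_score(4, 4):.2f}")   # 4 wins, 0 losses
print(f"Horse B: {z_score(9, 10):.2f}")  # 9 wins, 1 loss
```

Because the denominator only grows with the square root of the starts, a long winning record is rewarded more than a short perfect one.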

JJMartin

11-18-2013, 02:50 PM

You would use a Z-score: (wins-losses)/SQRT(sample size). This will give you the number of standard deviations by which the record beats an even split of wins and losses.

Horse A: 4/SQRT4 = 2.00
Horse B: 8/SQRT10 = 2.53

Horse B, as you intuitively knew, comes out with the better number. Naturally, you would have to factor in the odds as well, but that wasn't part of the question.

I'll run it through the spreadsheet...

DeltaLover

11-18-2013, 04:36 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

Try this:
normalize wins (http://www.themindofagambler.com/normalize_wins.xlsx)

If you need to put more / less emphasis on the winning percent you can easily do it by changing the underlined macros.

I created the document in excel compatible format using OO so I think you will not have trouble opening it with either one.

classhandicapper

11-18-2013, 05:58 PM

Try this:
normalize wins (http://www.themindofagambler.com/normalize_wins.xlsx)

If you need to put more / less emphasis on the winning percent you can easily do it by changing the underlined macros.

I created the document in excel compatible format using OO so I think you will not have trouble opening it with either one.

What does column "D" mean?

DeltaLover

11-18-2013, 07:00 PM

What does column "D" mean?

D represents an intermediate linear transformation of the win - total vector.
It follows this rule:

D = x1 * C + x2 * B

In this particular case I use:

x1 = 1
x2 = -2

Following this normalization algorithm your unknowns become x1, x2 and the last 1 (x3) that I am using for column E

So, based on a given normalization function which uses the normal weight, we can optimize x1, x2, x3 to the fittest values.

The vector [1, -2, 1] is an approximation to what was given to me by a genetic algorithm using normal weight as one of two parameters (the other was a derivative of BPP), focusing on maximizing winning frequency (which is probably not the best fitness function for betting purposes).

Ocala Mike

11-18-2013, 08:20 PM

Are you guys sending horses into space on rockets or handicapping horse races?
All that's missing now is a Venn diagram.

JohnGalt1

11-18-2013, 08:39 PM

You didn't mention class, so let's say all races were open $25 claimers other than their maiden wins.

Let me throw in field sizes.

Horse A was 4/4 in 6 horse fields

Horse B was 9/10 in 12 horse fields.

Who beat more horses?

Horse A beat 20 horses or 5 per race.

Horse B beat 99 horses, if in his defeat he finished last, or an average of 9.9 per race. If it finished better than 12th in the defeat it would have beaten even more horses and his average of beating 9.9 would be even greater.

If horse A was in 12 horse fields and horse B was in 6 horse fields, horse A would have beaten 44 horses, an average of 11 per race, and horse B would've beaten 45 horses, an average of 4.5, if its loss was a last place finish.
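The field-size arithmetic can be sketched in a few lines of Python. The function `horses_beaten` and its `beaten_in_losses` parameter are hypothetical names, and the sketch assumes every race had the same field size. A win beats every other horse in the field; the default treats each loss as a last place finish. For the swapped case this arithmetic gives 9 x 5 = 45 horses for horse B in 6 horse fields, or 4.5 per race.

```python
def horses_beaten(wins, starts, field_size, beaten_in_losses=0):
    """Return (total horses beaten, average beaten per start).

    Each win beats field_size - 1 rivals. beaten_in_losses is the total
    number of rivals beaten across all losing races (0 means the horse
    finished last every time it lost).
    """
    if starts <= 0 or field_size < 2:
        raise ValueError("need at least one start and a field of two")
    total = wins * (field_size - 1) + beaten_in_losses
    return total, total / starts


print(horses_beaten(4, 4, 6))    # horse A in 6 horse fields
print(horses_beaten(9, 10, 12))  # horse B in 12 horse fields
print(horses_beaten(4, 4, 12))   # horse A in 12 horse fields
print(horses_beaten(9, 10, 6))   # horse B in 6 horse fields
```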

JustRalph

11-18-2013, 09:30 PM

Are you guys sending horses into space on rockets or handicapping horse races?
All that's missing now is a Venn diagram.

Reading my mind :lol:

Ocala Mike

11-18-2013, 09:41 PM

Uh, oh, JustRalph and I are on the same page. Don't know about him, but I'm gonna watch Panthers/Patriots. Go, Cam!

formula_2002

11-18-2013, 09:48 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

4/4= 100
9/10= 90
100+90 =190
100/190= .5263 horse A
90/190= .4737 horse B

.5263+.4737= 1.00
all normalized relationships sum to 1.00 (100%)

horse B can not be of greater weight than horse A
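For reference, the normalization above can be sketched in Python as follows. The helper name `normalize` is illustrative. It simply divides each horse's win rate by the sum of all win rates so that the shares add up to 1, which also shows why this approach can never rank the 9/10 horse above the 4/4 horse: it preserves the order of the raw win percentages.

```python
def normalize(values):
    """Scale a dict of non-negative scores so that they sum to 1."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive number")
    return {name: score / total for name, score in values.items()}


win_rates = {"A": 4 / 4, "B": 9 / 10}
shares = normalize(win_rates)

for name, share in shares.items():
    print(f"Horse {name}: {share:.4f}")
print(f"Total: {sum(shares.values()):.4f}")
```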

Actor

11-18-2013, 11:39 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

The question is what factors do you consider in establishing Horse B's difficulty in obtaining such a record? The solution lies in the statistical method called regression. Get yourself a good statistics textbook and study up.

iceknight

11-18-2013, 11:46 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

Which race did Horse B lose: the 1st start, the 5th start, or the last race? I am surprised no one has asked this yet. As long as Horse B is either running at his class level (or running at the absolute top of his class) and winning, you can only predict wins. But if any conditions change, then the "outlier" matters a lot more, especially if you have to allot your money to one of the two, even if they don't run in the same race. This assumes that they have an equal amount of rest and that all other conditions are the same. Oh, but considering the game is quite crooked, you are better off finding who "They" are going to bet on :lol:

classhandicapper

11-19-2013, 12:09 PM

I think it's understood that an overall assessment of the horse has to include the quality of competition the horse raced against. It probably should also include winning margins. For example if the 4-4 horse won all 4 races by 10 lengths, that might give you a different impression than if he won all 4 by a length or 2. I think he's just trying to isolate the consistency portion of this and assuming all else is equal for this exercise.

JJMartin

11-19-2013, 12:41 PM

D represents an intermediate linear transformation of the win - total vector.
It follows this rule:

D = x1 * C + x2 * B

In this particular case I use:

x1 = 1
x2 = -2

Following this normalization algorithm your unknowns become x1, x2 and the last 1 (x3) that I am using for column E

So, based on a given normalization function which uses the normal weight, we can optimize x1, x2, x3 to the fittest values.

The vector [1, -2, 1] is an approximation to what was given to me by a genetic algorithm using normal weight as one of two parameters (the other was a derivative of BPP), focusing on maximizing winning frequency (which is probably not the best fitness function for betting purposes).

In cell D10 you have =MIN(D5:D8), why is D2:D4 omitted?

Capper Al

11-19-2013, 12:43 PM

You would use a Z-score: (wins-losses)/SQRT(sample size). This will give you the number of standard deviations by which the record beats an even split of wins and losses.

Horse A: 4/SQRT4 = 2.00
Horse B: 8/SQRT10 = 2.53

Horse B, as you intuitively knew, comes out with the better number. Naturally, you would have to factor in the odds as well, but that wasn't part of the question.

Looks interesting. I'll have to study it.

Thanks

DeltaLover

11-19-2013, 12:46 PM

I think it's understood that an overall assessment of the horse has to include the quality of competition the horse raced against. It probably should also include winning margins. For example if the 4-4 horse won all 4 races by 10 lengths, that might give you a different impression than if he won all 4 by a length or 2. I think he's just trying to isolate the consistency portion of this and assuming all else is equal for this exercise.

It is better to form a null hypothesis and try to reject it than to try to understand what is the best assessment of a specific factor. Any scenario that you can express as a function of objective factors can and should be evaluated and either found significant or not, regardless of how logical it sounds from a traditional empirical handicapping approach. I am also very reluctant to make any judgment call based on an impression. Ideally all of the components of a handicapping system need to be completely quantitative and eventually automated.

DeltaLover

11-19-2013, 12:48 PM

In cell D10 you have =MIN(D5:D8), why is D2:D4 omitted?

This is a bug. It should be D2. It happened while I was copy/pasting the cells around.

thaskalos

11-19-2013, 12:48 PM

How would you create a math formula that would normalize the following:

Horse A life record wins is 4/4
Horse B life record wins is 9/10

At face value, Horse A won 100% of its races while B won 90%, however Horse B is more accomplished considering the difficulty of obtaining such a record. Any ideas on a mathematical solution that would give more weight to Horse B?

The fact that the 9/10 record is usually more difficult to obtain does not necessarily prove that horse B accomplished the tougher task.

DeltaLover

11-19-2013, 12:50 PM

The fact that the 9/10 record is usually more difficult to obtain does not necessarily prove that horse B accomplished the tougher task.

Still, someone can make the case that a 10/10 is the result of a very conservative campaign. Both examples, though, are very extreme and very rarely occur in the real world.

thaskalos

11-19-2013, 12:55 PM

Still, someone can make the case that a 10/10 is the result of a very conservative campaign. Both examples, though, are very extreme and very rarely occur in the real world.

The horse's won/loss record sits very low on my list of priorities when handicapping a race...unless the race is on the turf.

DeltaLover

11-19-2013, 12:58 PM

The horse's won/loss record sits very low on my list of priorities when handicapping a race...unless the race is on the turf.

Agree, even on the turf. Actually one of my main angles is first time turf.

classhandicapper

11-19-2013, 01:09 PM

It is better to form a null hypothesis and try to reject it than to try to understand what is the best assessment of a specific factor. Any scenario that you can express as a function of objective factors can and should be evaluated and either found significant or not, regardless of how logical it sounds from a traditional empirical handicapping approach. I am also very reluctant to make any judgment call based on an impression. Ideally all of the components of a handicapping system need to be completely quantitative and eventually automated.

"Impression" may have been a poor choice of words.

I didn't want to make an objective statement of certainty because a horse that won by 10 lengths all 4 times could easily have drawn into short weak fields all 4 times and the horse that won by a length or 2 could have drawn into a series of monster fields, which in turn would mean something else.

The point I was making is that I think the original questioner is probably aware that other factors like the quality of competition, margins etc... are a relevant part of any final conclusion about some horse's record, but for now he's looking for a way to measure this single component better and assuming "all else is equal".

classhandicapper

11-19-2013, 01:18 PM

The horse's won/loss record sits very low on my list of priorities when handicapping a race...unless the race is on the turf.

It's not usually a big factor, but IMO it can be decisive in some instances.

I'm less literal about final time figures than some people. Final times to some extent are a function of race development as well as the abilities of the horses.

So when I see a horse with an outstanding record (or really terrible one) I notice. In cases where the horses look very similar on figures, all else being equal, I'll take the horse with the better record all day long. That's even more true on turf where slow paces can depress figures for outstanding horses more often even while they keep winning.

JJMartin

11-19-2013, 01:23 PM

It's not usually a big factor, but IMO it can be decisive in some instances.

I'm less literal about final time figures than some people. Final times to some extent are a function of race development as well as the abilities of the horses.

So when I see a horse with an outstanding record (or really terrible one) I notice. In cases where the horses look very similar on figures, all else being equal, I'll take the horse with the better record all day long. That's even more true on turf where slow paces can depress figures for outstanding horses more often even while they keep winning.

I would agree

thaskalos

11-19-2013, 01:32 PM

It's not usually a big factor, but IMO it can be decisive in some instances.

I'm less literal about final time figures than some people. Final times to some extent are a function of race development as well as the abilities of the horses.

So when I see a horse with an outstanding record (or really terrible one) I notice. In cases where the horses look very similar on figures, all else being equal, I'll take the horse with the better record all day long. That's even more true on turf where slow paces can depress figures for outstanding horses more often even while they keep winning.

I consider the horse's record on the turf...because I consider turf-running ability to be a special sort of talent, which cannot be adequately described by a figure.

But I pay no attention at all to a horse's overall record when I am handicapping for the dirt. I may look at a horse's record within a particular class level...but overall records do nothing for me.

jfdinneen

11-20-2013, 06:23 PM

JJ,

Notwithstanding the valid criticisms relating to the many relevant but unknown factors required for a realistic assessment of form, one could begin with a Bayesian beta-binomial process by regressing the observed performances with peer group pseudocounts [See Regression To Mean (http://www.3-dbaseball.net/2011/08/regression-to-mean-and-beta.html)].
For example, let us assume that both horses are competing at the same class level and that based on a sample of past performances the mean and variance of this peer group are 0.30 and 0.01909 respectively. Using Wolfram Alpha, input [((a/(a+b)) = 0.30), ((ab /((a+b)^2*(a+b+1)))= 0.01909)] generates output a≈3,and b≈7. Using these values, a more accurate measure of both horses true ability is A=(3+4)/(3+7+4)≈39% and B=(3+9)/(3+7+10)≈41%. This process of bayesian updating can be continued using additional information, if necessary [See Market Efficiency (http://forum.sbrforum.com/handicapper-think-tank/573765-market-efficiency-bayesian-probability-estimation-via-beta-distribution.html) ].

John

Capper Al

11-21-2013, 10:24 AM

It is better to form a null hypothesis and try to reject it than to try to understand what is the best assessment of a specific factor. Any scenario that you can express as a function of objective factors can and should be evaluated and either found significant or not, regardless of how logical it sounds from a traditional empirical handicapping approach. I am also very reluctant to make any judgment call based on an impression. Ideally all of the components of a handicapping system need to be completely quantitative and eventually automated.

Totally disagree. A best guess can often get one a selection that no other cappers will venture on, at a good payout.

jfdinneen

11-21-2013, 12:23 PM

JJ,

Note to self: never do arithmetic calculations at 01:30.

Horse A=(3+4)/(3+7+4) = 50%
Horse B=(3+9)/(3+7+10) = 60%

John
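The corrected figures can be reproduced with a short Python sketch, which also avoids the trip to Wolfram Alpha. The function names are made up for illustration. The prior parameters are recovered from the peer group mean and variance with the standard method-of-moments formulas for a beta distribution, then rounded to whole pseudocounts as in the post. The 0.30 and 0.01909 inputs are the assumed peer group values from the example, not real data.

```python
def beta_prior_from_moments(mean, variance):
    """Solve a/(a+b) = mean and ab/((a+b)^2 (a+b+1)) = variance for a, b."""
    if not 0 < mean < 1 or not 0 < variance < mean * (1 - mean):
        raise ValueError("no beta distribution has these moments")
    total = mean * (1 - mean) / variance - 1  # this is a + b
    return mean * total, (1 - mean) * total


def shrunk_win_rate(wins, starts, a, b):
    """Posterior mean win rate after adding the prior pseudocounts."""
    return (a + wins) / (a + b + starts)


# Assumed peer group: mean win rate 0.30, variance 0.01909.
a, b = beta_prior_from_moments(0.30, 0.01909)
a, b = round(a), round(b)  # roughly 3 wins and 7 losses of pseudocounts

print(f"a = {a}, b = {b}")
print(f"Horse A: {shrunk_win_rate(4, 4, a, b):.0%}")
print(f"Horse B: {shrunk_win_rate(9, 10, a, b):.0%}")
```

The prior acts like 10 extra starts at the peer group's win rate, so the 4/4 horse is pulled further toward 30% than the 9/10 horse.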

classhandicapper

11-21-2013, 12:31 PM

But I pay no attention at all to a horse's overall record when I am handicapping for the dirt. I may look at a horse's record within a particular class level...but overall records do nothing for me.

I should clarify.

I look at a horse's overall record in terms of their actual performances and not just wins etc..

But all else being equal, I'll take the consistent "winner" over the consistent "loser" all day long. When a horse wins, you can never tell exactly what was in the tank because sometimes horses just prompt their competitors and only do serious running for 2-3 furlongs. So they are capable of more. You typically just don't find out if they are capable of more until they are actually tested for more. But the consistent winner has at least not proved he's not capable of more like the consistent loser.