View Full Version : Race structure analysis


cashmachine
05-27-2014, 07:17 PM
I noticed that my program is much more successful in one particular type of race: when there are exactly 3 or 4 approximately equal horses and the rest of the field is much weaker (field size is 12-14 horses). My program either doesn't bet or loses badly when there are only 1 or 2 strong horses and the rest are much weaker; it is also unsuccessful when there are more than 4 strong horses.

That led me to believe that the efficiency of the public odds (in the sense of how close they are to the real win probability) varies with the composition of the race, in other words with race structure. I cannot yet give a good definition of "race structure", but approximately I mean something like this: if there is one very strong horse and two good horses (but significantly weaker than the champion), then it is a "1-2" race; if there are 3 very strong horses and one good (but not superior) horse, then it is a "3-1" race. I hope you get the idea.

I am thinking about a separate analysis for every type of race, but I cannot yet give a precise definition of "race structure", so I can't program it. I am curious whether anybody else is thinking along these lines. How exactly would you define race structure? Has anybody done any research on this topic? Any ideas would be much appreciated.

whodoyoulike
05-27-2014, 07:55 PM
This is a serious response. Where (which tracks) do you regularly find field sizes of 12 - 14 horses?

Thanks.

cashmachine
05-27-2014, 07:59 PM
I bet in Hong Kong; 95% of races there have at least 12 horses.

PhantomOnTour
05-27-2014, 08:22 PM
If I could get "American style" pp's for Hong Kong I would take a long look at that place.


Your post is interesting and may have more to do with public confusion than anything else.
I have never thought of classifying races by betting structure or by the strength/number of favorites.

classhandicapper
05-27-2014, 08:26 PM
That's pretty interesting, because my favorite situation is when there are 1-3 horses that are way above the rest and it's a large field. Some of the deadest horses you are ever going to see still take money in fields like that. Cumulatively they can take quite a lot. So the more the merrier. But if the field is very contentious and deep, I'm often just as confused as everyone else about where the value actually is.

Dark Horse
05-27-2014, 08:26 PM
How big is the sample size for this observation and how far back does it go?

cashmachine
05-27-2014, 08:35 PM
If I could get "American style" pp's for Hong Kong I would take a long look at that place.


Here is a typical race card from my source:
http://racinghk.com/premium/formguide/2014-05-28/#7
If you want to see more lines of past runs, click "Show more runs" in the top right corner. This is what a race card from a HK provider looks like, but I know some people somehow get race cards from American providers that look familiar to US-based bettors.

cashmachine
05-27-2014, 08:36 PM
How big is the sample size for this observation and how far back does it go?
I have been betting for about 1.5 years and made about 700 bets during that time.

dannyhill
05-27-2014, 08:37 PM
Because turf racing produces so many horses that had trip issues in their previous races, they are often overlooked on the tote board. Their speed figures are low and they finished well back, so the public often tosses them.

Dark Horse
05-27-2014, 08:38 PM
And what, approximately, is the percentage of races that behave as you describe? Is it the majority? Does it come and go, or is it relatively constant?

I'm just trying to get an initial feel.

PhantomOnTour
05-27-2014, 08:41 PM
Here is a typical race card from my source:
http://racinghk.com/premium/formguide/2014-05-28/#7
If you want to get more lines of past runs, click "Show more runs" in the top right corner. This is how race card from HK provider looks like, but I know some people get somewhere race cards from American providers that looks familiar to USA located bettors.
Thx - those pp's are nice

cashmachine
05-27-2014, 08:49 PM
And what, approximately, is the percentage of races that behave as you describe? Is it the majority? Does it come and go, or is it relatively constant?

I'm just trying to get an initial feel.

I don't know the percentages because I can't program it; I can't yet define it. But I instantly recognize the situation simply by looking at the current odds. It happens often enough that I got conditioned like a laboratory rat: every time I see that composition of odds I start to have a good feeling :). Lol :). The situation happens approximately twice per racing day (out of 8 or 10 races). I also noticed that in such a situation my program makes multiple bets in the quinella place pool (which is unusual); in other words, my program thinks that the QPL odds are very inefficient and there are multiple overlays.

whodoyoulike
05-27-2014, 08:54 PM
Here is a typical race card from my source:
http://racinghk.com/premium/formguide/2014-05-28/#7
If you want to get more lines of past runs, click "Show more runs" in the top right corner. This is how race card from HK provider looks like, but I know some people get somewhere race cards from American providers that looks familiar to USA located bettors.

Thanks for the link. I've been curious what their PP looked like. Do you have the charts for the above referenced race?

Thanks.

Maximillion
05-27-2014, 08:54 PM
I tend to do better in races like this too; the problem is there are not enough of them. That's why I prefer to look at many tracks, even if it means I sacrifice having any kind of intimate knowledge of the horses and connections.

cashmachine
05-27-2014, 09:04 PM
Do you have the charts for the above referenced race?

I am not sure what you mean by "charts". Try clicking "PRE-RACE" on the black stripe (command menu) at the top of the page; there they have some info like trainer-jockey stats.

whodoyoulike
05-27-2014, 09:39 PM
Do they have race charts similar to the ones which recap how a race was run by each horse in the race for the PP you provided in the link?

Something similar to this link.


http://www.brisnet.com/cgi-bin/instant.cgi?date=2014-05-26&track=CD&country=USA&race=10&type=inc&print=on

cashmachine
05-27-2014, 09:45 PM
It looks like you want sectional time and position, like:
http://www.hkjc.com/english/racing/display_sectionaltime.asp?racedate=25/05/2014&Raceno=8&All=0.
For the race card I linked above, this information is not available yet, because it is a FUTURE race.

cashmachine
05-27-2014, 09:50 PM
In general, info for past races is available for free on the HK Jockey Club website: http://racing.hkjc.com/racing/Info/Meeting/Results/English/Local/20140525/st/1
They have info for all races going back years; you just need to change the date, track and race number in the URL to get whatever you want.
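
A minimal sketch of building such a URL, assuming the pattern stays exactly as in the link above (date as YYYYMMDD, then track code, then race number). The helper name `hkjc_results_url` is made up for illustration, "st" is the only track code that appears in this thread, and the site layout may have changed since 2014.

```python
from datetime import date

# Base path copied from the results link quoted in the post.
BASE = "http://racing.hkjc.com/racing/Info/Meeting/Results/English/Local"

def hkjc_results_url(race_date, track, race_no):
    """Build a results URL from a date, a track code and a race number.

    Hypothetical helper: it only fills in the three parts of the URL
    that the post says can be changed.
    """
    return f"{BASE}/{race_date:%Y%m%d}/{track}/{race_no}"

# Reproduces the link from the post (25 May 2014, track "st", race 1).
print(hkjc_results_url(date(2014, 5, 25), "st", 1))
```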

Robert Fischer
05-27-2014, 10:26 PM
You have (at least) two major dynamics going on in your different "race structures".

Distribution of the pool money.
and
Distribution of the quality of the contenders.

I noticed that my program much more successful in one particular type of race: when there are exactly 3 or 4 approximately equal horses and the rest of horses is much weaker (field size is 12-14 horses). My program either doesn't bet or loses badly when there are only 1 or 2 strong horses and rest is much weaker; it is also not successful when there are more than 4 strong horses.
One (perhaps oversimplified) hypothesis for this phenomenon would be that in fields where your program identified either relatively few or many contenders, the selections in the former were too "chalky", while the latter indicated a lack of sufficient information with which to separate contenders.

Another very important factor would be the actual "race structure" as opposed to your program. The public odds could give an indication. If the public odds varied wildly from your program's contender selection (and it wasn't producing overlays), then it could simply be a function of your program's contenders best matching the "set" of public contenders within race structures of 3-4 contenders.


I am thinking about separate analysis for every type of race but currently cannot yet give precise definition of "race structure" so I can't program it. I am curious whether somebody is also thinking along these lines? How exactly would you define race structure? Did anybody do any research on this topic? Any ideas would be much appreciated.
Interesting stuff so far. :ThmbUp:

cashmachine
05-27-2014, 10:39 PM
How about estimating the quality of the contenders from the current odds? Of course odds are not perfect, but on average they are very close to the real win probability. I don't think it is practical to separate these two things; otherwise we will have no means of estimating the "quality of the contenders".

Magister Ludi
05-27-2014, 11:16 PM
I am thinking about separate analysis for every type of race but currently cannot yet give precise definition of "race structure" so I can't program it. I am curious whether somebody is also thinking along these lines? How exactly would you define race structure? Did anybody do any research on this topic? Any ideas would be much appreciated.

I believe that your "race structure" may be defined by a statistical classification of races according to the subjective probability distribution of each horse's winning chances and a comparison of the mathematical expectation with previous data. This is known as race entropy.

Robert Fischer
05-28-2014, 07:26 AM
How about estimating the quality of the contenders by the current odds?

Using the current odds is a good way to do it.

Robert Fischer
05-28-2014, 07:36 AM
I believe that your "race structure" may be defined by a statistical classification of races according to the subjective probability distribution of each horse's winning chances and a comparison of the mathematical expectation with previous data. This is known as race entropy.

Would this be similar to making pars for Race Structure types?

I'm interested in more about your idea of looking at it as a measure of entropy.

Magister Ludi
05-28-2014, 05:45 PM
Would this be similar to making pars for Race Structure types?

No.

I'm interested in more about your idea of looking at it as a measure of entropy.

Entropy is disorder in a closed system. It measures the competitiveness of a race. A maximum entropy race would be a race where all of the horses have the same probability of winning. A minimum entropy race would be a uniform distribution of probabilities over the interval 0 to 1.

One way to quantify race entropy is as follows:

Sum{1/[O(i)+1]^2}/N

where O(i) = odds of the ith horse
N = number of entries

The smaller the value, the more competitive is the race. Dr. Beav was kind enough to share this formula on this forum. My entropy model incorporates numerous factors with the same end as Dr. Beav's formula.
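
A minimal sketch of the formula as written above, treating 1/(O(i)+1) as each horse's implied win probability. The function name and the second set of odds are made up for illustration; they are not from the thread.

```python
def beav_value(odds):
    """Sum{1/[O(i)+1]^2}/N for a list of odds-to-1."""
    if not odds:
        raise ValueError("odds must contain at least one horse")
    # 1 / (odds + 1) is the win probability implied by each horse's odds.
    probs = [1.0 / (o + 1.0) for o in odds]
    return sum(p * p for p in probs) / len(probs)

even_field = [4.0] * 5                          # five horses, all at 4-1
lopsided_field = [0.25, 9.0, 19.0, 19.0, 19.0]  # hypothetical heavy favorite

print(beav_value(even_field))
print(beav_value(lopsided_field))
```

Under these assumptions the evenly matched field gives the smaller value, which matches the statement that smaller values mean a more competitive race.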

Robert Fischer
05-28-2014, 06:43 PM
Very cool.

This makes me want to calculate some races to see what they look like per the formula.

Also interesting that you are using additional factors.

cashmachine
05-29-2014, 12:11 AM
A maximum entropy race would be a race where all of the horses have the same probability of winning. A minimum entropy race would be a uniform distribution of probabilities over the interval 0 to 1.


That's wrong. The maximum of your formula is achieved when all the horses except one have large odds and one horse has very small odds, in other words when there is one extremely strong favorite and the rest are equally weak. Maximum entropy occurs when all of the horses have the same probability of winning.

Hoofless_Wonder
05-29-2014, 12:46 AM
Cashmachine - thanks for the link to those nice PPs. Looks like they have a 14 day trial, and accept Paypal. Only a bit more than a month of racing left in Hong Kong before the summer break, but I'm looking forward to trying these out.

As for the observation on races with 3-4 stronger contenders producing max profits, that sounds like a typical race at Sha Tin. The races there are very formful, and at least half the field can be downgraded due to being off form, bad post, or racing against a bias.

When the remaining contenders are sorted out, it makes sense that there would be more inefficiencies with the larger number that can win. When there are only one or two contenders, the odds reflect it, and there's not much left to shoot for in the win pool - so then the exotics come into play.

It sounds like you have a sophisticated program that can take advantage of this angle - very nice.....

Robert Fischer
05-29-2014, 07:56 AM
That's wrong. Maximum of your formula will be achieved when all the horses (except 1) have large odds and one horse has very small odds - in other words when there is one extremely strong favorite and the rest is equally weak. Maximum entropy will be when all of the horses have the same probability of winning.

I believe ML's statement should be correct.

If all the horses have the same odds (or were all handicapped to have the same probability estimate), - that should be the maximum entropy. Closely matched rivals, high potential for disorder.

If there was a heavy (true)favorite, then that should be the minimum entropy.


Is the formula finding different results?

Sapio
05-29-2014, 09:36 AM
Hi Robert

You, Magister and Cash are all correct. The confusion is that I believe Magister meant to say, "A minimum entropy race would be a non-uniform distribution of probabilities over the interval 0 to 1.". More likely a highly skewed beta distribution.

Thomas Sapio

cashmachine
05-29-2014, 02:48 PM
I believe ML's statement should be correct.

If all the horses have the same odds (or were all handicapped to have the same probability estimate), - that should be the maximum entropy.

Let's do the math together. Assume 5 horses, no track take.

Case 1. Extremely strong favorite: $99996 bet on horse 1, $1 bet on each of the other 4 horses. Odds will be approximately 0.0 for horse 1 and 99999 for each of the other horses. Substituting into the formula: (1/(0.0 + 1)^2 + 4*1/(100000)^2)/5 = (1 + 4 * 0)/5 = 0.2

Case 2. Equal odds: $20000 bet on each horse. Odds will be 4 for every horse. Substituting into the formula: (5/(4+1)^2)/5 = 1/25 = 0.04

As you can see, the formula produces a much bigger result for case 1 (extremely strong favorite). In other words, the formula gives its maximum when there is absolute structure and its minimum when there is complete chaos. So this formula is NOT entropy; it is the mirror opposite of entropy, because entropy is at its maximum when there is complete chaos.
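
The two cases can be checked with a short script. This is a sketch under the same assumptions as the post (5 horses, no track take); the function names are made up for illustration.

```python
def odds_from_pool(bets, take=0.0):
    """Pari-mutuel odds-to-1 for each horse, given the money bet on each."""
    pool = sum(bets) * (1.0 - take)
    return [pool / b - 1.0 for b in bets]

def formula_value(odds):
    """Sum{1/[O(i)+1]^2}/N, the formula under discussion."""
    return sum((1.0 / (o + 1.0)) ** 2 for o in odds) / len(odds)

case1 = odds_from_pool([99996, 1, 1, 1, 1])  # extremely strong favorite
case2 = odds_from_pool([20000] * 5)          # equal odds

print(round(formula_value(case1), 4))  # 0.2
print(round(formula_value(case2), 4))  # 0.04
```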

Robert Fischer
05-29-2014, 03:54 PM
That looks like a good catch, Cashmachine. :ThmbUp:

If the result is in fact a "mirror opposite to entropy", we could simply take the negative.

Case 1. Extremely strong favorite: ...
Substitute into formula: -(1/(0 + 1)^2 + 4*1/(100000)^2)/5 = -0.2

Case 2. Equal odds: ...
Substitute into formula: -(5/(4+1)^2)/5 = -0.04

This seems to remedy our issue, so long as we don't mind having a negative number.



Let me also pass on an alternative formula for Entropy:

One way of approaching this was discussed in the book "The Theory of Gambling and Statistical Logic" by Richard Epstein. He calculates what he calls the "entropy" of the race as follows:

First calculate the value p * log p for all horses in the race, where p = 1 / (odds + 1).

Then sum the values for all horses and take the negative of that.

He uses LOG rather than squaring, and then takes the negative of the SUM , which finally ends up with a positive number as the result.

Something worth looking at.
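
A minimal sketch of the calculation described above (p = 1 / (odds + 1), sum p * log p, take the negative). Two things here are assumptions rather than part of the description: the probabilities are rescaled to sum to 1 so that track take does not distort the result, and a second function divides by log(N), the largest value the entropy can take with N horses, so that fields of different sizes can be compared. The function names and the log base default are also illustrative choices.

```python
import math

def implied_probs(odds):
    """p = 1 / (odds + 1), rescaled to sum to 1 (rescaling is an assumption)."""
    raw = [1.0 / (o + 1.0) for o in odds]
    total = sum(raw)
    return [p / total for p in raw]

def race_entropy(odds, base=2.0):
    """Negative sum of p * log(p) over all horses in the race."""
    return -sum(p * math.log(p, base) for p in implied_probs(odds) if p > 0)

def normalized_entropy(odds, base=2.0):
    """Entropy divided by log(N); 1.0 means all horses are equally likely."""
    n = len(odds)
    return race_entropy(odds, base) / math.log(n, base) if n > 1 else 0.0

print(race_entropy([4.0] * 5))                         # evenly matched field
print(race_entropy([0.25, 9.0, 19.0, 19.0, 19.0]))     # hypothetical heavy favorite
print(normalized_entropy([11.0] * 12))                 # twelve horses at equal odds
```

With this measure the evenly matched field scores higher than the field with a heavy favorite, the opposite direction from the squared-probability formula.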

Sapio
05-29-2014, 04:11 PM
Hi Robert

The "alternative formula for Entropy" will not work unless all fields have the same number of entries. The log is base 2 (binary).

Also, TM's formulation is correct and proper. Read the original thread for more info.

Thomas Sapio

cashmachine
05-29-2014, 04:12 PM
Entropy is a way to numerically express some aspect of the odds line; it sounds interesting and is definitely worth playing with to see what it can be used for. However, it won't help me solve the problem of classifying races by structure. You see, the problem is that I want to capture the composition of the race, something that is derived from the odds but at a more abstract level.

Let's say we have a situation where there is one strong favorite and two good horses, and the others are much worse. Their odds also depend on the number of horses in the race and on how much worse the rest of the field is. If the rest of the field is at 30 to 1, then the odds of the three best horses will be somewhere around 8 or 10 to 1; if the rest of the field is at 300 to 1, then the three best horses will be around 2 or 3 to 1. From the structure point of view, this is the same type of race, but the numerical entropy values will be very different in these two scenarios. This is exactly why I can't program it: I want to group such situations together, no matter what the exact odds are.
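
One possible way to get a label that ignores the exact odds is to look only at the gaps between horses. The sketch below is a hypothetical approach, not something proposed in this thread: it sorts the implied probabilities and starts a new tier wherever the probability drops by more than a chosen ratio. The function name, the `gap_ratio` threshold of 2.0 and the example odds are all assumptions.

```python
def structure_label(odds, gap_ratio=2.0):
    """Return tier sizes such as "1-2-9" from a list of odds-to-1.

    Horses are sorted from strongest to weakest by implied probability
    p = 1 / (odds + 1). A new tier starts whenever a horse is less likely
    than the previous one by more than a factor of gap_ratio.
    """
    probs = sorted((1.0 / (o + 1.0) for o in odds), reverse=True)
    tiers = [1]
    for prev, cur in zip(probs, probs[1:]):
        if prev / cur > gap_ratio:
            tiers.append(1)      # big drop: open a new tier
        else:
            tiers[-1] += 1       # similar strength: same tier
    return "-".join(str(n) for n in tiers)

# Two hypothetical 12-horse fields with different odds levels.
print(structure_label([1.5, 6, 6] + [40] * 9))   # 1-2-9
print(structure_label([4, 12, 12] + [30] * 9))   # 1-2-9
```

Both fields get the same label even though their odds differ a lot. A known weakness of this simple rule is that a field whose odds decline gradually never triggers a split, so the threshold would need tuning against real data.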

Robert Fischer
05-29-2014, 04:40 PM
Also, TMs formulation is correct and proper. Read the original thread for more info.

Thomas Sapio

Thanks Thomas Sapio

In the original thread, Trifecta Mike in fact states that "The larger the value, the more uncompetitive the race."


That is a good thread
http://www.paceadvantage.com/forum/showthread.php?t=87050&page=1&pp=15

cashmachine
05-29-2014, 09:17 PM
That is a good thread
http://www.paceadvantage.com/forum/showthread.php?t=87050&page=1&pp=15

I just read it and found a very interesting idea: use this entropy measure to adjust the results of past races. For example, in computing a jockey's win percentage, we might weight races according to their "competitiveness" as measured by the entropy. On the other hand, people who said that there are other things (like margins and finish times) that measure the real competitiveness of a race better than the odds line also make a good point.
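
A rough sketch of the weighting idea, under several assumptions that are not from either thread: each race is weighted by its entropy divided by log(N), probabilities are rescaled to sum to 1, and the weighted win rate is simply weighted wins over total weight. The function names and sample rides are made up.

```python
import math

def normalized_entropy(odds):
    """Entropy of the odds line divided by log(N); 1.0 = fully even field."""
    raw = [1.0 / (o + 1.0) for o in odds]
    total = sum(raw)
    probs = [p / total for p in raw]
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def weighted_win_rate(rides):
    """rides: list of (won, field_odds) pairs for one jockey.

    Each race counts in proportion to its competitiveness, so a win in an
    even field counts for more than a win in a lopsided one.
    """
    num = den = 0.0
    for won, field_odds in rides:
        w = normalized_entropy(field_odds)
        num += w if won else 0.0
        den += w
    return num / den if den else 0.0

rides = [
    (True, [0.25, 9.0, 19.0, 19.0, 19.0]),  # win in a lopsided field
    (False, [4.0] * 5),                     # loss in an even field
]
print(weighted_win_rate(rides))  # below the raw 50% win rate
```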