PDA

View Full Version : Statistical diagnostics


prank
07-27-2007, 12:19 AM
Hi folks,

I've been absent awhile. For those just tuning in, I'm a grad student working on ranking and predicting rankings. I've made quite a bit of progress lately, and a paper will be forthcoming this fall.

Back to racing: do any of you know of handicapping systems that return a prediction and an associated degree of confidence in the prediction? You could say it's like a classification system: predicting True or False (-1 or 1, A or B), with a probability value for the output. In the real world, this would tell you whether or not to gamble.

In addition to some probability model, of course you have to use the posted odds to make a decision on whether to wager and how to wager. Except for the posted odds, are there other indicators or diagnostics that you all know about for various handicapping methods?

Thanks,

Prank

GameTheory
07-27-2007, 01:09 AM
Back to racing: do any of you know of handicapping systems that return a prediction and an associated degree of confidence in the prediction? You could say it's like a classification system: predicting True or False (-1 or 1, A or B), with a probability value for the output. In the real world, this would tell you whether or not to gamble.

That's pretty much what handicapping is, although usually expressed in less technical terms. If I like a horse to win a race, and I'm willing to bet him at 3-1 or greater, I've implicitly set a probability value for that horse. Of course, you can do it explicitly as well -- most handicapping software is capable of spitting out an oddsline, not to mention many human handicappers.


In addition to some probability model, of course you have to use the posted odds to make a decision on whether to wager and how to wager. Except for the posted odds, are there other indicators or diagnostics that you all know about for various handicapping methods?

You don't have to use the posted odds (which are never final until betting is closed anyway) -- value can be implicit also if you focus your handicapping on factors the public doesn't favor (but which are valid factors) -- in this case, the betting history of the public is your diagnostic.

prank
07-27-2007, 01:41 AM
That's pretty much what handicapping is, although usually expressed in less technical terms. If I like a horse to win a race, and I'm willing to bet him at 3-1 or greater, I've implicitly set a probability value for that horse. Of course, you can do it explicitly as well -- most handicapping software is capable of spitting out an oddsline, not to mention many human handicappers.

Sorry, I didn't make myself quite clear. A better example might be: your system predicts that the horse has a 25% chance of winning. Shorthand: P(horse #k wins) = 0.25; E(performance of horse #k) = 0.25. But, there could be a confidence interval, say +/- 25% around the median estimate: E(performance of horse #k; 0.25 quantile) = 0.05, E(performance of horse #k; 0.75 quantile) = 0.30. You might say that range is way too wide, and then consider not betting: the expected performance is 0.25 (i.e. 25% of the time, given all the other factors, you expect the horse to win), but even at the 0.75 quantile the horse's performance (again, conditioned on whatever factors you observe) is a win rate of only 30%, and at the 0.25 quantile just 5%. I'm tired and may have mixed up my calculations, but the essence is that there could be a large confidence interval, i.e. a high degree of uncertainty in the prediction, so you may have a system that says not to wager.

In more basic terms, the variance could be quite high. Whether it's asking 3 buddies for their horse picks or their predictions for the '08 elections, you'd find that the range of predictions could be quite large, so there may be too much risk in placing a bet. Alternatively, your pals may have very strong opinions that horse K has a 50% chance of winning, and they may all agree on it.

If you're familiar with bootstrap confidence intervals, that could be used as an example. Alternatively, for binary polls (e.g. vote for Kerry or Bush), there's a margin of error tied to the standard error of a binomial sample; for linear regression, one can estimate the standard error of the model's coefficients, and then estimate a confidence interval for some response value.
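A minimal sketch of the bootstrap version of that idea, assuming an entirely made-up record of similar races, might look like this:

```python
import random

random.seed(42)

# Hypothetical record of 200 comparable races: 1 = the pick won, 0 = it lost.
history = [1] * 50 + [0] * 150  # observed win rate = 0.25

def bootstrap_ci(data, n_boot=5000, lo=0.05, hi=0.95):
    """Percentile bootstrap interval for the mean of 0/1 outcomes."""
    means = sorted(
        sum(random.choice(data) for _ in range(len(data))) / len(data)
        for _ in range(n_boot)
    )
    return means[int(lo * n_boot)], means[int(hi * n_boot)]

low, high = bootstrap_ci(history)
print(f"point estimate 0.25, 90% bootstrap CI ({low:.2f}, {high:.2f})")
```

If the interval comes back wide, that's the "high uncertainty, don't bet" signal; if it's tight, you can take the 25% at face value and compare it to the posted odds.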

In any case, the point would be that if the model not only makes a prediction but expresses a high degree of uncertainty, then don't bet. Or, if there's a tight confidence interval, go ahead and bet, assuming that the posted odds make it favorable to do so.

Thanks for the input!

Prank

GameTheory
07-27-2007, 02:07 AM
In any case, the point would be that if the model not only makes a prediction but expresses a high degree of uncertainty, then don't bet. Or, if there's a tight confidence interval, go ahead and bet, assuming that the posted odds make it favorable to do so.

Yes, that's a little different than just predicting True or False with an associated probability value, because if you're just predicting True/False then the probability BECOMES the prediction for betting purposes. (e.g. if the prediction is False with a probability of .80, then that is the same as a prediction of True with a probability of .20)

But if you're talking about making probabilistic predictions with an associated error level/range, that gets a lot more complicated. One way or another you've got to boil it down to "At what price will I bet this horse?" Or if you've got an unacceptably high error range (meaning an unacceptably low confidence level) for a certain race/horse, then maybe you should not make that bet no matter what since having an opinion which you've got no confidence in is the same as having no opinion at all.

The short answer to your original question is probably no. There are no systems or software (specific to handicapping -- there is certainly general data modeling and stat software) that I know of that spit out separate confidence levels along with probability/prediction values. Only stat geeks think about that kind of stuff, and stat geeks have an abysmal record of being successful bettors -- at least relative to their initial optimism that they can beat the game with their superior data analytic skills (i.e. stat geek knowledge). They soon find out that it is not that simple...

K9Pup
07-27-2007, 09:04 AM
Sorry, I didn't make myself quite clear. A better example might be: your system predicts that the horse has a 25% chance of winning. Shorthand: P(horse #k wins) = 0.25; E(performance of horse #k) = 0.25. But, there could be a confidence interval, say +/- 25% around the median estimate: E(performance of horse #k; 0.25 quantile) = 0.05, E(performance of horse #k; 0.75 quantile) = 0.30. You might say that range is way too wide, and then consider not betting: the expected performance is 0.25 (i.e. 25% of the time, given all the other factors, you expect the horse to win), but even at the 0.75 quantile the horse's performance (again, conditioned on whatever factors you observe) is a win rate of only 30%, and at the 0.25 quantile just 5%. I'm tired and may have mixed up my calculations, but the essence is that there could be a large confidence interval, i.e. a high degree of uncertainty in the prediction, so you may have a system that says not to wager.

In more basic terms, the variance could be quite high. Whether it's asking 3 buddies for their horse picks or their predictions for the '08 elections, you'd find that the range of predictions could be quite large, so there may be too much risk in placing a bet. Alternatively, your pals may have very strong opinions that horse K has a 50% chance of winning, and they may all agree on it.

If you're familiar with bootstrap confidence intervals, that could be used as an example. Alternatively, for binary polls (e.g. vote for Kerry or Bush), there's a margin of error tied to the standard error of a binomial sample; for linear regression, one can estimate the standard error of the model's coefficients, and then estimate a confidence interval for some response value.

In any case, the point would be that if the model not only makes a prediction but expresses a high degree of uncertainty, then don't bet. Or, if there's a tight confidence interval, go ahead and bet, assuming that the posted odds make it favorable to do so.

Thanks for the input!

Prank

You could easily calculate the confidence interval for the entire system, but that isn't what you want. With enough data, though, you could subset the historical records and find the confidence interval for just your predicted 25% "winners".
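K9Pup's bucket idea can be sketched with a plain normal-approximation interval for a proportion; the counts below are invented:

```python
import math

# Hypothetical: of 400 past horses the model rated near 25%, 92 actually won.
n, wins = 400, 92
p_hat = wins / n                               # empirical win rate in the bucket
se = math.sqrt(p_hat * (1 - p_hat) / n)        # standard error of a proportion
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se  # ~95% interval
print(f"bucket win rate {p_hat:.3f}, ~95% CI ({lo:.3f}, {hi:.3f})")
```

If the interval comfortably contains 0.25, the model's stated probability is at least consistent with its track record in that bucket.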

prank
07-27-2007, 09:39 AM
Thanks for the replies.
K9Pup - you're right - it could be done through an empirical assessment, but in cases of sparse data (e.g. very unusual race conditions), getting the confidence interval estimates may not be very useful.

GameTheory - in other markets, such as quantitative finance, it can be very important to have a picture of not only the expected return but potential returns at various quantiles. For horse racing, I think that many wagering systems like those by Kelly or Cover don't bother with the confidence interval. I think it matters more in more complicated financial markets or other applications. I'm working on the "other applications" for the moment. :)
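For reference, the basic single-bet Kelly formula prank mentions; note that it takes the probability estimate at face value, with no term for uncertainty in that estimate (the probabilities and odds below are illustrative):

```python
def kelly_fraction(p, odds):
    """Fraction of bankroll to bet at odds-to-1, given win probability p.
    A negative result means there is no edge: pass the race."""
    return (p * (odds + 1) - 1) / odds

print(kelly_fraction(0.25, 4))  # 25% horse at 4-1: bet 6.25% of bankroll
print(kelly_fraction(0.25, 2))  # same horse at 2-1: negative, no bet
```

One crude way to fold uncertainty back in is to feed the low end of the confidence interval into the same formula and bet only if the result is still positive.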

As I mentioned before, I keep coming back because horse racing has been wrestling with many of the same questions for decades longer than most other fields. I've read a number of papers by people in other fields who are rediscovering models that have been applied for years in racing. I just want to be sure that I'm not guilty of the same. :D

Thanks again,
Prank

Robert Fischer
07-27-2007, 11:27 AM
Sorry, I didn't make myself quite clear. A better example might be: your system predicts that the horse has a 25% chance of winning. Shorthand: P(horse #k wins) = 0.25; E(performance of horse #k) = 0.25. But, there could be a confidence interval, say +/- "X"* ...
*X is my quote here

That is a big point. A lot of handicappers do not accurately allow for the uncertainty. Some old-school 'cappers do, perhaps without even realizing it, when they demand significantly better odds than "fair" odds for their wagers.

K9Pup
07-27-2007, 03:26 PM
Thanks for the replies.
K9Pup - you're right - it could be done through an empirical assessment, but in cases of sparse data (e.g. very unusual race conditions), getting the confidence interval estimates may not be very useful.



But are you using different methods to handicap those races? Or do you use the same process for each race? If you use the same method each time, then I would think that the "unusual" race conditions really wouldn't matter. Bottom line: you want your process to be successful long term, so does the variance of ONE race really matter?

Good4Now
07-27-2007, 05:15 PM
Some players eliminate the chance of too large a window of possible outcomes, i.e. "I don't bet anything but FAST tracks".

Find a copy of "The Zurich Axioms". It'll help you a lot more than a forest of decision trees. Too bad academia still wastes its time (and apparently yours) by seeking out ways to overstate the actual number of variables which really do impact the solution to a poorly posed problem, or one with conditions that evolve over time.

Does the horse I want to bet have four legs and a tail ?
Can he really run when they enter the home stretch ?
Is there another horse in the race who can run faster ?

Robert Fischer
07-27-2007, 07:39 PM
Some players eliminate the chance of too large a window of possible outcomes, i.e. "I don't bet anything but FAST tracks".

Right. Now we are getting into being selective with ventures: general criteria for a playable race.



Find a copy of "The Zurich Axioms"...

Interesting stuff. I had to find a copy and it is a good read so far.

DJofSD
07-27-2007, 08:59 PM
Find a copy of "The Zurich Axioms".

New to me -- thanks!

prank
07-27-2007, 11:47 PM
Good4Now: Thanks for the recommendation on the Zurich Axioms. That's the 2nd time it's been recommended to me, so it's moved up on my list of books to read soon.

As for the factors that matter, there's not much one can argue with about a rigorous approach to modeling. And you're right that a handicapper can refuse to bet on races for which he doesn't have a model (blindfolded jockeys racing horses backwards up a sand dune...) or doesn't yet "get". However, in other pursuits, there's not much choice about having to make a ranking decision for whatever data comes down the pike, which is where my work has been focused.

K9Pup: If there's insufficient data for the parameters of a model to achieve some convergence (speaking informally; more formally, I'm referring to the central limit theorem), then not only would the estimates of the parameters be poor, so would any measure of the uncertainty. Consider the example of the sand dune (is there anything crazier?): stage a race that's insane enough and there's no way to handicap it. How would one measure that a race is insane? Merely that the conditions are too unusual? What is the measure of unusualness? Simply reporting the number of "similar" races in a database is not enough, and cross-validation may be meaningless as well.

These are informal ways of describing the problem. Naturally, I'm looking for the formal ways.

By the way, twice today I came across references in the purely statistical community that underscored how important horse racing is for inspiring the models that interest me. Benter's 1994 paper was cited in one case, and Plackett-Luce models were cited in both instances.

Cheers,

Prank

Good4Now
07-28-2007, 01:23 AM
My background is in calculus through physics and engineering, so I'm all in favor of RIGOR.

For three years of physics you struggle to learn enough calculus to solve problems which require more math than you probably understand. Hopefully you have learned enough "tricks" to get by. Then in the fourth year they sit you down and, with a not unkind smile, tell you "EVERYTHING YOU KNOW IS WRONG".

You feel like a dolt.

In the realm of the very fast, the very large, the very small, da dum, da dum, de dum "THINGS ARE DIFFERENT". But you still feel like a dolt! Realizing through the numbness that you have even more to learn...

Example (and I do not do this to be unkind): with the models available and the coefficients of probability, can you tell me the price of March '08 soybeans on the 27th of November this year? An answer by next Thursday would be fine. So that is not quite a week for an event nearly four months in the future.

Interesting terrain we are covering, nice chatting with you!

Robert Fischer
07-28-2007, 08:50 AM
Good4Now:
.... However, in other pursuits, there's not much choice about having to make a ranking decision for whatever data comes down the pike, which is where my work has been focused. ...

... If there's insufficient data for the parameters of a model to achieve some convergence (speaking informally; more formally, I'm referring to the central limit theorem), then not only would the estimates of the parameters be poor, so would any measure of the uncertainty. ...

One type of successful model may use a Ranking Decision that would apply to all scenarios, and most importantly your Measure of Uncertainty would not be poor in scenarios with insufficient data! The measure of uncertainty must correctly identify the lack of data, and then accurately* measure the uncertainty level accordingly. The program starts to become insanely complex if it is to tackle a wide array of real world scenarios. *Also, you now have to apply a second measure-of-uncertainty to your program. Your mechanism for calculating the measure of uncertainty is only going to be accurate within +/- x%. The uncertainty measure itself is uncertain to a degree :D.

Ideally, in horse racing or other pursuits, the mechanisms for the ranking decision and for the level of uncertainty would not be a highly complex "Deep Blue" supercomputer attempting to calculate thousands of interrelated factors...

K9Pup
07-28-2007, 08:59 AM
K9Pup: If there's insufficient data for the parameters of a model to achieve some convergence (speaking informally; more formally, I'm referring to the central limit theorem), then not only would the estimates of the parameters be poor, so would any measure of the uncertainty. Consider the example of the sand dune (is there anything crazier?): stage a race that's insane enough and there's no way to handicap it. How would one measure that a race is insane? Merely that the conditions are too unusual? What is the measure of unusualness? Simply reporting the number of "similar" races in a database is not enough, and cross-validation may be meaningless as well.


Prank
Obviously I'm not on your level of statistics. I THINK I understand what you are saying above. But my point is still from a "bigger picture" standpoint.

If I have a "significant" sample from a larger population and I calculate a confidence interval of +/- x%, shouldn't I expect future results from that larger population to fall within that range? Sure, there may be these "crazy" races in the population, but I have to assume those types were also in my sample. I guess my assumption is that we want something that works over a LOT of races, not just something that works on one race today. Sometimes we can make things TOO complicated.

singunner
07-28-2007, 12:06 PM
I just finished putting what I call a "confidence statistic" into my program. As in "just" finished. Still running tests on how much it actually helps.

jfdinneen
07-28-2007, 12:43 PM
Max Gunther's Zurich Axioms (http://neif.org/Zurich_axioms.pdf) (1985).

My personal favorite is the second axiom - Always take your profit too soon!

Best wishes,

John

Overlay
07-28-2007, 02:19 PM
http://www.michaelcovel.com/archives/000157.html

(I'm re-posting a separate link to the axioms because I couldn't access the link in jfdinneen's post above.)(Maybe it's just my computer.)

I would take issue with the following "Zurich axioms" as applied to playing thoroughbreds:

"Decide in advance what gain you want from a venture, and when you get it, get out." (That limits your winnings rather than your losses. If you're wagering on situations where you're getting the better of the odds, a greater or more extended volume of play can only increase your returns.)(I know there have been previous threads about this on the board.)

"Beware the historian's trap - it is based on the age-old but entirely unwarranted belief that the orderly repetition of history allows for accurate forecasting in certain situations." (That may be right if you're talking about the certainty that an event will occur, but I don't believe that it would apply when dealing with the probability of an event that has shown an established likelihood of occurrence over an extended period of time (recognizing, of course, that there can still be protracted "runouts", depending on the probability of the specific event under discussion).)

nobeyerspls
07-29-2007, 08:03 AM
Hi folks,

Back to racing: do any of you know of handicapping systems that return a prediction and an associated degree of confidence in the prediction? You could say it's like a classification system: predicting True or False (-1 or 1, A or B), with a probability value for the output. In the real world, this would tell you whether or not to gamble.

In addition to some probability model, of course you have to use the posted odds to make a decision on whether to wager and how to wager. Except for the posted odds, are there other indicators or diagnostics that you all know about for various handicapping methods?

Prank

When we view the discipline of thoroughbred handicapping from a statistical perspective, we need to remember that we are dealing with horses and not motorcycles. A noble breed, yes, but still bone, blood, and muscle with a high degree of fragility.
That collection of data so important to us known as past performances is incomplete (new bit today, that bad hock finally better, solved the stall-walking problem, etc., ad infinitum). So we do our best with what we have.
For me that means placing races into two categories: playable and unplayable (not enough space here to tell you how). Then in the playable category I further narrow it to the type of wager with the most profit potential. For example, a key horse is 6-1. How much greater can we make the return by using exotic wagers instead of a straight win bet?
I know the qualitative value of this method but I don't think that it can be quantified. In brief, we can find horses that outrun their "posted odds" but we cannot predict an expected return other than a wide range. For example, if we win $14,000 one month and lose $2,000 the next month, does the $6,000 average win mean anything?
One piece of advice. Before you ever wager on a horse, spend a day with a track vet and go with him as he makes his rounds from barn to barn. Then remember that these are the animals that provide our statistical base.

Good4Now
07-29-2007, 10:24 AM
Howzabout we go to Wal-Mart, and find a 220 outlet to plug that Palomino into?

It'll be just like gettin' on Secretariat!

Robert Fischer
07-30-2007, 12:30 PM
Measure of uncertainty: I did some win probability estimates for the Street Sense thread. Something I noticed when assessing the measure of uncertainty was that the range is not necessarily symmetrical.

For example, you may have Street Sense at a win probability of 55% +20% / -10%, meaning that with the given information his highest reasonable win-probability estimate is 75% and his lowest reasonable win-probability estimate is 45%, while his best win-probability estimate is 55%.

Of course it could be stated as 60% +/- 15% as well, even if you feel 55% to be the best starting estimate. It may not matter if the actual calculations simply use the range of 45%-75% regardless of your "best estimate".

Opinions?

DJofSD
07-30-2007, 12:36 PM
OK, stupid question time: I'd assume the "+/-" is 1 standard deviation. Isn't that by definition symmetrical?

What are you doing to derive a +20% and then a -10%? Or are you measuring skewness?

Good4Now
07-30-2007, 12:49 PM
I would think this is inherent. You only have two outcomes.

In other investments you have a third outcome of abandonment BUT retain some portion of the initial cost.

Robert Fischer
07-30-2007, 02:03 PM
OK, stupid question time: I'd assume the "+/-" is 1 standard deviation. Isn't that by definition symmetrical?

What are you doing to derive a +20% and then a -10%? Or are you measuring skewness?

Meaning the range of win probability: how much is the most I think my estimate could be off?

So if I think Street Sense has a 55% chance of winning, you can plug that into the odds equation and get so-called "fair odds". But in reality there are significant unknown factors involving Street Sense and the other horses. What if Tiz Wonderful is better than ever and Asmussen has another Curlin? That will lower Street Sense's win probability. What if Cowtown Cat is pumped up? What if Street Sense is in top form? etc...
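The fair-odds arithmetic above, applied across the whole uncertainty range (all probabilities here are just the 55% +20% / -10% example numbers):

```python
def fair_odds(p):
    """Fair odds-to-1 for win probability p, ignoring track takeout."""
    return (1 - p) / p

# Low, best, and high estimates from the 55% +20% / -10% example.
for p in (0.45, 0.55, 0.75):
    print(f"p = {p:.2f} -> fair odds {fair_odds(p):.2f}-1")
```

A conservative line might demand the odds implied by the low estimate (0.45, roughly 1.2-1) before betting, so the wager stays acceptable even at the pessimistic end of the range.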

Robert Fischer
07-30-2007, 02:14 PM
I would think this is inherent. You only have two outcomes.

In other investments you have a third outcome of abandonment BUT retain some portion of the initial cost.

There are two outcomes in terms of win/lose, but in terms of performance and expected performance, there are many degrees. Your estimate of performance is going to drive your win-probability estimate. The unknown information and the variety of possible scenarios drive the measure of uncertainty, or +/- range.

ransom
07-30-2007, 04:12 PM
I only scanned over this thread so I'm probably missing something, but... it's a mistake to look at only the variance of win percentage. What you really want to look at is the standard error of the mean payoff. That's really just Statistics 101. You'll probably be frustrated at how wide the confidence intervals will be but, hey, that's how horse racing really is. You can have very large samples of longshots that are profitable and still have very little confidence in them winning in the future because of this. Also, don't forget that horse racing can be a moving target (known in the statistical literature as a non-stationary process). It's a lot like econometric forecasts, and we all know how inaccurate those can be. Sorry, but it just ain't Physics.
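ransom's suggestion in miniature; the $2 win-bet returns below are invented:

```python
import math

# Gross $2 payoffs over ten hypothetical win bets (0 = lost the bet).
returns = [0, 0, 0, 7.20, 0, 0, 0, 12.40, 0, 0]

n = len(returns)
mean = sum(returns) / n
var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
se = math.sqrt(var / n)                                # standard error of the mean
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(f"mean payoff per $2 bet: {mean:.2f}, rough 95% CI ({lo:.2f}, {hi:.2f})")
```

The point estimate sits just under the $2.00 break-even, but the interval easily straddles it in both directions, which is exactly the width ransom warns about.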

chickenhead
07-30-2007, 04:30 PM
Of course it could be stated as 60% +/- 15% as well, even if you feel 55% to be the best starting estimate. It may not matter if the actual calculations simply use the range of 45%-75% regardless of your "best estimate".
Opinions?

If you apply a probability curve to a time or a figure and then run a Monte Carlo, you get what you are looking for, but it is expressed as a single win % that expresses the sum of all the possible scenarios. If you really wanted to tweak each horse individually, you'd want to tweak the probability curve for each one (add more upside, more downside, give it multiple "humps", etc).

It seems like that is easier.

K9Pup
07-30-2007, 10:49 PM
If you apply a probability curve to a time or a figure and then run a Monte Carlo, you get what you are looking for, but it is expressed as a single win % that expresses the sum of all the possible scenarios. If you really wanted to tweak each horse individually, you'd want to tweak the probability curve for each one (add more upside, more downside, give it multiple "humps", etc).

It seems like that is easier.

So you are talking about changing the distribution for each horse from a normal distribution to a ......? But how would you determine what that new distribution should be? I guess a lot of that depends on WHAT you use for the base number in the MC SIM.

chickenhead
07-30-2007, 10:56 PM
So you are talking about changing the distribution for each horse from a normal distribution to a ......? But how would you determine what that new distribution should be? I guess a lot of that depends on WHAT you use for the base number in the MC SIM.

It's not something I do, but I don't see any reason you couldn't do it on a horse by horse basis. Just as a for instance, there are cases where you might think a horse is either going to run big or run up the track, with less likelihood of in between. So a normal distribution wouldn't fit with your opinion at all. There's no rule that says you couldn't tweak it if you wanted, or have a handful of canned distributions you call from based on your opinion (or past results with horses "like this one").
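chickenhead's "run big or run up the track" horse could be encoded as a two-component mixture when drawing figures in a Monte Carlo; every figure, spread, weight, and horse name below is invented:

```python
import random

random.seed(7)

# Each horse: list of (mean_figure, std_dev, weight) mixture components.
horses = {
    "Steady Eddie": [(90, 3, 1.0)],                # consistent, one hump
    "Boom or Bust": [(98, 3, 0.5), (78, 4, 0.5)],  # big race or dud
}

def draw_figure(components):
    """Pick a mixture component by weight, then draw a normal figure from it."""
    m, s, _ = random.choices(components, weights=[w for _, _, w in components])[0]
    return random.gauss(m, s)

def win_probs(horses, trials=20000):
    wins = dict.fromkeys(horses, 0)
    for _ in range(trials):
        figs = {name: draw_figure(c) for name, c in horses.items()}
        wins[max(figs, key=figs.get)] += 1
    return {name: w / trials for name, w in wins.items()}

probs = win_probs(horses)
print(probs)
```

The bimodal horse ends up near a coin flip against the steady one here, even though his "big race" figure dominates, because half his draws come from the dud component.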

K9Pup
07-31-2007, 06:11 AM
It's not something I do, but I don't see any reason you couldn't do it on a horse by horse basis. Just as a for instance, there are cases where you might think a horse is either going to run big or run up the track, with less likelihood of in between. So a normal distribution wouldn't fit with your opinion at all. There's no rule that says you couldn't tweak it if you wanted, or have a handful of canned distributions you call from based on your opinion (or past results with horses "like this one").

Yeah I understand. With the dogs you might change the distribution somewhat based on the age of the dog. Young dogs probably haven't run their best times yet so you might lean the bell curve to the left. While older dogs are probably not going to get any faster so you lean it to the right. I was just wondering what "reasons" you would use for the horses.

Good4Now
08-02-2007, 09:59 AM
The market is not yet open. I will submit the price of Mar '08 Beans will be in a range of $7.58 to $7.78 on 27 Nov. this year.

Guess it's time to sell-short into rallies. There is an option on that month so you could even buy some "insurance".