Is this a valid use of confidence intervals?


OCF
04-23-2013, 09:05 PM
I applied a handicapping factor to favorites in all races at two particular tracks until I had 100 bets. I started on a particular date and continued until I reached the 100 bets.

Of those 100 bets/favorites, 52 were winners, i.e. I had a 52% winning percentage. ROI was +18.5% (I know, it all sounds too good to be true.)

The favorite winning % for the same period of time (as the period of time that it took to get to the 100 bets) at those two tracks was 39.4%.

I consulted a confidence interval-calculating website and came up with the following:

If sample size is 100,
observed proportion is 52%,
and desired confidence level is 95%,
confidence interval is + or - 9.79%

Is it valid to conclude from this that I can be 95% confident that my handicapping factor outperformed the public by:

at least .52 - .0979 - .394 = approx 2.8%,

or by at most .52 + .0979 - .394 = approx. 22.4%?

I realize the ROI might have been negative at a 42 or 43 winning %, but please set that aside for now.

What I'm really interested in is: am I applying the confidence interval methodology correctly, if I should be applying it to horse racing in the first place?

Thanks in advance to anybody that might be able to help me out with this.
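For readers who want to check the arithmetic in the post above, here is a minimal sketch. It assumes the website used the usual normal approximation for a proportion with z = 1.96, which reproduces the quoted 9.79% margin. Treating the public's 39.4% as a fixed benchmark rather than as another sample estimate is also an assumption of the sketch; the function name is made up for illustration.

```python
import math

def proportion_ci(wins, n, z=1.96):
    """Normal-approximation (Wald) interval for a win proportion."""
    p_hat = wins / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, p_hat - margin, p_hat + margin

p_hat, margin, low, high = proportion_ci(52, 100)

# Favorite win rate quoted in the post, treated here as a fixed benchmark (assumption).
public_rate = 0.394

print(f"margin: {margin:.4f}")
print(f"interval: {low:.3f} to {high:.3f}")
print(f"edge over public: {low - public_rate:.3f} to {high - public_rate:.3f}")
```

The interval describes the long-run win rate of the selected favorites only; whether the same edge holds on new races is a separate question that the replies take up.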

pondman
04-23-2013, 09:19 PM
Is it valid to conclude from this that I can be 95% confident that my handicapping factor outperformed the public by:


The confidence level will change with every sample. As your sample size increases you can be 95% confident in the confidence interval (I think that's how you say it.) Have fun with it! Don't take it so seriously, because I believe in the long run this method will crash.

OCF
04-23-2013, 09:59 PM
The confidence level will change with every sample. As your sample size increases you can be 95% confident in the confidence interval (I think that's how you say it.) Have fun with it! Don't take it so seriously, because I believe in the long run this method will crash.

Thanks pondman. Not to worry, the only thing I'm serious about for now is having a good time killing time ;) .

In regards to crashing in the long run, I definitely hear you. I'd tend to agree with you more if my winning percentage was, say, 45. But to me that's the point of the CI, i.e. I can be 95% confident that my long-term winning % would be 42 to 62. At 42 the ROI would probably be breakeven at best, but there still seems to be a lot of promise. At least enough to add another 100 to the sample.

Or maybe I know just enough to be dangerous (to myself)?

Greyfox
04-24-2013, 12:08 AM
I applied a handicapping factor to favorites in all races at two particular tracks until I had 100 bets. I started on a particular date and continued until I reached the 100 bets.

Of those 100 bets/favorites, 52 were winners, i.e. I had a 52% winning percentage. ROI was +18.5% (I know, it all sounds too good to be true.)

The favorite winning % for the same period of time (as the period of time that it took to get to the 100 bets) at those two tracks was 39.4%.


Thanks in advance to anybody that might be able to help me out with this.

Good on you OCF.

1. I know statistics well, but I can't help you out and I'm not sure why you would ask us. If you're doing that well keep it up. But it is only 100 trials.

2. I don't really understand that comment 100 bets/favorites unless your handicapping factor pointed to favorites to bet on and favorites to avoid.

3. If you have that angle, i.e. handicapping factor, I'd suggest that you keep it to yourself and not worry about confidence intervals.
(Having said that, there are some pretty bright minds on this board who might find it now that you've said that you are doing that well.)
Essentially, you seem to have come up with a handicap factor, proven only over a very limited number of trials, that seems to differentiate favorites from false and vulnerable favorites. Good on you. :ThmbUp:

OCF
04-24-2013, 06:53 AM
Good on you OCF.

1. I know statistics well, but I can't help you out and I'm not sure why you would ask us. If you're doing that well keep it up. But it is only 100 trials.

2. I don't really understand that comment 100 bets/favorites unless your handicapping factor pointed to favorites to bet on and favorites to avoid.

3. If you have that angle, i.e. handicapping factor, I'd suggest that you keep it to yourself and not worry about confidence intervals.
(Having said that, there are some pretty bright minds on this board who might find it now that you've said that you are doing that well.)
Essentially, you seem to have come up with a handicap factor, proven only over a very limited number of trials, that seems to differentiate favorites from false and vulnerable favorites. Good on you. :ThmbUp:

Thanks for your encouraging words Greyfox. Now you've got me paranoid about somebody else "backing in" to what I'm doing! ;) I'm only half-serious, but I should be careful if this experiment actually continues to return good results.

I can see that I didn't word the "100 bets/favorites" comment well, but you were able to interpret it correctly, i.e. I was indeed distinguishing between favorites to bet on and favorites to avoid.

You are correct about the limited number of trials. It strikes me as a variation on pondman's comment about the long run. All I can say is that this is the reason for using the confidence interval: it is an attempt to improve my interpretation of where I stand despite the small sample.

Ray2000
04-24-2013, 08:55 AM
I'd say your reasoning is valid when applied to Strike Rates, where the outcome is binomial, i.e. Win/Lose, Yes/No, Black/Red, etc.

And I know you mentioned "please set aside" the ROI, but I had to check the "Could have been Luck" Factor on these stats and it's pretty low... 1 in 100 if the Expected Return on your bets is the typical -10% ROI one gets betting favorites.

The attached file does the math on what should be similar to your results list. You can plug in the actual numbers if you want.
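The attachment is not part of this archive, so the following is only a rough sketch of one way a "could have been luck" figure can be computed, not a reconstruction of Ray's file. It assumes every winner paid the same average price (backed out of the +18.5% ROI and 52 wins), asks what win rate would produce a -10% ROI at that price, and then sums the exact binomial tail. The function name `binom_tail` is made up for illustration.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_bets, wins, roi = 100, 52, 0.185

# Average gross return per $1 winning bet implied by the posted ROI (assumes flat bets).
avg_return_per_win = (1 + roi) * n_bets / wins

# Win rate that would yield a -10% ROI at that same average payout (assumption).
p_null = (1 - 0.10) / avg_return_per_win

p_luck = binom_tail(n_bets, wins, p_null)
print(f"null win rate: {p_null:.3f}")
print(f"P(at least {wins} wins by luck): {p_luck:.4f}")
```

Under these simplifying assumptions the result comes out at roughly the same order of magnitude as the 1 in 100 quoted above. Plugging in the actual per-race prices, as Ray suggests, would give a better answer than the single average payout used here.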

Magister Ludi
04-24-2013, 09:20 AM
3. If you have that angle, i.e. handicapping factor, I'd suggest that you keep it to yourself and not worry about confidence intervals.
(Having said that, there are some pretty bright minds on this board who might find it now that you've said that you are doing that well.)

Mr. Greyfox is correct, as usual. Play your cards close to the chest.

OCF
04-24-2013, 09:29 AM
Mr. Greyfox is correct, as usual. Play your cards close to the chest.

Good advice by both of you, absolutely. I was even thinking about folks like you when I said what I said above about paranoia and somebody "backing in"!

OCF
04-24-2013, 09:31 AM
I'd say your reasoning is valid when applied to Strike Rates, where the outcome is binomial, i.e. Win/Lose, Yes/No, Black/Red, etc.

And I know you mentioned "please set aside" the ROI, but I had to check the "Could have been Luck" Factor on these stats and it's pretty low... 1 in 100 if the Expected Return on your bets is the typical -10% ROI one gets betting favorites.

The attached file does the math on what should be similar to your results list. You can plug in the actual numbers if you want.

Very helpful Ray. I had even written off the possibility of being able to analyze the ROI, so that's a big +.

SchagFactorToWin
04-24-2013, 09:50 AM
OCF,

You may want to look at doing a t-test instead of confidence intervals. You'll need the odds your horses went off at and the odds of the other favorites you're comparing it to.

I had a sample size of over 1000 picks with a win rate that was higher than the favorites win rate. But when comparing the odds of the favorites with the odds my picks ran at, it was not statistically significant.
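The post does not spell out how the t-test was set up, so the sketch below is only one possible reading: for each bet, subtract the win probability implied by the final odds from the actual outcome (1 or 0), then test whether the average of those differences is distinguishable from zero. The helper names and the example records are hypothetical, and the records are placeholders invented to show the input shape, not real results.

```python
import math
from statistics import mean, stdev

def implied_prob(odds_to_one):
    """Win probability implied by final odds of X-to-1, ignoring takeout."""
    return 1.0 / (odds_to_one + 1.0)

def t_statistic(bets):
    """One-sample t statistic for mean(outcome - implied probability) against 0.

    bets: list of (odds_to_one, won) pairs, where won is 1 or 0.
    """
    excess = [won - implied_prob(odds) for odds, won in bets]
    return mean(excess) / (stdev(excess) / math.sqrt(len(excess)))

# Placeholder records (odds-to-1, won), invented for illustration only.
example_bets = [(1.2, 1), (0.8, 1), (1.5, 0), (2.0, 0),
                (1.0, 1), (1.8, 0), (0.9, 1), (1.4, 0)]

t = t_statistic(example_bets)
print(f"t = {t:.2f} with {len(example_bets) - 1} degrees of freedom")
```

The point of working from the odds is the one made in the post: a win rate above the favorites' average can simply reflect shorter prices, and comparing each outcome to what the odds already implied strips that out.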

OCF
04-24-2013, 10:10 AM
OCF,

You may want to look at doing a t-test instead of confidence intervals. You'll need the odds your horses went off at and the odds of the other favorites you're comparing it to.

I had a sample size of over 1000 picks with a win rate that was higher than the favorites win rate. But when comparing the odds of the favorites with the odds my picks ran at, it was not statistically significant.

More good advice. I'll have to do some research on t-tests, but I intend to do just that.

You can probably tell I'm not somebody who usually thinks in terms of CIs and t-tests. I vaguely remember learning about CIs in the Intro to Statistics class I barely passed 30 years ago, which I had to do in order to get a business degree. They came up again when I was reading commentary about polls from the recent presidential election.

To a large degree, being able to think about how these ideas apply to horse racing and talk about them with people who are smarter than me is its own reward. Making some $ would be nice, but it is secondary.

I'm definitely not one who enjoys betting with negative expectations, no matter how slightly negative, and just hoping to overcome the negative expectations with luck. My impression is there's a lot of that kind of betting out there. Nothing inherently wrong with that I suppose, but it's just not me.

So thanks again for the help, and keep the good ideas coming!

raybo
04-24-2013, 11:15 AM
I don't think you need to worry about the "smart" guys out here backing into what you're doing because most of them have probably already done it, in one form or another. It's the others you should be worried about.

That being said, with the public hitting 39% in your small sample, why would you feel better if your hit rate was 45%? In a small sample of 100 races, one could have results much better than 52%, using almost any method.

Do some more testing, many more races, and report back.

And by the way, racing is only "negative expectation" for the participants as a whole, i.e. at the public odds. An individual participant's expectation can indeed be positive, if his/her method is good enough and applied consistently enough.

OCF
04-24-2013, 11:31 AM
I don't think you need to worry about the "smart" guys out here backing into what you're doing because most of them have probably already done it, in one form or another. It's the others you should be worried about.

That being said, with the public hitting 39% in your small sample, why would you feel better if your hit rate was 45%? In a small sample of 100 races, one could have results much better than 52%, using almost any method.

Do some more testing, many more races, and report back.

I didn't express myself clearly. I definitely would not have as much optimism if the W% was 45.

The sentence I highlighted surprises me but I'm willing to take it at face value.

Maybe this will tie it all together - if the W% was 45 I might not spend any more time on the experiment. But between the CI analysis, 52 WP, and positive ROI I'm encouraged enough to go to the time and effort of doing some more testing. But not to bet what for me would be big $.

raybo
04-24-2013, 12:20 PM
I didn't express myself clearly. I definitely would not have as much optimism if the W% was 45.

The sentence I highlighted surprises me but I'm willing to take it at face value.

Maybe this will tie it all together - if the W% was 45 I might not spend any more time on the experiment. But between the CI analysis, 52 WP, and positive ROI I'm encouraged enough to go to the time and effort of doing some more testing. But not to bet what for me would be big $.

The sentence you highlighted only means that, depending on the 100-race sample you choose, the results could be extremely good, extremely bad, or anything in between. And I suggest that you not bet any money at all in your testing. That's what testing is all about: testing, not playing. Once your initial "testing" is done, and proves reliable, then you can proceed to real money wagering, to "test" your ability to remain consistent in your total approach. Many times a handicapping method performs great until the player starts betting, then it goes down the tubes because the player isn't consistent to the method. Any method is only as good as the consistency of the player in following it, to a "T".

OCF
04-24-2013, 12:54 PM
The sentence you highlighted only means that depending on the 100 race sample you choose, the results could be either extremely good, or extremely bad, or anything in between.

Are you saying that the CI analysis I did was essentially worthless? It's OK with me if you are. :) In a nutshell that's what I'm trying to get opinions on, why I started the thread.

pondman
04-24-2013, 12:58 PM
Probably the biggest flaw in applying statistics and distributions to handicapping is that most players are not selecting their samples at random. Usually they are selected by a performance of some kind. In the case of favorites, your sample is based on the crowd's opinion. This can get messy on you. Any time you take what appears to be a stable system and throw in random external events, such as a hot day or a blizzard, things generally go downhill. I place the unknown higher than most, at 60%. Even the best of horses has about a 40% chance of winning (my opinion.) This puts your minimum odds at 9-5. So you need to be careful with the favorites.

raybo
04-24-2013, 01:09 PM
Are you saying that the CI analysis I did was essentially worthless? It's OK with me if you are. :) In a nutshell that's what I'm trying to get opinions on, why I started the thread.

It's not worthless, for what it is: a description of the sample of races you tested. I'm saying you must test it against many other sets of races for it to mean anything going forward.

Small samples of races are fine, but test many different small samples.

OCF
04-24-2013, 01:34 PM
Probably the biggest flaw in applying statistics and distributions to handicapping is that most players are not selecting their samples at random. Usually they are selected by a performance of some kind. In the case of favorites, your sample is based on the crowd's opinion. This can get messy on you. Any time you take what appears to be a stable system and throw in random external events, such as a hot day or a blizzard, things generally go downhill. I place the unknown higher than most, at 60%. Even the best of horses has about a 40% chance of winning (my opinion.) This puts your minimum odds at 9-5. So you need to be careful with the favorites.

Interesting. The reason I like favorites is that I'm realistic about my poor tolerance for losing streaks. I'd like to think I have my eyes wide open about the challenges I'm taking on in return, but you've given me some more to think about.

OCF
04-24-2013, 01:40 PM
Small samples of races are fine, but test many different small samples.

Now there's something I can chew on. My natural inclination is to continue adding to the population and apply the same filters to increase the sample size.

But maybe I should start a new population and sample? Or do both? It wouldn't be that hard to do both with the same data.

pondman
04-24-2013, 02:02 PM
Interesting. The reason I like favorites is that I'm realistic about my poor tolerance for losing streaks. I'd like to think I have my eyes wide open about the challenges I'm taking on in return, but you've given me some more to think about.

I think your theory is similar to perfect blackjack strategy, and sitting at the table with the idea of staying as long as possible. That's okay. You may have spotted the variable which will keep you there for the longest. However, any time you become part of the game, large enough to make any serious money, your wagers will make your stable method unstable. You might want to see if there isn't a secondary variable that beats you 50% of the time. What type of horse beats the favorite in your method? Consider adding those to your play, either horizontally or with an additional bet.

JJMartin
04-24-2013, 02:05 PM
Since distribution is never even and smooth, if you repeated this process of 100 races at a time, you would invariably have entire batches of net losses. Think of all the races that didn't win in the 100 race sample and put them all in a row. IMO, single factor applied methods won't be profitable long term.

JJMartin
04-24-2013, 02:18 PM
I applied a handicapping factor to favorites in all races at two particular tracks until I had 100 bets. I started on a particular date and continued until I reached the 100 bets.

Of those 100 bets/favorites, 52 were winners, i.e. I had a 52% winning percentage. ROI was +18.5% (I know, it all sounds too good to be true.)

The favorite winning % for the same period of time (as the period of time that it took to get to the 100 bets) at those two tracks was 39.4%.

I consulted a confidence interval-calculating website and came up with the following:

If sample size is 100,
observed proportion is 52%,
and desired confidence level is 95%,
confidence interval is + or - 9.79%

Is it valid to conclude from this that I can be 95% confident that my handicapping factor outperformed the public by:

at least .52 - .0979 - .394 = approx 2.8%,

or by at most .52 + .0979 - .394 = approx. 22.4%?

I realize the ROI might have been negative at a 42 or 43 winning %, but please set that aside for now.

What I'm really interested in is: am I applying the confidence interval methodology correctly, if I should be applying it to horse racing in the first place?

Thanks in advance to anybody that might be able to help me out with this.

Are you determining the favorite after the race or at some point prior?
Remember the "favorite" is not always known until after the race is done and all pools have been calculated. Since you cannot determine who the fav will be with 100% accuracy, your results will be flawed when you attempt to use this in real-time betting. Try running this method again using the m/l or, if you're able to, wait until 1 or 2 minutes before post time and record the fav then. You would have to do this anyway if you were actually betting.

raybo
04-24-2013, 03:36 PM
Interesting. The reason I like favorites is that I'm realistic about my poor tolerance for losing streaks. I'd like to think I have my eyes wide open about the challenges I'm taking on in return, but you've given me some more to think about.

I think that if you did some more study, you'd find that betting favorites will have some pretty long losing streaks at times. When you take into account the low prices you are going to get on them, those losing streaks can seem endless.

raybo
04-24-2013, 03:53 PM
Now there's something I can chew on. My natural inclination is to continue adding to the population and apply the same filters to increase the sample size.

But maybe I should start a new population and sample? Or do both? It wouldn't be that hard to do both with the same data.

When you talk about 100 races, you're basically talking about 9 or 10 cards of races. That's hardly enough to bet confidently on in the future, is it?

Continuing to add to the 100 races should be done, but then once you have a larger sample, go back and select many smaller sets of races from that larger sample, each set at random (or as close to random as you can), and see how each small set performs. Some will do better but some will do worse. Which way do the majority of them perform? Are a few consecutive hits, or larger than average payouts skewing your overall results? I would think that, when looking at 2 tracks, you should have at least a whole meet's worth of races, for each track, in order to start getting an idea of what works and what doesn't, at those 2 tracks in combination, maybe more than 1 meet's worth each. For single track evaluations, 200 races would be a minimum for me. My individual track databases contain at least 20-24 cards, and that database is used just for eliminations for win contention, then I'm betting multiple horses to win based on minimum odds requirements. If I were betting a single horse per race, 220-240 races wouldn't be enough for me to be able to bet confidently.
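The "many small sets from a larger sample" check described above can be sketched in a few lines. This is a minimal illustration, not raybo's own procedure: the function name is made up, and the 500-race history is simulated with an arbitrary 45% win rate purely so the code has something to run on. Real use would replace `history` with actual win/loss records.

```python
import random

def subsample_win_rates(outcomes, set_size, n_sets, seed=0):
    """Win rate of each of n_sets random subsets, drawn without replacement."""
    rng = random.Random(seed)
    return [sum(rng.sample(outcomes, set_size)) / set_size for _ in range(n_sets)]

# Placeholder history: 1 = win, 0 = loss. Simulated, not real race results.
sim = random.Random(1)
history = [1 if sim.random() < 0.45 else 0 for _ in range(500)]

rates = subsample_win_rates(history, set_size=100, n_sets=200)
overall = sum(history) / len(history)
below = sum(r < overall for r in rates)

print(f"overall win rate: {overall:.3f}")
print(f"subset win rates range from {min(rates):.2f} to {max(rates):.2f}")
print(f"{below} of {len(rates)} subsets fell below the overall rate")
```

The spread between the best and worst subsets gives a feel for how much a single 100-race sample can mislead. The same loop could track ROI instead of win rate if per-race payouts were stored alongside the outcomes.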

pondman
04-24-2013, 04:15 PM
Since distribution is never even and smooth, if you repeated this process of 100 races at a time, you would invariably have entire batches of net losses. Think of all the races that didn't win in the 100 race sample and put them all in a row. IMO, single factor applied methods won't be profitable long term.

I know of several. I think you could find one at most tracks. But a betting strategy would require maintaining a minimum of odds, and doesn't have any crowd requirements. I think sampling only from favorites isn't random enough to make money at the track.

The OP might have a better understanding if he changes to fuzzy set theory. To what degree are his horses a subset of a larger set? In truth all horses will be part of the larger "able to win" set. But in a number of observable instances a method will give you a subset that will be above 40% and will pay better than 9-5. In many cases they'll be longer priced horses. And in some cases they will be long shots, if you are willing to wait. This is a way to make money.

OCF
04-24-2013, 06:01 PM
I'm going to bow out now, There's more than enough here to keep me thinking and busy for a long time. Thanks to everyone who responded, I truly appreciate it.

JJMartin
04-24-2013, 09:31 PM
I know of several. I think you could find one at most tracks. But a betting strategy would require maintaining a minimum of odds, and doesn't have any crowd requirements. I think sampling only from favorites isn't random enough to make money at the track.

The OP might have a better understanding if he changes to fuzzy set theory. To what degree are his horses a subset of a larger set? In truth all horses will be part of the larger "able to win" set. But in a number of observable instances a method will give you a subset that will be above 40% and will pay better than 9-5. In many cases they'll be longer priced horses. And in some cases they will be long shots, if you are willing to wait. This is a way to make money.

To be clear, by a single factor I mean, for example, the horse with the highest last-race Beyer, or the horse with the lowest class drop in its last race, or even the horse at post position #1, where that single factor is the whole of the criteria used to make a betting selection. Otherwise, if we are defining a single factor as a composite of various combined factors and/or calculations, then we are not clear on what exactly a single factor is. If you know of several in terms of the first description, put them down in a book and I'll buy it, because after several years of my own personal research and testing I am confident there is no such thing in existence.

Robert Goren
04-24-2013, 09:52 PM
To be clear, by a single factor I mean, for example, the horse with the highest last-race Beyer, or the horse with the lowest class drop in its last race, or even the horse at post position #1, where that single factor is the whole of the criteria used to make a betting selection. Otherwise, if we are defining a single factor as a composite of various combined factors and/or calculations, then we are not clear on what exactly a single factor is. If you know of several in terms of the first description, put them down in a book and I'll buy it, because after several years of my own personal research and testing I am confident there is no such thing in existence.

You are looking at the wrong things. You need to look at things that most bettors ignore. Buy some 1995 general handicapping book and then start looking at things not covered in the book. If you are bright enough, you'll find something. Just don't expect it to work at all tracks.

zerosky
04-24-2013, 10:00 PM
It has been observed that numbers are like people: if you torture them long enough they will tell you anything.
My approach is to find the standard error of the coefficient of the log odds ratio.

In this example your sample strike rate of 52% corresponds to a population mean at the 95% level of somewhere between 40%-64%
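The post gives no details of the calculation, so the following is only a guess at the general idea: build the interval on the log-odds scale, where the standard error is sqrt(1/wins + 1/losses), and map the endpoints back to percentages. The function name is hypothetical, and "log odds ratio" may instead refer to a comparison against the public's rate, which this sketch does not attempt.

```python
import math

def logit_ci(wins, n, z=1.96):
    """Interval for a proportion built on the log-odds scale and mapped back."""
    losses = n - wins
    log_odds = math.log(wins / losses)
    se = math.sqrt(1 / wins + 1 / losses)  # standard error of the log odds

    def expit(x):
        # Inverse of the log-odds transform.
        return 1 / (1 + math.exp(-x))

    return expit(log_odds - z * se), expit(log_odds + z * se)

low, high = logit_ci(52, 100)
print(f"{low:.3f} to {high:.3f}")
```

With z = 1.96 this particular version gives roughly 42% to 62%, a little narrower than the 40%-64% quoted above, so the original calculation presumably differed in some detail (a different multiplier or a different variance formula, for example).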

Good luck with the project.

JJMartin
04-24-2013, 10:13 PM
You are looking at the wrong things. You need to look at things that most bettors ignore. Buy some 1995 general handicapping book and then start looking at things not covered in the book. If you are bright enough, you'll find something. Just don't expect it to work at all tracks.

My personal method includes the use of information outside of what is found in the PPs and doesn't use speed figures or class, FYI. I was addressing the idea that a profitable system can be devised based on applying one single factor and absolutely no other criteria. Then I asked how pondman was defining what a single factor is. Did you read all the posts?

pondman
04-25-2013, 04:57 PM
My personal method includes the use of information outside of what is found in the PPs and doesn't use speed figures or class, FYI. I was addressing the idea that a profitable system can be devised based on applying one single factor and absolutely no other criteria. Then I asked how pondman was defining what a single factor is. Did you read all the posts?

How far down into the subset can I go? There are tracks that will give you positive ROIs if you play all maiden winners during the month of March. Tweak that with a couple variables and your ROI goes double digit.

raybo
04-25-2013, 05:24 PM
How far down into the subset can I go? There are tracks that will give you positive ROIs if you play all maiden winners during the month of March. Tweak that with a couple variables and your ROI goes double digit.

That's at least 2 factors, maiden winners and month of March. Single factor means 1 factor.