View Full Version : statistics question


SchagFactorToWin
09-01-2009, 09:27 PM
I am working on fine-tuning my method of estimating today's race speed using past speed. BTW, I play the harness tracks only, but this question is about statistics, so it shouldn't matter.

I have a database of actual winning times in one column and my system's estimate in the next column. I have the Excel spreadsheet set up so that by changing one cell (let's call it A1), I can change all of my system's estimates.

What I hope to do is try various values for A1 and run statistical variance tests to see which value is most accurate.

But I'm unsure which test(s) to run. I'm thinking "t-Test: Two-Sample Assuming Equal Variances". Or "F-Test Two-Sample for Variances".

Most of the tutorials I find use social science or scientific examples. Are there any gambling statistics experts out there? Any books that address gambling issues for someone who hasn't studied stats for 20 years?
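The two tests named above compare the means or the variances of two columns, which may not be the same thing as asking which A1 setting misses the actual times by the least. One alternative, offered only as a sketch and not as anything proposed in this thread, is to summarize the per-race errors for each setting and run a paired comparison on them. The function names and all the numbers below are invented for illustration.

```python
import math
from statistics import mean, stdev

def error_summary(actual, predicted):
    """Return (bias, MAE, RMSE) of predicted versus actual times, in seconds."""
    errors = [p - a for a, p in zip(actual, predicted)]
    bias = mean(errors)                             # average signed miss
    mae = mean(abs(e) for e in errors)              # average absolute miss
    rmse = math.sqrt(mean(e * e for e in errors))   # weights big misses more heavily
    return bias, mae, rmse

def paired_t(actual, pred_a, pred_b):
    """Paired t statistic on per-race absolute error, setting A minus setting B.

    Positive values mean setting B missed by less on average.
    Raises ZeroDivisionError if every per-race difference is identical.
    """
    diffs = [abs(pa - a) - abs(pb - a) for a, pa, pb in zip(actual, pred_a, pred_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Invented winning times and estimates (seconds), for illustration only.
actual = [118.4, 119.2, 120.0, 117.8, 121.4, 119.6]
estimates_a = [118.9, 119.0, 120.8, 118.5, 120.6, 120.1]  # hypothetical A1 setting "A"
estimates_b = [118.6, 119.1, 120.3, 118.0, 121.0, 119.9]  # hypothetical A1 setting "B"

for label, est in (("A", estimates_a), ("B", estimates_b)):
    bias, mae, rmse = error_summary(actual, est)
    print(f"setting {label}: bias={bias:+.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")

print(f"paired t (A minus B): {paired_t(actual, estimates_a, estimates_b):.2f}")
```

The paired form is used because both settings are scored on the same races, so race-to-race noise largely cancels in the differences. The t statistic would still need to be read against a t distribution with n - 1 degrees of freedom, and it assumes the per-race differences are roughly independent.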

garyoz
09-01-2009, 09:55 PM
Don't think I'm an expert, but believe that equal variances is appropriate assuming a large sample size. There are confounding factors such as pace, distance, class, etc. not to mention track variant when you draw your sample. Don't know how you are breaking down your analysis.

As an aside, why run tests of statistical significance? Think it will be an exercise in frustration. You aren't going to be publishing this in an academic journal where you need .95 or .99 probability. Eyeballing could be good enough.

Relative to unsolicited advice, you might want to go to Handicapper's Data Warehouse and read anything that Mark Cramer has written about his Projected Speed Ratings relative to your approach. I know that race shape/pace play a role (run style/position) in his projections. It could be helpful.

markgoldie
09-01-2009, 10:10 PM
I don't have the slightest idea of how to do this. But as a man who has spent 50 years of his life handicapping and playing both harness and thoroughbred races, I can tell you without fear of any possible contradiction that there is no possible handicapping advantage to be gained by a generalized prediction of a final race time. If you are a harness-racing specialist, you must know that pace and flow dictate the final time, along, of course, with overall track speed. Predicting pace and flow does have some handicapping merit, although it is very difficult on a consistent basis and it is far more dependent on individual horses racing in the class than any class history. Interestingly, due to the slipstream effect in harness racing and lack thereof in t-bred racing, faster paces almost always lead to faster final times in h-racing, while the reverse is true in t-bred racing.

Bottom line: if you are doing this exercise for some handicapping advantage, I'd suggest you use your time in other areas.

garyoz
09-01-2009, 10:25 PM
Don't disagree Goldie, but you can develop a feel doing statistical analysis. You can learn about the game as your approach inevitably fails. I know I spent too much time trying to apply linear and non-linear modeling--when I was young and foolish. But some people like that stuff. Some people even think programming is fun.

You can get a feel and learn stuff when crunching numbers if you ask the right questions, but ultimately given a large enough sample size everything regresses toward the mean which tends to be equivalent to the takeout--in my experience.

SchagFactorToWin
09-02-2009, 12:04 AM
Speed projections are just one part of my system. I find it important to be able to adjust past speeds for post position. I developed it 20 years ago and call it the Schag Factor :)

I am happy with my methodology- it's been good enough to give me a 57% win rate, a 1.25 Win ROI, and a 33% exacta win rate (1/2,3) with a 1.69 EX ROI (all 2009 figures).

I had an epiphany last week and realized that I could tweak my method. Preliminary results show promise, so I just want to make sure I'm testing correctly.

Marlin
09-02-2009, 12:30 AM
I am happy with my methodology- it's been good enough to give me a 57% win rate, a 1.25 Win ROI, and a 33% exacta win rate (1/2,3) with a 1.69 EX ROI (all 2009 figures).

Now that's what I call grinding it out.

CBedo
09-02-2009, 02:14 AM
This doesn't really answer your question, but if you want to fine-tune your estimate, you can use Excel's Solver add-in (on the Data tab) to set constraints and limits on different cells and then have it maximize or minimize a cell by changing other cells. It will do some form of linear programming to find your "best fit" answer.
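For anyone who would rather do the same kind of search outside the spreadsheet, here is a minimal sketch of tuning a single parameter by minimizing squared error. It uses golden-section search, which is a stand-in technique and not a description of what Solver does internally. The model (past time plus A1 times post position minus one) and every number in it are hypothetical, since the actual formula behind A1 is not given in the thread.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2          # about 0.618
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d                            # minimum lies in [lo, d]
        else:
            lo = c                            # minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2

def sse(a1, past_times, post_positions, actual):
    """Sum of squared errors for a HYPOTHETICAL model: past time + a1 * (post - 1)."""
    return sum(
        (t + a1 * (pp - 1) - a) ** 2
        for t, pp, a in zip(past_times, post_positions, actual)
    )

# Invented data for illustration only (times in seconds).
past_times = [119.0, 120.2, 118.6, 121.0, 119.8]
post_positions = [1, 4, 2, 7, 5]
actual = [119.1, 120.9, 118.7, 122.1, 120.5]

best_a1 = golden_section_min(
    lambda a1: sse(a1, past_times, post_positions, actual), -1.0, 1.0
)
print(f"best A1 on this sample: {best_a1:.4f}")
```

Squared error is quadratic in A1 for this toy model, so the single-minimum assumption holds here. A value tuned this way fits the sample it was tuned on, so it is worth checking against races that were held out of the fit.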

markgoldie
09-02-2009, 10:36 AM
Speed projections are just one part of my system. I find it important to be able to adjust past speeds for post position. I developed it 20 years ago and call it the Schag Factor :)

I am happy with my methodology- it's been good enough to give me a 57% win rate, a 1.25 Win ROI, and a 33% exacta win rate (1/2,3) with a 1.69 EX ROI (all 2009 figures).

I had an epiphany last week and realized that I could tweak my method. Preliminary results show promise, so I just want to make sure I'm testing correctly.
Schag: I guess I should congratulate you on those great results. You have apparently improved on the invention of the wheel. I would have thought that adjustments in speed (and here we're talking about final times) for post position would be solely and completely dependent on early position, subsequent pace, and subsequent flow positioning without any regard whatsoever for starting position. I also would have thought that a back-analysis history of speed from differing starting positions would be useless since the current form of the animal is so crucial. Furthermore, since individual driver ability is so pivotal in h-racing, that factor as well would have to be incorporated into such history.
After thinking about your initial post a bit, I thought there might be some merit in the exercise if you were attempting to do a short-term analysis of a racing card (chart analysis) in order to adjust a presumptive track variant. Such presumptive variant, for example, is used by Trackmaster in their creation of their published speed figs. And it is made with the same flawed methodology of class pars versus actual class final times that is used in t-bred fig creation. Or, you might be involved in creating your own, more accurate variant in which final-time prediction (known as the projection method) could be of some use (although there are much better methods for this purpose).
But I guess from what you wrote that this is not your aim. Anyway, kudos on your great results and if you want to discuss any of this further, send me a PM so we don't bore the t-bred people to death.

46zilzal
09-02-2009, 11:59 AM
Read Mlodinow's The Drunkard's Walk: How RANDOMNESS Rules our Lives to see where statistical analysis is full of BIG HOLES

SchagFactorToWin
09-02-2009, 01:06 PM
Now that's what I call grinding it out.

Cough, cough... oh, sorry- I seem to have a lot of chalk around!

SchagFactorToWin
09-02-2009, 01:11 PM
Read Mlodinow's The Drunkard's Walk: How RANDOMNESS Rules our Lives to see where statistical analysis is full of BIG HOLES

I have read it. Applying it to handicapping, I think it shows that one's win/loss cycles are certainly random. But I don't think you can then go on to say that whichever horse wins is random. If that were true, wouldn't every post position have similar win rates? Or am I misunderstanding your point?

SchagFactorToWin
09-02-2009, 01:21 PM
Schag: I guess I should congratulate you on those great results. You have apparently improved on the invention of the wheel. I would have thought that adjustments in speed (and here we're talking about final times) for post position would be solely and completely dependent on early position, subsequent pace, and subsequent flow positioning without any regard whatsoever for starting position. I also would have thought that a back-analysis history of speed from differing starting positions would be useless since the current form of the animal is so crucial. Furthermore, since individual driver ability is so pivotal in h-racing, that factor as well would have to be incorporated into such history.
After thinking about your initial post a bit, I thought there might be some merit in the exercise if you were attempting to do a short-term analysis of a racing card (chart analysis) in order to adjust a presumptive track variant. Such presumptive variant, for example, is used by Trackmaster in their creation of their published speed figs. And it is made with the same flawed methodology of class pars versus actual class final times that is used in t-bred fig creation. Or, you might be involved in creating your own, more accurate variant in which final-time prediction (known as the projection method) could be of some use (although there are much better methods for this purpose).
But I guess from what you wrote that this is not your aim. Anyway, kudos on your great results and if you want to discuss any of this further, send me a PM so we don't bore the t-bred people to death.

I'm not really asking that my results be believed. I have found from reading this forum that everyone wants to be profitable, but whenever someone claims to be, they are accused of exaggerating, or worse. I think it says more about the psychology of horseplayers, than about handicapping methods.

Be that as it may, my problem with pace and flow is that there are so many variables that it seems to be a method with a low probability of success. What I am trying to do is to predict final times. Each horse's time projection has its own Normal Distribution. Think of each race as a series of bell curves- the greater the overlap of the curves, the tighter the race.
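The bell-curve picture can be put into numbers with a short sketch. It assumes each horse's final time is an independent normal variable, which is the poster's framing and is disputed later in the thread, and the means and standard deviations are invented. It uses only `statistics.NormalDist` from the Python standard library (3.8 or later).

```python
from statistics import NormalDist

def prob_a_faster(mean_a, sd_a, mean_b, sd_b):
    """P(time_A < time_B), assuming independent normal finishing times."""
    # B - A is normal with mean (mean_b - mean_a) and variance sd_a^2 + sd_b^2.
    diff = NormalDist(mean_b - mean_a, (sd_a ** 2 + sd_b ** 2) ** 0.5)
    return 1.0 - diff.cdf(0.0)

# Invented projections in seconds: (mean, standard deviation).
horse_a = NormalDist(119.8, 0.4)
horse_b = NormalDist(120.1, 0.5)

# Overlapping coefficient: 1.0 means identical curves, near 0 means no overlap.
print(f"curve overlap: {horse_a.overlap(horse_b):.3f}")
print(f"P(A finishes faster than B): "
      f"{prob_a_faster(horse_a.mean, horse_a.stdev, horse_b.mean, horse_b.stdev):.3f}")
```

A large overlap goes with a head-to-head probability close to 0.5, which is the "tight race" case described above. Extending this to a full field would need the probability that one horse has the lowest time among all starters, which this two-horse sketch does not compute.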

markgoldie
09-02-2009, 02:12 PM
What I am trying to do is to predict final times. Each horse's time projection has its own Normal Distribution. Think of each race as a series of bell curves- the greater the overlap of the curves, the tighter the race.
Fine. No argument. My point is that this exercise has two major problematic categories which make its predictive relevance near worthless. They are:
(1) Final times are deeply influenced by: (a) pace (b) the flow the horse was in (sometimes loosely called "trip") and (c) the track variant for the card in which the horse performed.
(2) The current form of the horse which may remain constant over 3 or 4 races if you are lucky. And, as we know, the last piece of form information on the horse (his most recent race) has more relevance than each successive further race back in diminishing order of relevance.

Without a perfected methodology for assessing the effects of the above, the bell curves of final speed themselves are useless and therefore, their overlap as well.

garyoz
09-02-2009, 08:25 PM
Think of each race as a series of bell curves- the greater the overlap of the curves, the tighter the race.

Now that is the most dangerous idea that I have read for a while. Normal distribution for speed figures for an individual horse??? Maybe if they adjusted ala Ragozin, but even then you have aging, track condition, trainer change, etc.

I thought initially you were trying to compare two populations: final time and projected time. Perhaps I misunderstood.

Not even getting into this. Great that you have such a huge ROI using Excel for forecasting. Who would have thought it could be so easy? :bang:

SchagFactorToWin
09-03-2009, 02:22 PM
Now that is the most dangerous idea that I have read for a while. Normal distribution for speed figures for an individual horse???

I'm not saying that I actually do that, I'm saying to imagine what that would look like. I would think that most handicappers try to predict a winning time (if not final, then at least for certain segments). After all, the fastest horse wins.

If one predicts that a horse will run in 2:00, what one is really saying is something along the lines of: 'there's a 95% chance that this horse will run between 1:59/2 and 2:00/3'. Every numerical prediction has its own normal distribution. That's what I was trying to get at.
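As a worked version of that 2:00 example, the sketch below backs out the standard deviation implied by a 95% range and then rebuilds the range from it. It assumes the "/2" and "/3" are fifths of a second, so the range is 119.4 to 120.6 seconds, and it assumes the prediction error is normal. Both are assumptions made for illustration, and the function names are mine.

```python
from statistics import NormalDist

def sd_from_interval(lo, hi, coverage=0.95):
    """Standard deviation implied by a symmetric normal interval [lo, hi]."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)   # about 1.96 for 95%
    return (hi - lo) / (2 * z)

def interval(mean, sd, coverage=0.95):
    """Symmetric interval around mean that holds `coverage` of a normal curve."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mean - z * sd, mean + z * sd

# ASSUMPTION: 1:59/2 = 119.4 s and 2:00/3 = 120.6 s (fifths of a second).
sd = sd_from_interval(119.4, 120.6)
low, high = interval(120.0, sd)
print(f"implied standard deviation: {sd:.3f} s")
print(f"95% range rebuilt from it: {low:.1f} to {high:.1f} s")
```

Under those assumptions the quoted range corresponds to a standard deviation of roughly three tenths of a second around the 2:00 prediction.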

Cratos
09-03-2009, 11:11 PM
I am working on fine-tuning my method of estimating today's race speed using past speed. BTW, I play the harness tracks only, but this question is about statistics, so it shouldn't matter.

I have a database of actual winning times in one column and my system's estimate in the next column. I have the Excel spreadsheet set up so that by changing one cell (let's call it A1), I can change all of my system's estimates.

What I hope to do is try various values for A1 and run statistical variance tests to see which value is most accurate.

But I'm unsure which test(s) to run. I'm thinking "t-Test: Two-Sample Assuming Equal Variances". Or "F-Test Two-Sample for Variances".

Most of the tutorials I find use social science or scientific examples. Are there any gambling statistics experts out there? Any books that address gambling issues for someone who hasn't studied stats for 20 years?

I don’t know if this website will help, but it might be a good place to start.

http://www.graphpad.com/www/Book/Choose.htm

Warren Henry
09-03-2009, 11:43 PM
I have read it. Applying it to handicapping, I think it shows that one's win/loss cycles are certainly random. But I don't think you can then go on to say that whichever horse wins is random. If that were true, wouldn't every post position have similar win rates? Or am I misunderstanding your point?
Zilly is like a jukebox. Punch a number and you always get the same response. You punched the statistics number and got his "doesn't work" response. This saves him a lot of time because he doesn't have to actually read posts or think.

PaceAdvantage
09-04-2009, 03:14 AM
Zilly is like a jukebox.

Or, further evidence to bolster my theory that he is simply some sort of beta AI project emanating out of MIT...

Tom
09-04-2009, 07:42 AM
Berkeley.

SchagFactorToWin
09-04-2009, 09:18 AM
I don’t know if this website will help, but it might be a good place to start.

http://www.graphpad.com/www/Book/Choose.htm

Thanks, that looks perfect. The search continues for 'The Idiot's Guide to Statistics for Gamblers'. I'll have to add it to my 'books I'm going to write someday' list.