
View Full Version : Minimum # of races?


jackad
10-03-2001, 12:19 AM
When testing a system without the benefit of a db, what is the minimum number of races you require in the sample before having any confidence in the results shown? Before being willing to bet with the system?
Jack

PaceGuy
10-03-2001, 09:13 AM
As a general rule, I don't trust anything unless it has shown a profit over a recent sample of at least 5,000 races. I have tested hundreds of ideas, play types, and angles, some of which showed remarkable promise in a short test of 100 to 300 races but then almost always fell completely apart when subjected to a longer test. What I am always trying to do is spot what actually happens in the long run. Sadly, it's been my experience that you just can't do that, at least in thoroughbred horse racing, with small sample sizes.
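PaceGuy's experience matches what basic sampling variation predicts. A minimal simulation makes the point (the 12% win rate and $18 average mutuel here are invented for illustration, not taken from any database): the ROI estimate from a 200-race test swings far more than one from a 5,000-race test, even when the underlying play never changes.

```python
import random
import statistics

# Hypothetical flat-bet play: 12% winners at an average $18 mutuel
# on a $2 base bet, so the true long-run ROI is 0.12 * 18 / 2 = 1.08.
WIN_PROB = 0.12
PAYOFF = 18.0   # gross return on a winning $2 bet
BET = 2.0

def simulated_roi(n_races, rng):
    """ROI observed over one sample of n_races flat bets."""
    returned = sum(PAYOFF for _ in range(n_races) if rng.random() < WIN_PROB)
    return returned / (n_races * BET)

rng = random.Random(42)
small = [simulated_roi(200, rng) for _ in range(1000)]
large = [simulated_roi(5000, rng) for _ in range(1000)]

print(f"200-race samples:  ROI ranges roughly {min(small):.2f} to {max(small):.2f}")
print(f"5000-race samples: ROI ranges roughly {min(large):.2f} to {max(large):.2f}")
# The spread shrinks roughly with the square root of the sample size,
# so a "promising" 200-race test can easily be pure noise.
```

Under these assumptions a 200-race test of a genuinely profitable play can still show anywhere from a heavy loss to a spectacular profit, which is exactly the "falls apart in a longer test" pattern.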

Of course, I make extensive use of a database. Trying to do without one takes far too much time.

Tigercrash
10-03-2001, 09:22 AM
Jack,

I would want a minimum sample of 100 races showing profit (preferably more) before I would consider a system a potentially viable approach. Then I would test against at least one other sample of 100 before I would move toward the window.

Dick Schmidt, I think, suggests 900-1000 for valid statistical testing.

You were at the HSH May seminar, weren't you? Remember the "McBurney Method"? What I took from Don's method for sample testing is that he would use smaller sample sizes (say 25 races), but many, many samples to test his approaches. I've found this very helpful and more representative of day-to-day OTB betting.

Kyle

Tigercrash
10-03-2001, 09:29 AM
PaceGuy's approach may explain why he's a rich man...and I'm not! <LOL>

Seriously though, and to clarify, when I say sample sizes of 100, I mean 100 races that exactly match the criteria I have set, not just 100 races (of which only 10-20 may match).

Goes without saying, the bigger (much bigger) the sample size, the bettor (pun intended).

Kyle
:-)

GR1@HTR
10-03-2001, 09:35 AM
Jackad,


Me thinks there is no correct answer. I've run all kinds of tests, including one that covered 80K races, and I still don't think that is enough. A few reasons why:

1) Generally speaking, it is all backfitting. In other words, we are whittling down specific data to find the highest ROI/win%. That would be fine if every race were run the same way, but of course that is not the case.
2) The game is changing daily. What people bet last year might not be what people are looking for this year. Say, for example, a group like HSH, HTR, or BRIS with their Prime Power determines that if you bet an XYZ combo you would get an ROI of 1.10. Now you have thousands of extra dollars bet on those plays this year, lowering the ROI to a negative situation. Like chasing a pot of gold at the end of the rainbow.
3) Large boxcar payoffs will alter the ROI dramatically. A few freakish $100-plus winners will turn a 0.75 ROI play into a 1.20 ROI play that looks beautimus.
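GR1's third point is plain arithmetic. A sketch with invented numbers: 300 flat $2 bets producing 36 ordinary winners, then the same record with two freak $100-plus mutuels swapped in for losers.

```python
# Invented record: 300 flat $2 bets ($600 total outlay) with 36 winners
# paying an average $12.50 mutuel -- a solid-looking 12% win rate.
ordinary = [12.50] * 36          # gross returns from the routine winners
outlay = 300 * 2.0

roi_without = sum(ordinary) / outlay
print(f"ROI without boxcars:  {roi_without:.2f}")   # 450 / 600 = 0.75

# Same 300 bets, except two of the losers happened to be freak
# $100-plus horses instead.
with_boxcars = ordinary + [132.00, 138.00]
roi_with = sum(with_boxcars) / outlay
print(f"ROI with two boxcars: {roi_with:.2f}")      # 720 / 600 = 1.20
```

Two races out of 300 are the entire difference between a losing play and a "beautimus" one, which is why a single sample can't be trusted to separate skill from a lucky boxcar.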


IMHO, it is best to break data samples down not by sample size but by time period. For example, if your spot play shows a profit over, say, 10 of the last 12 months, then you have a good play. It is possible for a play to have a positive ROI yet be profitable in only 2 of 12 months due to boxcar mutuels. By breaking the data down by time period, you solve that problem.
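The month-by-month check is easy to automate. A sketch, assuming you can export a (date, profit) log of each qualifying play from your own records (the dates and dollar figures below are hypothetical):

```python
from collections import defaultdict

# Hypothetical (date, net profit) log for one spot play, in dollars.
# Real data would come from your own database export.
plays = [
    ("2001-01-05", -2.0), ("2001-01-12", 16.0),
    ("2001-02-03", -2.0), ("2001-02-20", -2.0),
    ("2001-03-15", 48.0),   # one boxcar mutuel
    ("2001-04-02", -2.0), ("2001-04-18", -2.0),
]

monthly = defaultdict(float)
for date, profit in plays:
    monthly[date[:7]] += profit        # group by "YYYY-MM"

profitable = sum(1 for p in monthly.values() if p > 0)
total_profit = sum(monthly.values())
print(f"Overall profit: ${total_profit:.2f}")
print(f"Profitable months: {profitable} of {len(monthly)}")
```

This toy log shows a healthy overall profit while only 2 of 4 months were actually profitable, with one boxcar carrying the whole play; that is precisely the pattern the time-period breakdown is meant to expose.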

Few more random thoughts about computer generated spot plays:

1) IMHO, you better play them all. If you start cherry-picking from those spot plays, you will most likely end up with all the $6 and $7 plays and not the $25-plus mutuels that make the play profitable. Also, if you don't play them all every day, the day you skip will be the day your $50-plus mutuel comes in. It's just the way it works.
2) Play a very low percentage of bankroll, 1/2% or so. Say you play 2% of your bankroll and have 25 plays in a day: you've just made 50% of your bankroll dependent upon what happens that day.
3) Generally speaking, if you want a profitable play, aim for 12% to 18% winners.
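The bankroll point in 2) is worth making concrete. Daily exposure is just the per-bet fraction times the number of plays; the 25-plays-a-day figure comes from GR1's own example.

```python
def daily_exposure(fraction_per_bet, plays_per_day):
    """Share of bankroll at risk in one day of flat betting."""
    return fraction_per_bet * plays_per_day

# GR1's cautionary example: 2% per play * 25 plays = half the bankroll.
print(f"{daily_exposure(0.02, 25):.0%} of bankroll at risk")    # 50%
# His recommendation of about 1/2% per play:
print(f"{daily_exposure(0.005, 25):.1%} of bankroll at risk")   # 12.5%
```

At a 12% win rate a 25-play day with no winners is entirely plausible, so the 2% sizing can cost half the bankroll in one bad day while the 1/2% sizing risks about an eighth of it.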

That is my 2 cents, back to the salt mines for now.

Rick Ransom
10-03-2001, 12:38 PM
My opinion is that you should use a minimum number of wins rather than a minimum number of races. For example, if you use 100 wins as the minimum, you'd need 500 races for a 20% method but 1,000 races for a 10% method. The reason I like that approach is that the average price is just as important as the win percentage in determining the profitability of a method.
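Rick's rule converts directly into a formula: races needed is the minimum win count divided by the win rate, rounded up. A sketch:

```python
def races_needed(min_wins, win_pct):
    """Races required for a method winning win_pct percent of the time
    to be expected to produce at least min_wins winners.
    Uses integer ceiling division to avoid float rounding."""
    return -(-100 * min_wins // win_pct)

# Rick's examples, with 100 wins as the floor:
print(races_needed(100, 20))   # 500 races for a 20% method
print(races_needed(100, 10))   # 1000 races for a 10% method
print(races_needed(100, 12))   # 834 races for a 12% method
```

The appeal of sizing the test by wins rather than races is that each win carries the price information, and the average price matters as much as the hit rate.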

I'm not saying that this is always an adequate minimum, though, for the reasons previously mentioned. I've seen methods based on 2,000 races fall apart, and I've seen some based on 200 races that have been remarkably stable over a 20-year period. Stability of the results seems to depend mostly on how public the factors you're using are. There have been periods when some simple handicapping factor did surprisingly well for a year or two. But then it becomes widely known and the prices suffer.

Tom
10-03-2001, 02:29 PM
My opinion, but if I see something happen twice, I'm on it for a while. Too many "patterns" are short-lived and go away long before the long run. One year on day one at Finger Lakes I saw that trainer X got beat with two big favorites that looked well meant. I bet against him for the next few weeks until he started hitting on the second start over the track. One time, I saw a guy ship a NY-bred down to Philly and win with him; he paid only a few bucks at NYOTB but $25.00 at Philly (this was before the pools were commingled, and Philly was only offered on Tuesdays). He did this 5 times that I ever saw, and I was lucky enough to catch numbers 4 and 5; both paid over $30.
I like putting real money on the line when I evaluate an angle or pattern; it really sinks in that way.
Using HTR, I saw three races in a row one Saturday where the exacta contained the top three "Total" rankings in Impact. I caught six exactas that day. The next day, after the third race in a row where this didn't work, I was off it, and it only hit once the rest of the card.
Tom

Rick Ransom
10-03-2001, 04:51 PM
Tom,

I have to agree with you as far as trainer angles are concerned; two or three wins usually establishes a pattern, then it goes away. I haven't found that short-term track bias was worth keeping track of, though. Others may disagree because they may be using different data than I am.

alyingthief
10-14-2001, 12:17 PM
i would hasten to add to all this, that a test conducted on races from, say, hollywood summer, will have a different result from one conducted on data from a santa anita winter meet--if your method cranks on favorites in the claiming 20k region at the former, i can damn well assure you it will collapse miserably at the latter.

in fact, i wonder if, given the nature of track soil consistency, and its interaction with a horse's muscle composition (bias), many of these tests might show much higher win rates merely by qualifying one's selection on bias suitability. (now, THERE'S a subject for a test!)

however, to answer your question specifically: there are formulae to determine the necessary number of trials for validation; statistics abounds with them. get a math book!!!
don't, i repeat, don't settle for comments like "oh, a hundred" or "gee, i gotta see 5,000".......KNOW!
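alyingthief is right that the formulae exist. The standard one for estimating a proportion is n = z²·p(1−p)/E², where p is the expected rate, E the acceptable margin of error, and z the normal score for the confidence level. A sketch, with illustrative inputs (a 12% win rate pinned down to within ±2 points at 95% confidence):

```python
import math

def trials_needed(p, margin, z=1.96):
    """Sample size for estimating a win rate p to within +/- margin,
    at the confidence level implied by z (1.96 ~ 95%)."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

# How many races to pin down a ~12% win rate to within 2 points?
print(trials_needed(0.12, 0.02))   # about 1015 races
```

Note that this sizes only the win-rate estimate; ROI depends on payoff variance as well, so validating profitability takes considerably more trials, which is consistent with the larger numbers quoted earlier in the thread.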

PaceGuy
10-15-2001, 01:03 AM
Statistics and math theory notwithstanding, I'll stick by my original post. My own experience strongly indicates that methods profitable over a large sample of current racing data tend to outperform methods only profitable over smaller samples and/or stale data. Applying math theory in this area has never helped me win money. Finding strong angles currently being ignored by the betting public, where I can see a history of them in a database, has.

Rick Ransom
10-15-2001, 01:21 PM
alyingthief,

Bias may be the reason that many speed-oriented methods are improved by using the best recent race over today's track. That doesn't help you much during the first couple of weeks of a meet though when nobody has a recent race over the track. I don't think using races from the previous meet will help either because they dig up the track between meets so frequently.

The subject of when a sample is likely to be predictive of the future could fill an entire book on its own. Statistics is somewhat helpful, but you have to remember that we're dealing with variables, like win price, that change over time. The real question is whether your method will likely win in the future, not whether you have an accurate estimate of the win probability. The best way I've found so far is to look at how much my method combined with actual odds improves predicted finish position compared with actual odds alone. If the difference is significant, you MAY win in the future. If it isn't significant, you definitely WON'T.
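One way to read Rick's test, sketched with entirely made-up numbers: score predicted win probabilities by Brier score (mean squared error against the actual outcome), once for the odds-implied probability alone and once for a blend of odds plus the method's rating. The blend weight, ratings, and outcomes below are hypothetical, chosen only to show the comparison.

```python
# Each row: (odds-implied win prob, method rating scaled 0..1, won?).
# These numbers are invented purely to illustrate the comparison.
races = [
    (0.40, 0.80, 1), (0.40, 0.20, 0),
    (0.25, 0.70, 1), (0.25, 0.10, 0),
    (0.10, 0.60, 1), (0.10, 0.05, 0),
]

def brier(predict):
    """Mean squared error of a probability predictor over the sample."""
    return sum((predict(p, s) - won) ** 2 for p, s, won in races) / len(races)

odds_only = brier(lambda p, s: p)                  # the market by itself
blended = brier(lambda p, s: 0.7 * p + 0.3 * s)    # odds plus the method

print(f"odds alone:    {odds_only:.3f}")
print(f"odds + method: {blended:.3f}")
```

In Rick's terms: if the blended predictor fails to beat the odds alone on fresh data, the method carries no information the market hasn't already priced in, and it definitely won't win money.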

alyingthief
10-15-2001, 02:29 PM
generally, statistics performs what tasks are asked of it: the seeming intransigence of thoroughbred data to this generality is often more a product of (seemingly) peripheral factors actually possessing greater impact value than those we are keying on--so, for instance, if summer racing has a significantly greater percentage of favorites winning, that may be THE defining characteristic for a test we are running, and just the factor ignored in our evaluation. as to the statement about higher odds being some kind of sign, that consideration is usually built into the formulae we use. if we want to know the number of trials needed for a certain ROI, we can ascertain at what confidence level we are willing to accept the results of our trial, and that in turn requires both the kinds of odds we see, and the % of winners that have these odds.

mostly, it seems to me, the reservations we entertain about statistical validity in horseracing are derived from the nature of our questions. in fact, a statistically viable sample IS viable but, to repeat, it amalgamates many factors perhaps ignored by the tester, and which are of decisive importance to the study in hand. we unfortunately do not have the luxury of a laboratory in which to conduct our tests, so that the environment in which the event we wish to measure takes place is of equal importance to the event itself. most of the questions we horseplayers ask, i have noticed, are altogether too general, or altogether too specific. we want to know if this is a verity, not in what circumstances it is a verity--and so complain about data fitting, when the results of our study reveal highly specific situations in which our factors apply.

of course, to be of use, a factor should have a fundamental relationship to the racing environment: and so, we should really confine ourselves to questions of a fundamental kind, asking under what circumstances such and such will change, and what the nature of that change will be. to sum up, the failure of the statistical method accrues to the statistician, in asking questions not sufficiently pertinent or distinct, and so we get pretty useless results. i think what i am trying to say here, is that the real object of our study--the overall racing environment, and what is fundamental to it--is normally ignored, whether out of haste or greed or whatever; we are neither asking the right questions nor in the right way.

Rick Ransom
10-15-2001, 04:23 PM
alyingthief,

The tradeoff between using a sample that is large enough and one that is specific enough to the situation is, I think, the biggest dilemma in testing horse racing systems or methods. It takes a great deal of experience to narrow your testing down to only those ideas which have some chance of success. I always have the feeling that I've stumbled on a lot of good ideas in the past and not benefited from them because I couldn't rationalize the logic behind them.

It's interesting to read about statistical analysis in other areas such as financial markets or economics in general. They seem to have pretty much the same problems we have. Every time they think they understand how things work, something changes. Horse racing looks easy compared with trying to predict how terrorism will affect the stock market and economy.