Spot Play Sample Size [Archive] - Horse Racing Forum - PaceAdvantage.Com

horseracing101

12-10-2015, 06:04 PM

Long time lurker first time poster.
Last week I came across a spot play. In the past 5 days, looking at all thoroughbred races at all North American tracks, this spot play has found 8 betting opportunities. All 8 have cashed for the win. All selections have gone off at or below 2/1 and were either favorites or co-favorites. How many races will it take to come up with a semi-accurate hit rate?

EMD4ME

12-10-2015, 06:30 PM

Welcome aboard :)

Can you give some details as to why you liked these plays?

horseracing101

12-10-2015, 07:17 PM

Welcome aboard :)

Can you give some details as to why you liked these plays?

These picks were chosen off of pace figures and speed figures only, with no regard to anything else.

Thanks

Dave Schwartz

12-10-2015, 07:22 PM

The answer depends a great deal upon the odds on the horses involved.

For example, if your system is picking low odds horses (as you mentioned) I would say 400 or so would give you a pretty good handle on reality.

At very long odds, I would offer for consideration that some common statistical tables begin with 30 observations as a minimum. Thus, a system that depended upon longshots should have a minimum of 30 wins in the highest odds category.

I can tell you that several of my users have developed and tested complex systems that were highly profitable at 800 races that regressed to solid losers by 1,200 races (though still far outperforming the public).

BTW, one must not forget two things when testing:

1. Seasonal component. (Many factors perform very differently in the summer as opposed to the winter.)

2. Bull vs Bear. Longshot systems test very well in a period that produced higher-than-normal prices (i.e. The Bull), while favorite-based systems perform better during lower-than-normal price periods (i.e. The Bear).

ReplayRandall

12-10-2015, 07:43 PM

1. Seasonal component. Many factors perform very differently in the summer as opposed to the winter.

Dave, do you have to give EVERYTHING away.. :lol: ... But what you've said here is ONE of the most powerful factors in all of horse racing....'Tis the SEASON.

horseracing101

12-10-2015, 08:00 PM

Thanks for the input Dave.
The results kind of caught me off guard because I've never been that lucky before. But after your answer I looked at this streak calculator http://www.sportsbookreview.com/betting-tools/streak-calculator/
Based on my average odds so far (0.82 to 1, or about a 55% win probability), once you hit a series of 800 races there is a 95% chance of hitting 8 in a row somewhere in it.

It just kind of racks my brain that 8 for 8 hit, which has only a 0.837% chance of happening in a series of 8.
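
A minimal sketch of the arithmetic behind those two figures, assuming every race is an independent bet with the same 55% win chance (real races are not that tidy). The function name `prob_streak` is made up for this example and is not how the linked calculator is necessarily implemented; it just tracks the chance of being on a run of k wins after each race.

```python
def prob_streak(n_races, streak_len, p_win):
    """Chance of at least one run of `streak_len` straight wins in `n_races` bets."""
    # state[k] = probability of sitting on a run of k wins with no full streak yet
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0  # probability that the streak has already happened
    for _ in range(n_races):
        new_state = [0.0] * streak_len
        for k, mass in enumerate(state):
            new_state[0] += mass * (1.0 - p_win)   # a loss resets the run
            if k + 1 == streak_len:
                hit += mass * p_win                # this win completes the streak
            else:
                new_state[k + 1] += mass * p_win   # this win extends the run
        state = new_state
    return hit


# Assumption: odds of 0.82 to 1 are treated as a 55% win chance, as in the post.
print(prob_streak(8, 8, 0.55))    # 8 for 8 in a series of exactly 8
print(prob_streak(800, 8, 0.55))  # at least one 8-race streak somewhere in 800
```

The first number is simply 0.55 to the 8th power; the second should come out close to the 95% quoted above.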

Would you trust back testing or just wager away going forward until it breaks?

horseracing101

12-10-2015, 08:20 PM

Dave Schwartz

12-10-2015, 10:12 PM

Would you trust back testing or just wager away going forward until it breaks?

That is a really great question.

The answer, of course, depends upon you. Specifically, the level of importance that winning and losing has in your existence as a horse player.

If you're a fun player with a dream, I'd say, "What the heck?" and just go for it.

If you're looking for an answer because you'd like to put up a $20,000 bankroll and start playing seriously, well, you're going to need greater confidence.

A short story.
Ed Bain, the trainer stats guy who is famous for his 4+30 approach (i.e. 30% winners and at least 4 wins for a factor), caused me to build some really killer trainer stats into our current software. The cool thing about our trainer stats is that we can take a whole bunch of factors - there are about 145 per horse - and produce a single number for each trainer.

BTW, that number is extremely powerful. Alas, it also correlates closely with the tote board. (What a surprise.)

It was a short leap to applying a version of the 4+30 approach (all user-programmable adjustments, of course) and building some relative-to-the-field strength version of the factor counts.

Long story short, I tested an approach that produced a count and a percent of the total counts in the entire field. At a certain threshold level I pulled the trigger. At a certain level I watched as the plays (which come up about one every 35 races or so) won their first 48 races in succession.

Yes, I said, FORTY-EIGHT.

(BTW, I wrongly reported this on one of my HSH videos as 62.)

Now, there were so many $2.10 and $2.40 payoffs in the group, and a BIG payoff was like $3.60, but I thought, "WHO CARES? IT WINS ALL THE TIME!"

Then it lost. Then I won 5 or 6 more and it lost again. Then I won a dozen and it lost 2 in a row. By the time I got to 120 bets, I had come to the conclusion that there was absolutely NO WAY to make money with these horses!

The real kicker was that I eventually built a strategy to automatically play AGAINST these horses whenever they came up!

My Point
Do not be surprised if your system just dries up as the sample size increases.

Dave

Hoofless_Wonder

12-11-2015, 12:38 AM

Would you trust back testing or just wager away going forward until it breaks?

Your post reminds me of some of the "winning" chalk players who claim to make profits over the long term on low odds horses - I'm skeptical, but there's more than one way to skin a cat in this game. Personally, I think you're taking the worst of it in the risk of ending up on many underlays, as well as the "mysterious" poor performances that chalk will throw in, especially on the lower circuits.

My opinion is to start wagering and track hit rate and payouts. If you've hit eight in a row all at less than 2-1, by New Year's you'll know if you've simply had a streak of winners to soon be replaced by a streak of losers, or if you're on to some spot play approach that hits 60 or 75 percent or maybe even higher. I'm guessing you'd have to exceed 50% to break even, and probably higher - is your average payout $4.00?
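
The break-even guess works out like this. A rough sketch, assuming flat $2 win bets and an average payout quoted per $2 ticket; the function names are just for illustration.

```python
def break_even_hit_rate(avg_payout, stake=2.0):
    """Win rate needed for flat bets to return exactly what was wagered."""
    return stake / avg_payout


def roi(hit_rate, avg_payout, stake=2.0):
    """Return on investment per dollar bet for flat win bets."""
    return hit_rate * avg_payout / stake - 1.0


print(break_even_hit_rate(4.00))  # a $4.00 average payout needs 50% winners
print(break_even_hit_rate(3.60))  # shorter prices need a higher hit rate
print(roi(0.60, 4.00))            # 60% winners at $4.00 shows a profit
```

Anything paying less than $4.00 on average pushes the break-even point above 50%.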

Besides sample size, I think another consideration is the complexity of your spot play method. Certainly pace and speed numbers take into account countless other handicapping factors, and to simply use those popular derivatives is also asking for trouble, but I could be wrong. The method does seem selective, and that's certainly critical to success.

If nothing else, next year be sure to load up on those plays from December 6th to December 10th...... :)

raybo

12-11-2015, 01:08 PM

Dave, do you have to give EVERYTHING away.. :lol: ... But what you've said here is ONE of the most powerful factors in all of horse racing....'Tis the SEASON.

I doubt Dave gave away much. Time of meet and time of year have always been important, at least to me and a few others I know anyway. What I call "the environment of the track" includes not just prevailing weather and surface conditions, but also the horse colony, jockey colony, trainer colony, etc. When any of those things change, "the environment of the track" changes, and that can cause the racing to change.

MJC922

12-16-2015, 06:53 PM

I'd strongly urge you never to trust back testing by itself. The fit is typically why it works, and even sample sizes of many thousands of races can be fit to noise, producing what appears to be an invincible ROI. Almost invariably you'll find it will lose the take when forward tested. If not, then you obviously have something, or at least you're on the right path. I'm back testing on 2600 races at the moment and forward testing 1500. Good luck with 8 races. Don't think I'm not rooting for you though! :)

NorCalGreg

12-16-2015, 08:43 PM

flatstats

12-18-2015, 11:17 AM

Dave is correct in that the odds matter. If your method picks 2/1 shots then the sample size can be much shorter than if it picks 20/1 shots.

Sample size is not everything either. You still need to work out if the results you have observed are likely to be due to chance or not. Again the odds will help you here, as they will indicate if something is winning at a rate much higher than the odds expected it to.

Here's what I do when researching spot plays.

1. Justify It
If you find a good stat totally justify why it would happen first, rather than accept the figure. Punters would generally look at a stat and just accept it without understanding why it would happen. If they would just filter out the crap at this stage now it saves a lot of time later on.

So if a Trainer is good with 2yo claimers at one course ask why? Is it because he is good with 2yos in general, or claimers in general? Do they mostly tend to be favourites? Is it one particular owner? Try and find out why before proceeding.

A good example is John Gosden in the UK. He specifically targets horses to peak as 4yos and then ships them off to stud when they are at their peak. Other trainers target 3yo races, or keep the horses in training to win older Group races or Handicaps but Gosden is totally focused on 4yos. Here are his stats for 4yo Turf runners:

109 Wins
508 Runs
22% Strike Rate
1.19 A/E
40% ROI

2. Sample Size (based on Odds)
One method of determining sample size is to sum the odds chance of winning for every runner and see if the total hits a specific figure. I have two figures: Medium (5) and Strong (10). Ideally you want a strong figure, but here in the UK there are fewer and more diverse races so you sometimes have to rely on the medium setting.

In the Gosden example above we need to sum all the odds of those 508 runners, where the odds are the odds chance of winning

e.g. If a horse goes off at 2/1 then the odds chance of winning = 1 / (2 + 1) = 0.33. If a horse goes off at 9/1 then the odds chance of winning = 1 / (9 + 1) = 0.10

Add up all the 0.33, 0.2, 0.05, 0.4 ... for all 508 runners and you will get 91.71. This is the Expected Number of winners from those 508 runners and is used as an indicator of sample size.
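
A small sketch of that sum, assuming fractional odds (2/1 is passed as 2.0) and ignoring take and over-round. The five odds in `sample_odds` are illustrative values chosen only to reproduce the 0.33, 0.2, 0.05, 0.4 and 0.10 mentioned above; they are not Gosden's actual runners, and the function names are invented for the example.

```python
def implied_probability(fractional_odds):
    """Odds chance of winning: 2/1 -> 1 / (2 + 1)."""
    return 1.0 / (fractional_odds + 1.0)


def expected_winners(odds_list):
    """Sum of the odds chances; used as the sample-size figure (Exp)."""
    return sum(implied_probability(o) for o in odds_list)


def actual_vs_expected(winners, expected):
    """A/E: actual winners divided by expected winners."""
    return winners / expected


sample_odds = [2.0, 4.0, 19.0, 1.5, 9.0]   # illustrative runners only
print(expected_winners(sample_odds))        # Exp for these five runners

# Gosden figures quoted in the post: 109 winners against 91.71 expected
print(round(actual_vs_expected(109, 91.71), 2))  # 1.19
```

Five runners give an Exp of only about 1.08, well short of the Medium threshold, which is why a handful of bets says so little.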

The figure is way above both the Medium (5) and Strong (10) thresholds, so this indicates that the 508 runs so far are totally acceptable to use and we can proceed to the next step. If the figure was below the thresholds then we could not proceed. We should wait until the threshold is reached (more of his 4yos have run) before we continue to 3.

(note: ideally you need to compensate for track take / bookmakers over-round / Betfair commission when working out odds. For simplicity that is ignored in this example)

3. Random Noise / Luck or Valid?
We now need to check if the results we have observed so far are likely to be due to a lucky streak, or randomness, or could be valid and worth relying on. For this test we will use the ARCHIE figure.

ARCHIE is a statistical test based on a chi-square formula. It was simplified for use for betting purposes by Steve Tilley. Some of the original articles for that subject have gone but you can read an archive here:

Archie, a method of evaluating systems (https://web.archive.org/web/20041009202829/http://www.0dd5.com/matharchie.shtml)

Basically you again need to know the odds of all runners and using those figures you end up with a figure that can be used to indicate if the results are due to chance, or not. The higher the figure the better. I use Medium (3) and Strong (5).

Looking at the Gosden example he produces an Archie of 3.98, which is acceptable. If the figure was much less than this then we could just be observing a lucky streak, which may not continue in the future. We would therefore want to ditch this spot play right now and not continue if the figure was low.
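
The post does not spell the formula out, so this sketch assumes the commonly quoted form of Archie: runners times the squared gap between actual and expected winners, divided by expected times (runners minus expected). With the Gosden figures from above it reproduces the 3.98.

```python
def archie(runners, winners, expected):
    """Archie score, assuming the commonly quoted chi-square style formula."""
    return runners * (winners - expected) ** 2 / (expected * (runners - expected))


# 508 runs, 109 wins, 91.71 expected winners (figures from the post)
score = archie(508, 109, 91.71)
print(round(score, 2))  # 3.98

# Same strike rate on a tenth of the sample scores far lower
print(archie(51, 11, 9.2))
```

Note that the score grows with the number of runners, so the same edge over a much smaller sample falls well short of the Medium (3) threshold.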

4. Sandbox It
One flaw with the above technique is that it is sort of backfitting (even though you may have total justification of why it should work). To prevent that you should have split the data into two or three parts. One for initial analysis, one for testing on fresh data, another for testing on further live, fresh data.

If you have not done that (or prefer not to) then you need to sandbox the results from this date forward.

Make a note of the results and the date you created this spot play. Then start recording the results from that date onwards. This is called the Live Results.

All you do now is to repeat the processes in 2. and 3. You observe the results each day and if you pass 2. and 3. then you are out of the sandbox and only then do you start following the spot play.

If the Live Exp figure has not reached 5 (ideally 10) then you keep the spot play in the Sandbox and wait until it reaches the desired threshold.

Once the threshold has been reached check the ARCHIE figure. That figure will determine if you have a playable system or not. This doesn't mean that if your Exp is > 5, your ARCHIE > 3 and your Live Exp > 5, your Live ARCHIE > 3 that you are going to get $$$$. It just means that the results you have observed so far have a good sample size, pass a basic statistical test and would therefore be slightly more robust than random guessing.
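
The sandbox rule can be written down as a simple gate. This is only a sketch: `sandbox_status` and its return labels are hypothetical, the defaults are the Medium thresholds given above, and treating a low Live ARCHIE as a reason to drop the play is an assumption carried over from step 3.

```python
def sandbox_status(live_exp, live_archie, exp_threshold=5.0, archie_threshold=3.0):
    """Decide what to do with a spot play from its Live Exp and Live ARCHIE."""
    if live_exp < exp_threshold:
        return "wait"       # sample too small, stay in the sandbox
    if live_archie < archie_threshold:
        return "drop"       # enough runs, but could easily be luck
    return "playable"       # passes both checks; no guarantee of profit


print(sandbox_status(live_exp=2.4, live_archie=6.0))   # wait
print(sandbox_status(live_exp=7.1, live_archie=1.2))   # drop
print(sandbox_status(live_exp=7.1, live_archie=3.5))   # playable
```

Pass `exp_threshold=10.0` and `archie_threshold=5.0` to use the Strong settings instead.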