tilson here...My question


tilson
05-15-2001, 09:51 AM
Hi, I am Tilson.
Some of you may know me from other forums, but I am new here. Anyway, a "friend" from another forum turned me on to this site, and I do have a question.
If any of you have ever constructed "systems" in the past, I would like to pose a question to you.
At about how many race samples would it take for your ROI and win percentage to become cyclically repetitive?
In other words, I realize that the larger the sample, the more accurate (and less deviant) the averages.
I could be wrong, but I would suspect that at maybe 200 or 300 races... additional samples of the same size would be likely to yield VERY similar averages and thus represent a type of "cyclical repeat"... no?
I also read something a fellow wrote here... in the handicapping forum, about a book he read... and would like him to DEFINE "impact value".
Thanks in advance,
Tilson

Larry Hamilton
05-15-2001, 11:26 AM
This is a fascinating question that comes up all the time. I think the following analogy helps.

First, let's assume that the question is what size sample is sufficient to determine our pick's success rate (win or lose).

If you use 10 races, there are 1024 possible win/lose outcomes for your picks (2^10). Now let's say in your latest test you won 7 and lost 3. There are 120 ways to land exactly 7 wins in 10 races (10 choose 7), and 176 ways to do at least that well (7, 8, 9, or 10 wins). Does this mean you can accept this calculation (7 out of 10) as a predictor of your future picks? More math: 176 (ways blind luck does at least that well) / 1024 (total possible ways) is about 17%. This is your false alarm rate.


Since the question is what sample size is adequate, I am not going to go deeper into the math. Suffice to say, if you can accept a false alarm rate that high (and most numbers guys can't), then you are in business. The correct question, then, is: what risk am I willing to take on my methodology being bogus?

Two things should be obvious, though: first, the higher your win%, the lower your false alarm rate; and second, the larger your sample size, the lower your false alarm rate.
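
If you want to play with this yourself, here is a quick Python sketch of the same idea. The baseline win rates in the example are just assumptions for illustration, not anything measured:

```python
# Chance that pure luck would produce a record at least as good as the one
# observed, assuming each pick independently wins with baseline probability p.
from math import comb

def false_alarm_rate(wins: int, races: int, p: float = 0.5) -> float:
    """Probability of seeing `wins` or more winners in `races` tries by luck alone."""
    return sum(comb(races, k) * p**k * (1 - p)**(races - k)
               for k in range(wins, races + 1))

if __name__ == "__main__":
    # 7 wins out of 10 against a coin-flip baseline: about 17% by luck alone.
    print(f"{false_alarm_rate(7, 10):.1%}")
    # If the baseline is "just bet the favorite" (roughly a 1-in-3 winner),
    # the same 7-for-10 looks much harder to get by accident.
    print(f"{false_alarm_rate(7, 10, p=0.33):.1%}")
    # And the same 70% hit rate over 100 races is nearly impossible by luck.
    print(f"{false_alarm_rate(70, 100, p=0.33):.2e}")
```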

Boxcar
05-15-2001, 11:40 AM
Larry, isn't there an important component missing from the "false alarm" equation -- namely price? It's very feasible to have a "system" that historically has recorded low hit rates but large mutuels and, therefore, still have been very profitable over time.

Boxcar

Larry Hamilton
05-15-2001, 11:50 AM
Truthfully, I don't know how to include price in the false alarm rate. The relationship I set up above was just to determine the likelihood that my picks will not conform to my methodology.

You could build a separate "truth table" to determine what incomes are acceptable against costs, but I must have slept through the class about dynamic multiple variables used as input to the false alarm model hehehehe
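
The closest thing I can offer is brute force: keep each bet's actual payoff and resample them. Here is a rough Python sketch with made-up mutuels, just to show the mechanics, not a claim that this is the "right" model:

```python
# Fold price into the "false alarm" idea by resampling the observed per-bet
# returns (which carry the mutuel prices) and seeing how often a sample of
# the same size fails to show a profit. All payoffs below are invented.
import random

def bootstrap_false_alarm(net_returns, trials=10_000, seed=1):
    """Fraction of bootstrap resamples whose average return is <= 0."""
    rng = random.Random(seed)
    n = len(net_returns)
    busts = 0
    for _ in range(trials):
        sample = [rng.choice(net_returns) for _ in range(n)]
        if sum(sample) / n <= 0:
            busts += 1
    return busts / trials

if __name__ == "__main__":
    # Hypothetical results of 20 flat $2 win bets: mostly losses (-2) plus a
    # few wins at decent prices (payoff minus the $2 stake).
    returns = [-2] * 15 + [10.40, 6.80, 14.20, 5.00, 9.60]
    roi = sum(returns) / (2 * len(returns))
    print(f"ROI: {roi:+.1%}")
    print(f"Chance a sample this size shows no profit: {bootstrap_false_alarm(returns):.1%}")
```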

NNMan
05-15-2001, 12:47 PM
Don't know if you were referring to me, but I did mention impact values in one of my previous posts, so I'll define it here.

An impact value is a statistical measure used to
determine if a chosen factor has any useful significance
within the context it is used. It is best explained with
an example.

Trainer A has an overall win% of 18% and a win% for six furlong claiming sprints of 24%. This increases to 28% when Jockey B is up. So, the impact value of claiming races run at 6 furlongs for Trainer A =

24/18 = 1.33, meaning Trainer A is 33% more likely to win an F6CLM race than his overall average would suggest. It is a strength area for this trainer.

and when Jockey B is up, this increases to =

28/18 = 1.56, the IV for F6CLM w/Jockey B for Trainer A

You could also compute the impact value of Jockey B
in 6 furlong claiming sprints for this trainer:

28/24 = 1.17, the IV for Jockey B in Trainer A's F6CLM races

The IV can be used to uncover hidden patterns of significance, and insignificance, within your data.
An IV > 1 is a positive indicator for the handicapping factor; an IV < 1 is a negative indicator and can be just as useful in evaluating factors. The caveat here is that your data sample size must be large enough to support meaningful conclusions. The small-sample pitfall is the one most commonly encountered when performing this type of analysis and should be strictly avoided. You could use the "false alarm" computation mentioned above to avoid it.
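
If it helps, here is the same arithmetic as a small Python sketch. The start counts are invented just to reproduce the Trainer A numbers above:

```python
# Impact value as described above: win% for a subset of races divided by the
# overall win%. Values above 1 flag a strength, below 1 a weakness.
def impact_value(subset_wins, subset_starts, overall_wins, overall_starts):
    """IV = (subset win%) / (overall win%)."""
    if subset_starts == 0 or overall_wins == 0:
        return None  # not enough data to say anything
    return (subset_wins / subset_starts) / (overall_wins / overall_starts)

if __name__ == "__main__":
    # Trainer A: 18% overall, 24% in 6f claimers, 28% in 6f claimers w/ Jockey B
    # (all rates expressed here as wins per 100 hypothetical starts).
    print(round(impact_value(24, 100, 18, 100), 2))   # 1.33  F6CLM vs overall
    print(round(impact_value(28, 100, 18, 100), 2))   # 1.56  F6CLM + Jockey B vs overall
    print(round(impact_value(28, 100, 24, 100), 2))   # 1.17  Jockey B within F6CLM
```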

IV analysis is very limited, however, because it does not allow analysis of multiple factors at one time and therefore cannot account for the nonlinear interdependencies that are inherent in a lot of handicapping factors.

Cheers,

Dave Schwartz
05-15-2001, 12:59 PM
Not to mention the fact that the top jockey MIGHT have had his mounts in small fields, further skewing the results.

Que
05-15-2001, 02:28 PM
tilson,

Welcome to the board. With regard to your question about sample size... I'm afraid there is no definitive answer. When I was doing extensive modeling in the financial industry this was a typical question, i.e. how many time bars do you need to identify a trend? Unfortunately there's no definitive answer. Obviously, the more data you use, the more accurate your answer (for the now-historical data). The paradox is that the higher your confidence in the trend, the more likely the trend is already over.

Horse racing is similar. For example, how many races do you need to identify a track bias... one, two, thirty, or three thousand? How many races do you need to determine whether the ROI on Bob Baffert's horses improves with first-time Lasix? Larry summed up the answer in an earlier post--it all depends on how much risk you're willing to accept--with one caveat: in some cases, larger samples are not always better than smaller ones.
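
To put some rough numbers on tilson's sample size question, here is a quick Python sketch of how wide the error bars on a measured win% still are at various sample sizes. It is just the normal approximation for a binomial proportion, nothing handicapping-specific, and the 25% hit rate is an assumption for illustration:

```python
# Back-of-the-envelope 95% margin of error on an observed win percentage.
from math import sqrt

def win_pct_margin(win_rate: float, races: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error on an observed win percentage."""
    return z * sqrt(win_rate * (1 - win_rate) / races)

if __name__ == "__main__":
    for n in (50, 200, 300, 1000, 3000):
        m = win_pct_margin(0.25, n)
        print(f"{n:>5} races: 25% +/- {m:.1%}")
    # At 200-300 races the interval is still several points wide, which is why
    # two samples of that size can look "cyclically" similar on win% yet still
    # differ a lot on ROI.
```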

Que.

Rick Ransom
05-15-2001, 03:06 PM
There are so many problems in analyzing this kind of situation that we could discuss it for 100 years and never come up with "the truth". I mentioned some of the problems in the money management thread.

Answering a question like this always involves making some assumptions. For instance, was the sample selected at random, or was it the best you found out of 100 things that you tested? Is the win percentage really likely to stay the same in a different sample when tested at different tracks, at a different meet of the same track, under different weather conditions, and so forth?

I haven't even mentioned how prices will vary over time. Sometimes they will be depressed by the popularity of an approach. Worse yet, your approach may go through an unlucky period and you may abandon it just before that huge longshot hits. That's not entirely superstition, because the price is likely to increase as people abandon the method.

For those of you who know statistics, this is called a nonstationary process, and it drives economists crazy.

As a practical matter, the only thing you can do is assume that things won't be as good in the future as they were in the past (regression to the mean). In my experience, if you are playing systems (spot plays with a fixed set of rules), they may work much better at some locations than others. I used to play an early speed system that was great at about 1/3 of the tracks but bad elsewhere. Didn't matter to me at the time because I was betting in Nevada race books and could check all of the tracks in the country without even buying a Racing Form. Another system I know about works extremely well at California tracks, but would have to be changed to work at east coast tracks because of differences in the value of recency.

Most of the tests I've seen done on systems used huge samples from all of the tracks in the country, and the results were nearly always disappointing. I think this is why some people say that all systems are losers. Some of the systems most likely to survive depend on things that can't be found in the past performances, and they certainly would never have been sold or submitted for testing to someone with a huge database who might later (and did) start selling selections instead of doing research.

I use a technique that compares how well my method predicts finish position against the actual odds. When I'm reasonably confident that I have something worthwhile, I start a small bankroll and play the system/method with it. If I'm right, the bankroll grows and I bet more with won money. If I'm wrong, I lose very little.
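
For what it's worth, the trial-bankroll part looks something like this in Python. The bet results and the 5% staking fraction are made up for the example, not a recommendation:

```python
# A small "prove it with a trial bankroll" sketch: bet a fixed fraction of the
# current bankroll each race, so a bad method costs little while a good one
# compounds won money.
def run_trial_bankroll(net_returns_per_dollar, start=100.0, fraction=0.05, floor=25.0):
    """Bet `fraction` of the bankroll each race; stop if we fall below `floor`."""
    bank = start
    for r in net_returns_per_dollar:  # r = net return per $1 staked (-1 = lost bet)
        stake = bank * fraction
        bank += stake * r
        if bank < floor:
            break
    return bank

if __name__ == "__main__":
    # Hypothetical sequence: mostly losers with occasional 4-1 and 8-1 winners.
    results = [-1, -1, -1, 4, -1, -1, -1, -1, 8, -1, -1, 4, -1, -1, -1]
    print(f"Bankroll after trial: ${run_trial_bankroll(results):.2f}")
```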

Rick Ransom
05-15-2001, 03:42 PM
By the way, sometimes you win money on a system for reasons you don't know about, or don't want to know about. When I was in Northern California I played a system that showed most of its profits when a certain jockey was riding a non-favorite. If he was on a favorite, forget about it, no chance for the horse to win. Some of you may remember them taking Northern California races off the board for a while in Vegas and investigating this particular jockey. He later mysteriously disappeared into the bay and was found dead some time later. Someone I met in a race book once claimed to know all of the details. I just told him I didn't want to know.

Some friends and I played a lot of claimed horses at New York tracks and did very well. In going over the results, we probably could have called it the "Bet on Oscar Barrera to Perform Miracles With Claimed Horses" system. You old timers know what I'm talking about here.

Tom
05-15-2001, 06:14 PM
Sometimes you will uncover short-term patterns that are profitable but then go away. You have to bet them while they last and get off them when the data shows they are cold. I think Mark Cramer referred to this as eclectic handicapping, or some such thing. We have trainers at FL that one year are 30% with layoff horses and the next year they are "oh-fers."
Some years FE horses and TAM horses are throw-outs and some years they are automatics.
My rule of thumb: once I see something happen twice, start looking for it again. If you wait for all the data to come in, you might be too late to cash in.
Tom

Rick Ransom
05-15-2001, 07:21 PM
Tom,

That seems to be especially true when analyzing connections rather than just the horse itself. Horses don't care if they make an extra buck, but people do. It's not all just winning bets either. The claiming business is all about deceiving the other owners.