Help with impact values


CBedo
09-02-2006, 02:01 AM
After having one of the most bizarre few days of handicapping (and the most profitable), I went out this evening and had a few cocktails. So now I am back at home, buzzed and thinking about the world in terms of impact values, lol.

Let's say in a sample of nine races, there are 48 entries. Of those 48, 24 are black and 24 are brown. But brown wins 6 of the 9 races. So brown has 2/3 of the winners and 1/2 of the runners. Is the impact value for being brown (2/3)/(1/2), or 1.33? Just trying to make sure I'm thinking about this right.

Thanks for putting up with me,

CBedo
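
The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch of the pooled, Davis/Quirin-style calculation as described in this thread (share of winners divided by share of starters); the function name `impact_value` is made up for illustration.

```python
from fractions import Fraction


def impact_value(group_winners, total_winners, group_starters, total_starters):
    """Pooled impact value: share of winners divided by share of starters."""
    winner_share = Fraction(group_winners, total_winners)
    starter_share = Fraction(group_starters, total_starters)
    return winner_share / starter_share


# Numbers from the example: 9 races, 48 entries, 24 brown and 24 black,
# brown wins 6 races, so black wins the other 3.
brown_iv = impact_value(6, 9, 24, 48)
black_iv = impact_value(3, 9, 24, 48)

print(brown_iv, float(brown_iv))  # 4/3, about 1.33
print(black_iv, float(black_iv))  # 2/3, about 0.67
```

Exact fractions are used so the 1.33 shows up as 4/3 rather than a rounded decimal.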

Overlay
09-02-2006, 02:13 AM
Looks right to me, at least according to the way Davis and Quirin calculated IV. (I believe Nunamaker used a weighted approach that considered variations in individual race field sizes, and that makes the IV's come out slightly different than those arrived at using Davis'/Quirin's method. His reasoning for doing it made sense, but I've never been able to figure out how you can go back and reconstruct or verify values like that if all you have to work with is aggregate group data.) And then there's also the A/E approach, which takes the odds of the horses into consideration, and looks at whether horses with a particular characteristic actually performed better than expected, given their odds.
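
For anyone who wants to see the A/E idea in numbers, here is a rough sketch. It assumes odds are quoted as X-to-1 and uses the raw implied probability 1/(odds + 1) with no adjustment for track take, which is a simplification. The sample data and the function names are hypothetical.

```python
def odds_to_probability(odds_to_one):
    """Raw implied win probability from odds quoted as X-to-1 (take not removed)."""
    return 1.0 / (odds_to_one + 1.0)


def actual_vs_expected(runners):
    """runners: list of (odds_to_one, won) pairs for horses that have the factor."""
    actual = sum(1 for _, won in runners if won)
    expected = sum(odds_to_probability(odds) for odds, _ in runners)
    return actual / expected


# Hypothetical sample of six horses sharing some characteristic.
sample = [
    (1.0, True),
    (3.0, False),
    (3.0, True),
    (4.0, False),
    (9.0, False),
    (9.0, False),
]

ae = actual_vs_expected(sample)
print(round(ae, 2))  # 2 actual winners against about 1.4 expected
```

A value above 1.0 would suggest the horses won more often than their odds implied, subject to the usual sample-size caveats.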

CBedo
09-02-2006, 01:17 PM
Where can I find some info on Nunamaker's numbers? What you refer to is part of what I was thinking about last night (thinking slowly today).

In the example, you might think that with its 1.33 impact value, brown has an advantage over black, which has an impact value less than one. But what if you then saw that of the nine races, eight had four runners each--one black and three brown? In these eight races, black won two and brown won six--exactly what you would expect. Then, in the ninth race, there was a 16 horse field, all black!

So, in reality, brown and black won exactly the proportion of races they should have, but the impact values make brown look more attractive.

Just thinking,

CBedo
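
The scenario above can be worked through race by race. The sketch below sums each colour's "fair share" per race (starters of that colour divided by field size) and compares it with actual winners. This is one possible way to account for field size, written as an illustration; it is not claimed to be the weighting Nunamaker used, and `field_adjusted_iv` is a made-up name.

```python
from fractions import Fraction

# Each race: (brown starters, black starters, colour of the winner).
# Eight 4-horse races (3 brown, 1 black) plus one 16-horse all-black race.
races = [(3, 1, "brown")] * 6 + [(3, 1, "black")] * 2 + [(0, 16, "black")]


def field_adjusted_iv(races, colour):
    """Actual winners divided by the sum of per-race expected winners."""
    actual = sum(1 for _, _, winner in races if winner == colour)
    expected = Fraction(0)
    for brown, black, _ in races:
        starters = brown if colour == "brown" else black
        expected += Fraction(starters, brown + black)
    return actual / expected


print(field_adjusted_iv(races, "brown"))  # 1
print(field_adjusted_iv(races, "black"))  # 1
```

On the same data the pooled calculation gives brown 1.33 and black 0.67, so the whole gap comes from the one large all-black field.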

Overlay
09-02-2006, 01:43 PM
I bought Mike Nunamaker's Modern Impact Values in 1995. He didn't say anything there about it, but when I was working with the values, I cross-checked some of them by figuring Quirin-style IV's from Nunamaker's raw statistics on numbers of horses and numbers of winners, and comparing them against the IV that Nunamaker listed. Almost none of the values came out exactly the same, so I wrote Nunamaker a separate letter asking about it. He must have gotten that question a lot, because he sent me a preprinted sheet discussing that issue, and explaining that the impact values in his book were different because of his weighting the values based on individual race field size. As I said, I couldn't figure any way to reconstitute Quirin's data from Winning at the Races (so that I wouldn't be comparing apples to oranges), since I didn't have visibility of the field sizes of the races in Quirin's samples. So, before working with Nunamaker's values, I recalculated them using the Davis/Quirin method. The individual factor differences were usually not that large, but they might have significantly affected the final value when considering a number of different factors.

PlanB
09-02-2006, 01:55 PM
Why bother with Impact Values? What do you do with them? I say that
for 2 main reasons: #1, What is the distribution of IVs? #2, How do you
control for randomness in IV values?

ryesteve
09-02-2006, 02:17 PM
#2, How do you control for randomness in IV values?
I'll take this one...
"Large samples"
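
To put a rough number on "large samples", the sketch below puts an approximate 95% interval around an IV using the normal approximation to a binomial proportion. Assumptions: the group's share of starters is treated as fixed, each race is an independent trial, and the approximation is crude for very small samples. The 900-race sample is hypothetical (the nine-race example scaled up), and `iv_interval` is a made-up name.

```python
import math


def iv_interval(group_winners, total_races, starter_share, z=1.96):
    """Approximate interval for an IV from the group's share of winners."""
    win_share = group_winners / total_races
    std_err = math.sqrt(win_share * (1.0 - win_share) / total_races)
    low = max(0.0, win_share - z * std_err) / starter_share
    high = min(1.0, win_share + z * std_err) / starter_share
    return low, high


small = iv_interval(6, 9, 0.5)      # the nine-race example: IV 1.33
large = iv_interval(600, 900, 0.5)  # same ratios, 100 times the races

print([round(x, 2) for x in small])
print([round(x, 2) for x in large])
```

With nine races the interval comfortably includes 1.0, so the 1.33 cannot be told apart from no effect; with 900 races at the same ratios it does not.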

Overlay
09-02-2006, 04:19 PM
The distribution of impact values for a given factor generally mirrors the winning percentages associated with the varying degrees of the factor, but reveals the true power of each degree in a way that raw percentages don't by showing to what extent the performance of each degree exceeds or falls short of its "fair share" of victories (regardless of how high or low the winning percentage may be as a stand-alone number).

garyoz
09-02-2006, 04:37 PM
Why bother with Impact Values? What do you do with them? I say that
for 2 main reasons: #1, What is the distribution of IVs? #2, How do you
control for randomness in IV values?

An additional problem is that you can't add them together even if they are valid and reliable. For example, if a horse has measured impact values of 1.33 for a "speed" variable and 1.20 for a "class" variable, you can't combine them, because they more than likely measure the same thing. Classier races have faster times than those of a lesser class. In effect, you'd be double counting. This is one of the factors that has driven the development of "power numbers," which I personally think just point to underlays.
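
A small made-up example shows the double counting. All counts below are hypothetical and chosen so that the "speed" and "class" factors overlap heavily; the IV for horses having both factors is then compared with the product of the two single-factor IVs.

```python
# Hypothetical counts keyed by (has speed factor, has class factor):
# each value is (starters, winners).
counts = {
    (True, True): (200, 50),
    (True, False): (100, 15),
    (False, True): (100, 15),
    (False, False): (600, 45),
}

total_starters = sum(s for s, _ in counts.values())
total_winners = sum(w for _, w in counts.values())


def iv(selector):
    """Pooled impact value for the cells picked out by selector(key)."""
    starters = sum(s for key, (s, _) in counts.items() if selector(key))
    winners = sum(w for key, (_, w) in counts.items() if selector(key))
    return (winners / total_winners) / (starters / total_starters)


speed_iv = iv(lambda key: key[0])
class_iv = iv(lambda key: key[1])
both_iv = iv(lambda key: key[0] and key[1])

print(round(speed_iv, 2), round(class_iv, 2))  # about 1.73 each
print(round(speed_iv * class_iv, 2))           # about 3.0 if naively multiplied
print(round(both_iv, 2))                       # 2.0 measured directly
```

In this invented data the naive product overstates the combined edge by roughly half, because most horses with one factor also have the other.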

Red Knave
09-02-2006, 04:58 PM
Where can I find some info on Nunamaker's numbers?

You could PM him. He posts here.

Overlay
09-02-2006, 07:05 PM
An additional problem is that you can't add them together even if they are valid and reliable. For example, if a horse has measured impact values of 1.33 for a "speed" variable and 1.20 for a "class" variable, you can't combine them, because they more than likely measure the same thing. Classier races have faster times than those of a lesser class. In effect, you'd be double counting. This is one of the factors that has driven the development of "power numbers," which I personally think just point to underlays.

You're right about the dangers of double counting. But what about the utility of factors where the upper and/or lower ends of the impact value range are sufficiently high or low that they qualify as independent variables that function free of influence from other elements? And as for power numbers and similar multi-variable measures, couldn't you also calculate impact values for them, which, in turn, would demonstrate that it was possible to meaningfully combine impact values from diverse factors, while also providing a tool for judging when a horse was underlaid or overlaid, based on those calculated values or percentages?

PlanB
09-02-2006, 07:36 PM
Overlay, anyone who could do all that could cap w/out Impact Values.
That stat is a total dead end; the dependence factor can be easily
dealt with, with a simple chi square that figs the ~correction, but still
the question remains: what do you do with the IVs? And I'll give you
4:1 that IVs vary wildly from sample to sample. If you cannot specify the
shape & algorithm of its distribution, stay away. (this is my last post tonight)
Good night.
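
One possible reading of the chi-square suggestion is a test of whether two factors are independent among starters. The sketch below computes the Pearson chi-square statistic for a 2x2 table by hand (no continuity correction) and compares it with 3.841, the 5% critical value for one degree of freedom. The counts are hypothetical, and this is not necessarily the correction the post had in mind.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical starters. Rows: speed factor yes/no. Columns: class factor yes/no.
statistic = chi_square_2x2(200, 100, 100, 600)

CRITICAL_5_PERCENT_DF1 = 3.841
print(round(statistic, 1))
print(statistic > CRITICAL_5_PERCENT_DF1)  # True means independence is rejected
```

A large statistic only says the two factors are related; it does not by itself say how to combine their IVs.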

robert99
09-02-2006, 07:40 PM
IVs may be most useful to give a quick guide and understanding as to what factors may be most relevant. The low value or "negative" IVs are equally as important as the high ones. You could combine factors for IVs but there are a huge number of combinations out there and would it be combining two factors, five, ten and which 2, 5 or 10?

What if single IVs were, say, 1.5, 0.3 and 0.9 for 3 different characteristics of a horse? Is the 1.5 sufficient to outweigh the poor figures, or are the poor figures saying that this horse is one of the 1.5 IV horses that is not destined to be a winner?

IVs are not rigorous data to base probability odds upon and I think that here A/E is the better way to go as the horse expectation (a summation of all its relevant race characteristics) and field size are all taken into account.

Horse racing is full of non-independent data and it is a critical analysis subject which probably deserves its own thread.

garyoz
09-02-2006, 08:19 PM
You're right about the dangers of double counting. But what about the utility of factors where the upper and/or lower ends of the impact value range are sufficiently high or low that they qualify as independent variables that function free of influence from other elements? And as for power numbers and similar multi-variable measures, couldn't you also calculate impact values for them, which, in turn, would demonstrate that it was possible to meaningfully combine impact values from diverse factors, while also providing a tool for judging when a horse was underlaid or overlaid, based on those calculated values or percentages?

IMHO you are correct on all of the above. I was taught in grad school to do a correlation matrix and if two variables had a correlation greater than .70 then to not use them in a linear equation or try to combine them in some type of factor analysis (aka a power number). Racing variables are highly correlated.
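
The correlation-matrix screen described above might look something like this in plain Python. The ratings are invented for illustration, the factor names are hypothetical, and the 0.70 cutoff is simply the rule of thumb quoted in the post.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings for six horses.
factors = {
    "speed": [82, 75, 90, 68, 88, 71],
    "class": [80, 72, 91, 70, 85, 74],
    "days_off": [21, 45, 14, 30, 60, 28],
}

THRESHOLD = 0.70
names = list(factors)

# Collect every pair whose absolute correlation exceeds the cutoff.
flagged = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if abs(pearson(factors[a], factors[b])) > THRESHOLD
]

print(flagged)  # pairs too correlated to use side by side
```

In this made-up data only the speed/class pair is flagged, which matches the point that such variables tend to measure much the same thing.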

If you wanted to try to develop some power numbers or combination variables, you could try cluster analysis or factor analysis. They have their own internal tests for significance, internal validity etc. Personally I haven't used those statistics for years, but SPSS is pretty turnkey. Theoretically you could build models that reflect probability of winning by using those variables that you built and regress them against a probit or logistic function. I really don't want to get back into the logistic regression discussion--it has been over discussed on this board--do a term search--you'll see. In the end I don't think it would be worth the time.