View Full Version : Regression Analysis


Dave Schwartz
01-08-2007, 01:07 PM
If I have two tables of rankings and I want to know which has the "smoother fit" what regression approach should I use?

Or is that even the right approach?


Dave
(Who has less interest in statistics and more in how to make a cyber cockroach.)

Greyfox
01-08-2007, 01:43 PM
I'm not a statistician, but if memory serves me right,
a statistician might ask:
1. Are you ranking different variables or one variable? (i.e., in each list of rankings, are there big apples, medium apples, and small apples, versus big apples, big oranges, big pears, medium apples, etc.?)
2. What are you trying to "fit" your predictors to?

Dave Schwartz
01-08-2007, 02:17 PM
LOL

You guys sure know how to complicate a simple question (but thank you for taking the time to reply at all).

And it is amazing what one can find in a Google search. While awaiting a simple answer I found what I was looking for and wrote the code to implement it.

I have an IV Table based upon rankings. I simply want to smooth the line because I think it will produce a better answer than the unsmoothed.



Dave

garyoz
01-08-2007, 02:23 PM
Dave, regression analysis theoretically requires ratio level data. You have ordinal (or maybe interval--depending on your assumptions of distribution of your rankings) level data (hierarchy of data levels: nominal, ordinal, interval, ratio). Ratio level requires a zero point. Can you use raw figures in the analysis?

Regression is really just an exercise in matrix algebra. One solution to using categorical level data is to use dummy variables (1 = yes, variable present; 0 = no, variable not present). But it works only for dichotomous categories (gender: 1 = male, 0 = not male, etc.). The 0 value drops the "not" category out of the model, and you can measure the effect and significance of one value of the dichotomous variable.

Also, your dependent variable would need to be the probability of winning, modeled with a logistic regression function (an S-shaped curve).
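As a rough illustration of the S-shaped curve mentioned above, here is a minimal Python sketch. The `win_probability` helper and its `intercept` and `slope` values are made up for illustration only; they are not fitted to any racing data.

```python
import math


def logistic(z):
    """Standard logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def win_probability(rank, intercept=1.0, slope=-0.5):
    """Hypothetical model: probability of winning as a function of rank.

    The intercept and slope are illustrative assumptions, not fitted values.
    In practice they would be estimated from past results.
    """
    return logistic(intercept + slope * rank)


if __name__ == "__main__":
    for rank in range(1, 6):
        print(rank, round(win_probability(rank), 3))
```

Whatever the coefficients, the output always stays between 0 and 1, which is why this form is commonly used when the dependent variable is a probability.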

Glad you found a simple solution

formula_2002
01-08-2007, 02:58 PM
It's a good question, one I have often tried to answer by using empirical data. I use incremental odds analysis wrt rank1...n, incremental odds analysis wrt rank2...n, and then incremental odds of all rank combinations, where the yield in all cases is a flat-bet ROI and actual wins/expected wins.
It's grunt work, but with the tools we have today (computer programming) it's do-able for more of us.

Joe M

formula_2002
01-08-2007, 02:59 PM
Dave send the google site along to me. I want to see what makes you happy. ;)

ryesteve
01-08-2007, 03:12 PM
I have an IV Table based upon rankings. I simply want to smooth the line because I think it will produce a better answer than the unsmoothed.
Ah, that makes sense. From your original post, I had a hard time grasping what exactly you were looking for.

Dave Schwartz
01-08-2007, 04:07 PM
Joe,

First, an apology. One does not ask for help and then criticize the guy who asked for more information before providing the help. That was unkind of me.


As for the Google site... I was actually looking for some simple code to do what I desired... a simple R^2 regression to smooth the values in a table. The code was so simple (about 20 lines) that I eventually found it in something I wrote about 25 years ago. (Yes, I still have some of that.)
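For anyone curious what such a routine might look like, here is a minimal sketch in Python. It is not the original code, just an ordinary least-squares line fit with R^2, and the table values are made up. Comparing R^2 across two tables is one simple way to judge which has the "smoother fit", assuming a straight line is the right shape.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def smooth(xs, ys):
    """Replace each table value with the value on the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [intercept + slope * x for x in xs]


if __name__ == "__main__":
    ranks = [1, 2, 3, 4, 5]
    values = [1.9, 1.2, 1.3, 0.8, 0.5]  # made-up table values per rank
    slope, intercept = fit_line(ranks, values)
    print("R^2:", round(r_squared(ranks, values, slope, intercept), 3))
    print("smoothed:", [round(v, 2) for v in smooth(ranks, values)])
```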

Dave

skate
01-08-2007, 04:20 PM
I'm certain you are sure of your request.

I would think that the 'Smooth Values' would negate a distinction. I'm not saying you should not do what you do, but I ask myself 'why'?

PlanB
01-08-2007, 07:34 PM
Okay, maybe this post brings you geek-mathers head to head with data. Should rankings be revisited? It sounds tough (smug?) to think you can SCORE abilities on some point scale. I don't really know, but what if you set certain standards and just rank each horse on each standard? Try to determine some set number of standards, maybe 5-6, and if a horse ranks very low in "X" number of them, then it's out. Hehe, you fill in the details, I'll pay for your service.

ryesteve
01-08-2007, 08:04 PM
i would think that the 'Smooth Values' would negate a distinction. i'm not saying you should not do what you do, but i ask myself 'why'?
By "smoothing", he means to make the progression from high to low more uniform, not artificially flatten the slope.

singunner
01-08-2007, 08:58 PM
While scoring subjective abilities is, well, subjective, there are any number of factors in a horse race that are objective (at least, that's what we're led to believe): finishing time, length of race, lengths beaten, position at various calls, jockey, trainer, etc. If there aren't already programs that attempt to compare all these objective criteria, I'd be surprised.

prank
02-03-2007, 05:17 AM
Okay, maybe this post brings you geek-mathers head to head with data. Should rankings be revisited? It sounds tough (smug?) to think you can SCORE abilities on some point scale. I don't really know, but what if you set certain standards and just rank each horse on each standard? Try to determine some set number of standards, maybe 5-6, and if a horse ranks very low in "X" number of them, then it's out. Hehe, you fill in the details, I'll pay for your service.

While using just ordinal rankings is somewhat useful, over the last 5 months I've come to the conclusion that converting to quantiles is a far better way to approach many problems. It also makes it clearer what's going on in dependency relations.

For instance, suppose you have a "universe" of 5 horses, and you're ranking them on weight: 1,2,3,4,5, with 1 being the heaviest. Then, you could convert these to quantiles: 1.0, 0.8, 0.6, 0.4, 0.2, using the definition P(X <= x), i.e. the probability that a horse's weight Wt(X) is less than or equal to the weight x. The heaviest horse gets 1.0 because every horse in the universe weighs no more than it does. This way, you incorporate the total size of either the sample or some superset in addition to the ranking of the item being considered.
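A small Python sketch of that conversion, using the P(X <= x) definition directly. The weights are hypothetical and listed heaviest first, so they line up with ranks 1 through 5.

```python
def empirical_quantiles(values):
    """For each value x, return the fraction of values that are <= x."""
    n = len(values)
    return [sum(1 for v in values if v <= x) / n for x in values]


if __name__ == "__main__":
    # Hypothetical weights, heaviest first (rank 1 = heaviest).
    weights = [1150, 1100, 1050, 1020, 980]
    print(empirical_quantiles(weights))  # [1.0, 0.8, 0.6, 0.4, 0.2]
```

With a larger reference population passed in place of the 5-horse universe, the same rank would map to a different quantile, which is the point about incorporating sample size.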

Amazingly, quantile regression only came of age in the last few years, because it tends to require much more computing power than the more common types of regression.

Prank

traynor
02-04-2007, 02:28 AM
The underlying problem with rankings is the assumption that each "gap" between the ranks is equivalently valued. For projected final times, one entry is top at 1:09:4, second is 1:09:3 and third is 1:10:4. Ranking assumes the difference between third and second is the same as that between second and first.

prank
02-04-2007, 08:47 AM
The underlying problem with rankings is the assumption that each "gap" between the ranks is equivalently valued. For projected final times, one entry is top at 1:09:4, second is 1:09:3 and third is 1:10:4. Ranking assumes the difference between third and second is the same as that between second and first.

To be more precise, a *linear function* of rankings has that assumption. Nonlinear functions can be used to get other results. In practice, what gets so many people messed up with rankings is that they only use a linear function, and thus run headlong into the problem you described.

Quantiles are nice because we can examine quantiles of two different data sets and see if they come from a similar distribution, without concern for shift or scale changes. (This is known as a Q-Q plot, where the axes are the actual values, but, for each sample, there is a 1-1 relationship between quantiles and actual values.) There are certainly ways to be dangerous in using quantiles, too, but what can be done...
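Here is a minimal sketch of the numbers behind a Q-Q comparison, using Python's standard `statistics.quantiles` (Python 3.8+). The `qq_pairs` helper and the sample data are made up for illustration. If the two samples come from the same distribution up to a shift and scale, the pairs fall close to a straight line.

```python
import statistics


def qq_pairs(sample_a, sample_b, n=10):
    """Pair up matching quantile cut points of two samples.

    Each pair is (quantile of sample_a, quantile of sample_b) at the
    same probability level; plotting the pairs gives a Q-Q plot.
    """
    qa = statistics.quantiles(sample_a, n=n)
    qb = statistics.quantiles(sample_b, n=n)
    return list(zip(qa, qb))


if __name__ == "__main__":
    # Made-up data: sample_b is a shifted and rescaled copy of sample_a,
    # so the Q-Q pairs should lie on a straight line.
    sample_a = [69.2, 69.8, 70.1, 70.4, 70.9, 71.3, 71.8, 72.5, 73.0, 74.2]
    sample_b = [2.0 * x + 3.0 for x in sample_a]
    for a, b in qq_pairs(sample_a, sample_b, n=4):
        print(round(a, 2), round(b, 2))
```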