Bootstrapping


mwilding1981
10-02-2009, 07:16 AM
Could I please get some advice on the best way of implementing bootstrapping? I am using the tennis tournament approach outlined by GT in another post and am looking to find the correct weights for the final power figure by bootstrapping it against an unseen set of data. At the moment I take a random selection of data from the training set, adjust the weights against it, and then see how the result compares on a random selection of data from the validation set. Is this the correct way to be doing it, or should I be using one data set and taking random samples from it for both training and validation? I am new to creating my own bootstrapping method, so please answer in baby talk :) What would you suggest is a good way of looking at the training and validation data and deciding whether the model is becoming overfitted or is optimal? Any good starting bootstrap methods or articles would be very helpful, thanks.

mwilding1981
10-02-2009, 07:28 AM
Forgot to mention that at the moment I am testing the probabilities against the actual number of horses that win, e.g. what is the strike rate of horses given a .10 probability? The aim is to get it as close to .10 as possible without overfitting. I am not happy that this is the best way, however, because with small samples there are not enough horses in each odds range to make reliable assessments. Is there a better way that you would recommend?
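
For anyone following along, the check described above can be sketched roughly as below. This is only an illustration under assumptions, not the poster's actual code: the function names, the bin count and the made-up data are all hypothetical. It groups runners into a few wide probability bins and compares the mean predicted probability with the observed strike rate in each. It also computes a Brier score (the mean squared difference between the predicted probability and the 0/1 result), which is a different measure from the strike-rate table; it pools every runner into one number, so it does not depend on having enough horses in each range.

```python
import random

def calibration_table(probs, wins, n_bins=5):
    """Group runners by predicted probability and compare the mean
    prediction in each bin with the observed strike rate."""
    bins = [[] for _ in range(n_bins)]
    for p, w in zip(probs, wins):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, w))
    rows = []
    for b in bins:
        if not b:
            continue  # skip probability ranges with no runners
        mean_p = sum(p for p, _ in b) / len(b)
        strike = sum(w for _, w in b) / len(b)
        rows.append((len(b), mean_p, strike))
    return rows

def brier_score(probs, wins):
    """Mean squared error between predicted probability and the 0/1 result.
    Lower is better; it uses every runner at once."""
    return sum((p - w) ** 2 for p, w in zip(probs, wins)) / len(probs)

# Made-up data purely for illustration: each runner wins with exactly
# its predicted probability, so the table should look well calibrated.
rng = random.Random(0)
probs = [rng.uniform(0.02, 0.5) for _ in range(2000)]
wins = [1 if rng.random() < p else 0 for p in probs]

for n, mean_p, strike in calibration_table(probs, wins):
    print(f"runners={n:5d}  predicted={mean_p:.3f}  strike rate={strike:.3f}")
print(f"Brier score: {brier_score(probs, wins):.4f}")
```

With small samples, using fewer and wider bins (a smaller `n_bins`) trades detail for more runners per bin, while the Brier score gives a single figure that can be compared between the training and validation data.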

Ray2000
10-02-2009, 11:41 AM
This paper was developed using harness racing data from Finland, but the statistical analysis (and bootstrapping) can be applied to other databases.

At least, it is somewhat readable.


http://joypub.joensuu.fi/publications/other_publications/suhonen_market/suhonen.pdf


Wikipedia also has useful info on bootstrapping.
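
As a rough illustration of the usual bootstrap scheme on a single data set, here is a minimal sketch. It assumes the standard approach of sampling rows with replacement for training and validating on the rows that were never drawn (the "out-of-bag" rows). The names `bootstrap_split`, `bootstrap_scores`, `fit` and `score` are hypothetical, and the toy model (one overall win rate scored by mean squared error) merely stands in for whatever weighting method is actually being tuned.

```python
import random

def bootstrap_split(n_rows, rng):
    """One bootstrap draw: sample row indices with replacement for training;
    rows that were never drawn (out-of-bag) are used for validation."""
    train = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(train)
    oob = [i for i in range(n_rows) if i not in drawn]
    return train, oob

def bootstrap_scores(rows, fit, score, n_rounds=200, seed=0):
    """Repeat the draw n_rounds times, returning (train_score, oob_score)
    pairs. `fit` and `score` are hypothetical user-supplied callables."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rounds):
        train_idx, oob_idx = bootstrap_split(len(rows), rng)
        if not oob_idx:
            continue  # very unlikely unless the data set is tiny
        train_rows = [rows[i] for i in train_idx]
        oob_rows = [rows[i] for i in oob_idx]
        model = fit(train_rows)
        results.append((score(model, train_rows), score(model, oob_rows)))
    return results

# Toy stand-in model: rows are 0/1 win flags, the "model" is the overall
# win rate, and the score is mean squared error (lower is better).
def fit(rows):
    return sum(rows) / len(rows)

def score(model, rows):
    return sum((model - w) ** 2 for w in rows) / len(rows)

data_rng = random.Random(1)
rows = [1 if data_rng.random() < 0.1 else 0 for _ in range(500)]

results = bootstrap_scores(rows, fit, score)
mean_train = sum(t for t, _ in results) / len(results)
mean_oob = sum(o for _, o in results) / len(results)
print(f"mean training score: {mean_train:.4f}")
print(f"mean out-of-bag score: {mean_oob:.4f}")
```

One common way to read the output: if the training score keeps improving as the weights are tuned while the out-of-bag score stalls or gets worse, that widening gap suggests overfitting.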