|
|
07-06-2014, 12:53 PM
|
#1
|
Registered User
Join Date: Jun 2014
Posts: 79
|
Small Sample Size?
I am wondering: as you tried to develop your own "system", so to speak, or the things you look for inside each race, what did you consider a good enough sample size of races? 100? More?
|
|
|
07-06-2014, 02:58 PM
|
#2
|
broken-down horseplayer
Join Date: Feb 2008
Location: Portland, OR area
Posts: 2,090
|
More is better, but sometimes the numbers aren't there and the results are still worth looking at, especially trainer angles.
There are plenty of posts on this here; search on "sample".
Rather than worry about sample size, perhaps a look at "expected results" is in order:
http://www.hoof.demon.co.uk/archie.html
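A minimal sketch of the "expected results" idea, assuming you already have an expected number of winners for the bets in question (for example, from summed win probabilities). The linked page is not reproduced here, so this is only an assumption about the general approach: a one-degree-of-freedom chi-square comparing actual winners with expected winners. The function names are hypothetical.

```python
import math


def chi_square_vs_expected(runners, winners, expected):
    """Chi-square (1 degree of freedom) for actual vs. expected winners.

    Equivalent to (W-E)^2/E + ((R-W)-(R-E))^2/(R-E), which simplifies to
    R * (W-E)^2 / (E * (R-E)).
    """
    if not 0 < expected < runners:
        raise ValueError("expected winners must lie strictly between 0 and runners")
    return runners * (winners - expected) ** 2 / (expected * (runners - expected))


def chance_probability(chi_sq):
    """Probability of a chi-square value this large or larger by chance (1 df)."""
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Hypothetical numbers: 100 bets, 30 winners, 20 winners expected.
score = chi_square_vs_expected(runners=100, winners=30, expected=20.0)
print(score, chance_probability(score))
```

The larger the score, the less likely the gap between actual and expected winners is down to luck, which is a more direct question than "is my sample big enough?".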
__________________
Playing SRU Downs - home of the "no sweat" inquiries...
Defying the "laws" of statistics with every wager.
|
|
|
07-06-2014, 04:10 PM
|
#3
|
Librocubicularist
Join Date: Jun 2010
Location: Ohio
Posts: 10,466
|
Quote:
Originally Posted by coljesep
I am wondering: as you tried to develop your own "system", so to speak, or the things you look for inside each race, what did you consider a good enough sample size of races? 100? More?
|
My personal opinion is 1000 races or more. When I took statistics in college I was taught that the magic number was 20, but I would not bet money on a sample that small. William L. Scott used a sample of 500 to 600 races in developing his system described in How Will Your Horse Run Today?
__________________
Sapere aude
|
|
|
07-06-2014, 04:59 PM
|
#4
|
Veteran
Join Date: Aug 2005
Posts: 3,428
|
Quote:
Originally Posted by Actor
My personal opinion is 1000 races or more. When I took statistics in college I was taught that the magic number was 20, but I would not bet money on a sample that small. William L. Scott used a sample of 500 to 600 races in developing his system described in How Will Your Horse Run Today?
|
Interesting, I thought it was supposed to be at least 30, e.g., the Dow Jones Industrials. Doesn't the statistical t-test also suggest at least 30?
I think any sample only needs to be representative of the population. Be careful that your sample really is representative.
|
|
|
07-06-2014, 05:25 PM
|
#5
|
The Voice of Reason!
Join Date: Mar 2001
Location: Canandaigua, New York
Posts: 112,810
|
I started on a 3 race sample at Los Al T Breds.
I'm 5 for 9 overall.
If I waited for a 1,000-race sample, it would be next year.
When this stops, I will find another one to play.
Short term is where you make money.
__________________
Who does the Racing Form Detective like in this one?
|
|
|
07-06-2014, 06:16 PM
|
#6
|
Out-of-town Jasper
Join Date: Nov 2009
Posts: 2,364
|
Quote:
Originally Posted by Tom
I started on a 3 race sample at Los Al T Breds.
I'm 5 for 9 overall.
If I waited for a 1,000-race sample, it would be next year.
When this stops, I will find another one to play.
Short term is where you make money.
|
Not so much short term, but being one of the first to use a methodology or angle. If I use a 1,000-race sample to verify profitability before betting, that makes me 1,000 races too late.
__________________
“If you want to outwit the devil, it is extremely important that you don't give him advanced notice.”
~Alan Watts
|
|
|
07-06-2014, 10:09 PM
|
#7
|
Registered User
Join Date: Jan 2006
Posts: 28,546
|
Quote:
Originally Posted by Actor
My personal opinion is 1000 races or more. When I took statistics in college I was taught that the magic number was 20, but I would not bet money on a sample that small. William L. Scott used a sample of 500 to 600 races in developing his system described in How Will Your Horse Run Today?
|
And it still backfired.
__________________
Live to play another day.
|
|
|
07-07-2014, 08:24 AM
|
#8
|
Registered User
Join Date: Nov 2002
Posts: 30,398
|
Quote:
Originally Posted by thaskalos
And it still backfired.
|
Years ago, using a computer running the CP/M operating system and a primitive version of Lotus 1-2-3, I set up a program based on Scott's first book, How Will Your Horse Run Today? Every Saturday I brought my (dot matrix) printouts with me to OTB. I was soon quite annoyed at losing every Saturday after all that work, but worse, I came to know two OTB regulars: a pair of wonderful elderly ladies who would cash tickets quite often using the picks of NYC Daily News public handicapper Russ Harris (chalk heavy).
I would not trust Scott, and I have since gotten to the point of being able to test systems using automatic modeling techniques, including the length of time periods and sample size.
Depending on what factors I or the program chose to model, what track I was playing, and what time of year it was, the ideal sample size or time period of a model was all over the place.
But I have come to the conclusion that very old data often gets stale, and that very short models, covering only a few days, are too small.
__________________
The inmates have taken over the asylum.
Last edited by hcap; 07-07-2014 at 08:26 AM.
|
|
|
07-07-2014, 09:10 AM
|
#9
|
EXCEL with SUPERFECTAS
Join Date: Mar 2004
Posts: 10,206
|
I database by individual track, and keep the most recent 24-30 cards in the database, ideally 240-260 races. I use the database for eliminations only. In shorter meets I will go back to the previous year using cards from the same time of meet and time of year, looking for similar environmental conditions.
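A rough sketch of a per-track rolling window like the one described above, assuming cards are added in date order and the oldest card drops off once the limit is reached. The class and method names are hypothetical, and the 30-card cap is simply the upper end of the 24-30 range.

```python
from collections import defaultdict, deque

MAX_CARDS = 30  # assumption: upper end of the 24-30 card range


class TrackDatabase:
    """Keeps only the most recent cards for each track."""

    def __init__(self, max_cards=MAX_CARDS):
        # One bounded deque per track; appending past maxlen discards the oldest card.
        self._cards = defaultdict(lambda: deque(maxlen=max_cards))

    def add_card(self, track, card_date, races):
        """Add one race card (a list of race records) for a track."""
        self._cards[track].append((card_date, list(races)))

    def card_count(self, track):
        return len(self._cards[track])

    def race_count(self, track):
        return sum(len(races) for _, races in self._cards[track])

    def oldest_date(self, track):
        cards = self._cards[track]
        return cards[0][0] if cards else None
```

With nine races per card, 30 cards gives 270 races, close to the 240-260 target; cards with fewer races land inside it.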
Last edited by raybo; 07-07-2014 at 09:14 AM.
|
|
|
07-07-2014, 09:27 AM
|
#10
|
The Voice of Reason!
Join Date: Mar 2001
Location: Canandaigua, New York
Posts: 112,810
|
At my age, 9 races IS the long run!
__________________
Who does the Racing Form Detective like in this one?
|
|
|
07-07-2014, 09:54 AM
|
#11
|
Registered user
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
|
Quote:
Originally Posted by hcap
Years ago, using a computer running the CP/M operating system and a primitive version of Lotus 1-2-3, I set up a program based on Scott's first book,
|
CP/M rocked
Way better than MS-DOS, which eventually became the market standard
__________________
whereof one cannot speak thereof one must be silent
Ludwig Wittgenstein
|
|
|
07-07-2014, 12:22 PM
|
#12
|
Registered User
Join Date: Nov 2002
Posts: 30,398
|
Quote:
Originally Posted by DeltaLover
CP/M rocked
Way better than MS-DOS, which eventually became the market standard
|
My introduction to computers.
The only thing I remember, other than using William Scott, is that sometimes when I shifted the paper in my trusty dot matrix printer, the image on my computer screen shifted too. Spooky.
__________________
The inmates have taken over the asylum.
|
|
|
07-07-2014, 03:00 PM
|
#13
|
Registered User
Join Date: Nov 2003
Posts: 1,230
|
This isn't a test of a method, but Ed Bain would play trainers if they had a 30% win rate with at least 4 wins in the categories he found important.
If a trainer was 4 for 7 with first start after a claim, it would qualify as a bet even though there were only 7 instances, because in his experience this was enough of a history for a positive expectation.
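The rule as described can be sketched in a few lines. This is only an illustration of the two thresholds mentioned above (30% and 4 wins), not Bain's actual procedure; the function names and the idea of checking the record against a baseline win rate are assumptions added here.

```python
from math import comb


def qualifies(wins, starts, min_wins=4, min_win_rate=0.30):
    """True if the trainer record meets both thresholds described above."""
    if starts <= 0:
        return False
    return wins >= min_wins and wins / starts >= min_win_rate


def prob_at_least(wins, starts, true_rate):
    """Binomial chance of at least `wins` wins in `starts` tries at `true_rate`."""
    return sum(
        comb(starts, k) * true_rate**k * (1 - true_rate) ** (starts - k)
        for k in range(wins, starts + 1)
    )


print(qualifies(4, 7))             # the 4-for-7 record from the post
print(prob_at_least(4, 7, 0.15))   # 0.15 is a hypothetical baseline win rate
```

The second function gives a feel for how often a record like 4 for 7 would turn up by luck alone if the trainer's real win rate in that category were only the baseline.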
So as not to hijack this important thread, I will start a new thread on my 2013 data for Bain-like trainer and jockey results.
Thanks, coljesep, for giving me a reason to stop procrastinating.
**************************
When Mark Cramer would test an angle, he would eliminate the top payoff so as not to skew the ROI with one extremely large win.
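A small sketch of that check, assuming flat $2 win bets and that "eliminate" means dropping the biggest-paying bet from the sample entirely (stake and return). The function names and the sample payoffs are hypothetical.

```python
def roi(payoffs, bet_size=2.0):
    """Return on investment for flat bets; payoffs are amounts returned (0 for losers)."""
    if not payoffs:
        raise ValueError("need at least one bet")
    total_bet = bet_size * len(payoffs)
    return (sum(payoffs) - total_bet) / total_bet


def roi_without_top_payoff(payoffs, bet_size=2.0):
    """ROI after removing the single largest payoff from the sample."""
    if len(payoffs) < 2:
        raise ValueError("need at least two bets to drop one")
    trimmed = sorted(payoffs)[:-1]  # drop the largest payoff
    return roi(trimmed, bet_size)


# Hypothetical sample: ten $2 bets, one modest winner and one longshot.
sample = [0.0] * 8 + [6.40, 85.20]
print(roi(sample), roi_without_top_payoff(sample))
```

If the angle only shows a profit with the top payoff included, as in this made-up sample, the sample says more about one lucky longshot than about the angle.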
|
|
|
07-07-2014, 06:16 PM
|
#14
|
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,908
|
Quote:
Interesting, I thought it was supposed to be at least 30, e.g., the Dow Jones Industrials. Doesn't the statistical t-test also suggest at least 30?
|
At least 30 winners in each category.
Thus, if you were looking at odds, and had broken the horses into (say) 5 classes and the upper class was 30/1 and above, you would want 30 winners in that group, at 30/1 or higher.
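As a sketch of that check: bucket the results by odds and count the winners in each class. Only the "30/1 and above" top class comes from the post; the other class boundaries, the function names, and the win rate in the usage line are made up for illustration.

```python
import math
from bisect import bisect_right

# Lower bounds (odds-to-1) of five odds classes; all but the last are hypothetical.
CLASS_BOUNDS = [0.0, 2.0, 5.0, 10.0, 30.0]


def odds_class(odds):
    """Index of the odds class a horse falls into."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return bisect_right(CLASS_BOUNDS, odds) - 1


def winners_per_class(results):
    """results: iterable of (odds_to_1, won) pairs; returns winner counts per class."""
    counts = [0] * len(CLASS_BOUNDS)
    for odds, won in results:
        if won:
            counts[odds_class(odds)] += 1
    return counts


def classes_short_of(results, min_winners=30):
    """Classes that do not yet have the minimum number of winners."""
    return [i for i, n in enumerate(winners_per_class(results)) if n < min_winners]


def starters_needed(win_rate, min_winners=30):
    """Rough number of starters needed to expect `min_winners` winners."""
    return math.ceil(min_winners / win_rate)


print(starters_needed(0.25))  # 0.25 is a hypothetical win rate for a low-odds class
```

Because longshots win rarely, the top class is the one that drives the total sample size: the lower its win rate, the more starters it takes to collect 30 winners.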
|
|
|
07-07-2014, 06:45 PM
|
#15
|
Veteran
Join Date: Aug 2005
Posts: 3,428
|
I viewed the question a little differently. I thought he was looking for an adequate sample size. If you had a database population of, say, 10k, you could take a random sample of around 30 records and use the t-test formula (there are also the F-test, the z-test, and probably others) to determine its confidence level, and then compare the results to another random sample of approximately the same size. If the confidence levels were similar, you would have some confidence that your samples were representative of the population as a whole.
Btw, I recall that there are formulas to calculate what an appropriate sample size should be for a given population size. But my statistics knowledge is limited.
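One such formula, offered as a sketch and not necessarily the one remembered here, is the standard sample size for estimating a proportion (such as a win rate) within a chosen margin of error, with a finite population correction. The function name and the default values (95% confidence, 5% margin, worst-case proportion of 0.5) are assumptions.

```python
import math


def required_sample_size(population=None, margin=0.05, z=1.96, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin.

    n0 = z^2 * p * (1 - p) / margin^2        (very large population)
    n  = n0 / (1 + (n0 - 1) / population)    (finite population correction)
    """
    if not 0 < margin < 1 or not 0 < p < 1:
        raise ValueError("margin and p must be between 0 and 1")
    n0 = z * z * p * (1 - p) / (margin * margin)
    if population is None:
        return math.ceil(n0)
    if population <= 0:
        raise ValueError("population must be positive")
    return math.ceil(n0 / (1 + (n0 - 1) / population))


print(required_sample_size(population=10_000))
```

Note that this only says how many records are needed to pin down a percentage; it does not say whether an angle built on that percentage is profitable.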
|
|
|