Small Sample Size?


coljesep
07-06-2014, 12:53 PM
I am wondering as you tried to develop your own "system" so to speak, or things you look for inside each race... what do you consider a good enough sample size of races? 100? More?

Hoofless_Wonder
07-06-2014, 02:58 PM
More is better, but sometimes the numbers aren't there, but are still worth looking at - especially trainer angles.

There are plenty of posts on this here; search on "sample".

Rather than worry about sample size, perhaps a look at "expected results" is in order:

http://www.hoof.demon.co.uk/archie.html

Actor
07-06-2014, 04:10 PM
I am wondering as you tried to develop your own "system" so to speak, or things you look for inside each race... what do you consider a good enough sample size of races? 100? More?

My personal opinion is 1000 races or more. When I took statistics in college I was taught that the magic number was 20, but I would not bet money on a sample that small. William L. Scott used a sample of 500 to 600 races in developing his system described in How Will Your Horse Run Today?

whodoyoulike
07-06-2014, 04:59 PM
My personal opinion is 1000 races or more. When I took statistics in college I was taught that the magic number was 20, but I would not bet money on a sample that small. William L. Scott used a sample of 500 to 600 races in developing his system described in How Will Your Horse Run Today?

Interesting, I thought it was supposed to be at least 30, e.g., the Dow Jones Industrials. Doesn't the statistical T-test also suggest at least 30?


I think any sample only needs to be representative of the population. Be careful of your sample not being representative.

Tom
07-06-2014, 05:25 PM
I started on a 3 race sample at Los Al T Breds.
I'm 5 for 9 overall.
If I waited for a 1,000 sample, it would be next year.
When this stops, I will find another one to play.
Short term is where you make money.

therussmeister
07-06-2014, 06:16 PM
I started on a 3 race sample at Los Al T Breds.
I'm 5 for 9 overall.
If I waited for a 1,000 sample, it would be next year.
When this stops, I will find another one to play.
Short term is where you make money.
Not so much short term, but being one of the first ones to use a methodology/angle. If I use a 1,000-race sample to verify profitability before betting, that makes me 1,000 races too late.

thaskalos
07-06-2014, 10:09 PM
My personal opinion is 1000 races or more. When I took statistics in college I was taught that the magic number was 20, but I would not bet money on a sample that small. William L. Scott used a sample of 500 to 600 races in developing his system described in How Will Your Horse Run Today?
And it still backfired. :)

hcap
07-07-2014, 08:24 AM
And it still backfired. :)

Years ago, using a computer running the CP/M operating system and a primitive version of Lotus 1-2-3, I set up a program based on Scott's first book, How Will Your Horse Run Today? Every Saturday I brought my (dot matrix) printouts with me to OTB. I was soon quite annoyed at losing every Saturday after all that work, but worse, I came to know two OTB regulars: a pair of wonderful elderly ladies who would cash tickets quite often using NYC Daily News public handicapper Russ Harris (chalk heavy) :lol:

I would not trust Scott, and I have since got to the point of being able to test systems using automatic modeling techniques, including length of time periods and sample size.

Depending on what factors I or the program choose to model, and what track I was playing and what time of year, the ideal sample size or time period of a model was all over the place.

But I have come to the conclusion that very old data often gets stale, and that very short models (a few days) are too small.

raybo
07-07-2014, 09:10 AM
I database by individual track, and keep the most recent 24-30 cards in the database, ideally 240-260 races. I use the database for eliminations only. In shorter meets I will go back to the previous year using cards from the same time of meet and time of year, looking for similar environmental conditions.

Tom
07-07-2014, 09:27 AM
At my age, 9 races IS the long run!

DeltaLover
07-07-2014, 09:54 AM
Years ago, using a computer running the CP/M operating system and a primitive version of Lotus 1-2-3, I set up a program based on Scott's first book,


CP/M rocked :ThmbUp:

Way better than MS-DOS, which eventually became the market standard

hcap
07-07-2014, 12:22 PM
CP/M rocked :ThmbUp:

Way better than MS-DOS, which eventually became the market standard

My introduction to computers.

Only thing I remember other than :ThmbDown: using William Scott, was sometimes when I shifted the paper in my trusty dot matrix printer, often the image to print on my computer screen shifted too---spooky :lol:
:confused:

JohnGalt1
07-07-2014, 03:00 PM
This isn't a test of a method, but Ed Bain would play trainers if they had a 30% win with at least 4 wins for the categories he found important.

If a trainer was 4 for 7 with first after claim it would qualify as a bet even though there were only 7 instances, because in his experience this was enough of a history for a positive expectation.
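The qualifying rule described above can be sketched in a few lines. This is only an illustration: it assumes the two thresholds (30% win rate and at least 4 wins) are applied together, and the function name and parameter names are made up here, not taken from Bain.

```python
def bain_qualifies(wins: int, starts: int,
                   min_win_rate: float = 0.30, min_wins: int = 4) -> bool:
    """Return True when a trainer category meets both thresholds."""
    if starts <= 0:
        return False
    return wins >= min_wins and wins / starts >= min_win_rate


# The 4-for-7 first-after-claim case from the post qualifies;
# a 3-for-5 record fails on win count, a 4-for-20 record fails on win rate.
for wins, starts in [(4, 7), (3, 5), (4, 20)]:
    print(wins, starts, bain_qualifies(wins, starts))
```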

So as not to hijack this important thread, I will start a new thread on my data from 2013 Bain like trainer and jockey results.

Thanks coljesep for a reason for me to stop procrastinating.



**************************

When Mark Cramer would test an angle, he would eliminate the top payoff so as not to skew the ROI with one extremely huge win.
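A rough sketch of that check, under one reading of "eliminate": the race with the biggest payoff is dropped from the sample entirely (another reading would count it as a loser). The $2 stake, the function names, and the sample payoffs are all made up for illustration.

```python
def roi(payoffs, stake=2.0):
    """Net profit per dollar wagered. payoffs has one entry per bet (0 for a loser)."""
    if not payoffs:
        raise ValueError("need at least one bet")
    total_bet = stake * len(payoffs)
    return (sum(payoffs) - total_bet) / total_bet


def roi_without_top_payoff(payoffs, stake=2.0):
    """Same ROI with the single largest payoff removed from the sample."""
    if len(payoffs) < 2:
        raise ValueError("need at least two bets")
    trimmed = list(payoffs)
    trimmed.remove(max(trimmed))  # drops one occurrence of the top payoff
    return roi(trimmed, stake)


# Hypothetical ten-bet sample with one bomb in it.
sample = [0, 0, 6.40, 0, 0, 0, 84.20, 0, 5.20, 0]
print(round(roi(sample), 3))                     # looks great with the bomb
print(round(roi_without_top_payoff(sample), 3))  # negative without it
```

If the angle only shows a profit with the top payoff included, the sample is probably telling you about one race, not about the angle.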

Dave Schwartz
07-07-2014, 06:16 PM
Interesting, I thought it was supposed to be at least 30, e.g., the Dow Jones Industrials. Doesn't the statistical T-test also suggest at least 30?

At least 30 winners in each category.

Thus, if you were looking at odds, and had broken the horses into (say) 5 classes and the upper class was 30/1 and above, you would want 30 winners in that group, at 30/1 or higher.

whodoyoulike
07-07-2014, 06:45 PM
I viewed the question a little differently. I thought he was looking for an adequate sample size. If you had a database population of say 10k, you could take a random sample of around 30 records and use the T-test formula (there is an F-test, z-test and probably others) to determine its confidence level and then compare the results to another similar random sample of approx. the same size. If the confidence level was similar to the previous, you would have a confidence level that your sample(s) were representative of the population as a whole.

Btw, I recall that there are formulas to calculate what an appropriate sample size should be for a given population size. But, my statistics knowledge is limited.
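One such formula is the standard sample-size calculation for estimating a proportion, with a finite population correction. The sketch below assumes that is the kind of formula meant; the function name and the default of p = 0.5 (the most conservative guess) are choices made here for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_population(population, margin=0.05, confidence=0.95, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin,
    corrected for a finite population."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    n0 = z * z * p * (1 - p) / margin ** 2          # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)            # finite population correction
    return math.ceil(n)


# For a 10k-race database, +/-5% at 95% confidence.
print(sample_size_for_population(10_000))
```

Note this answers "how many records to estimate a percentage", which is a different question from "how many races before I trust an ROI".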

Actor
07-08-2014, 01:39 AM
And it still backfired. :) :lol: :lol: :lol:

Did it?

What is your source for that? It might explain why I couldn't make it work. :bang:

Actor
07-08-2014, 01:43 AM
CP/M rocked :ThmbUp:

Way better than MS-DOS, which eventually became the market standard
CP/M begat UNIX.

UNIX begat MS-DOS

MS-DOS begat Windows

Frustration begat Linux

Ray2000
07-08-2014, 08:21 AM
The number of races (wagers) needed before you can have any confidence that they mean anything depends on whether you're looking at hit rate or ROI. For Hit Rates, the statistical calculations are easy but must be stated in the proper way:

How many wagers do I need in my test to be (?)% sure that my HitRate is true within a certain Margin of Error (X)?

The formula is:
Number of Races needed = HitRate times (1 minus HitRate) times "Z", all divided by ("X"^2)
where
"Z" = the squared z-score for the confidence level you want:
....= 2.71 to be 90% sure
....= 3.84 to be 95% sure
....= 6.63 to be 99% sure

"X" = the margin of error you'll accept, say ±5%

Example: for an observed HitRate of 35%, you want to be 95% confident that the true HitRate is within ±5% of it

You'll need ...

(0.35 x (1 - 0.35) x 3.84) / (0.05 x 0.05) = 350 races

reducing the margin of error to a tighter ±3%----> needs 970 races
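The calculation above is easy to script. This sketch uses the same rounded "Z" constants given in the post; the function and dictionary names are just illustrative.

```python
import math

# "Z" constants from the post (squared z-scores), keyed by confidence level.
Z_SQUARED = {0.90: 2.71, 0.95: 3.84, 0.99: 6.63}


def races_needed(hit_rate, margin, confidence=0.95):
    """Races needed so the observed hit rate is within +/- margin
    at the given confidence level."""
    z2 = Z_SQUARED[confidence]
    return math.ceil(hit_rate * (1 - hit_rate) * z2 / margin ** 2)


print(races_needed(0.35, 0.05))  # 350
print(races_needed(0.35, 0.03))  # 971, i.e. the "970" above rounded up
```

Because the margin is squared in the denominator, halving the margin of error roughly quadruples the races needed.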



For the number of wagers needed when testing ROI the calculations get more involved. The outcome is not binomial (hit/miss, yes/no) but includes hit rate and average return when successful. Also the distribution of returns is not normal (bell shaped curve) and outliers mess things up, so the usual statistical formulas are "iffy". For ROI the best way to evaluate your system is to calculate the P value by using Z scores or a T test, as mentioned by whodoyoulike. This will give you an indication of how likely it is that the results can be attributed to just LUCK.
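A minimal sketch of that kind of test, using only the Python standard library. It treats each bet's net return (per unit staked) as one observation and uses a normal approximation for the P value, which is exactly the "iffy" part with skewed payoffs, so take the result as a rough guide. The function name, the break-even null value of 0, and the sample numbers are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def roi_p_value(net_returns, null_mean=0.0):
    """One-sided test that the average net return per bet exceeds null_mean.
    Returns (z, p). Uses a normal approximation, not an exact t distribution."""
    n = len(net_returns)
    standard_error = stdev(net_returns) / math.sqrt(n)
    z = (mean(net_returns) - null_mean) / standard_error
    p = 1 - NormalDist().cdf(z)
    return z, p


# Hypothetical 100 bets: 30 winners netting +2.5 units, 70 losers at -1 unit.
returns = [2.5] * 30 + [-1.0] * 70
z, p = roi_p_value(returns)
print(round(mean(returns), 3), round(z, 2), round(p, 2))
```

In this made-up sample the ROI is +5%, yet the P value is far above 0.05, so a result like that over 100 races could easily be luck.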

coljesep
07-08-2014, 08:48 AM
What would be a satisfactory ROI over, say, 100 races? Anything in the plus, but would a small negative be considered realistically satisfactory?

Capper Al
07-08-2014, 08:53 AM
At least 30 winners in each category.

Thus, if you were looking at odds, and had broken the horses into (say) 5 classes and the upper class was 30/1 and above, you would want 30 winners in that group, at 30/1 or higher.

Dave,

This isn't how I remember you on this topic. Wasn't a 1000 races too small before?

Capper Al
07-08-2014, 08:56 AM
I just bought 12000 race results from BRIS. There are areas that I still need more races to cover. For instance, not many handicap races to choose from.

hcap
07-08-2014, 09:01 AM
A word of caution. Often an increase in ROI or win % over many races can be artificially obtained by "back fitting". Confidence in a method, and in the odds of a method or system going forward, should also be verified by taking what appears to be a dynamite system and testing it on a set of races not used to devise that system in the first place.
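One simple way to set that up is to hold back part of the database before developing anything. The sketch below splits by date so the held-out races are the most recent ones; that choice, the 30% holdout, and the record layout (a dict with "date" and "won") are assumptions for illustration only.

```python
def split_chronologically(races, holdout_fraction=0.3):
    """Return (development, holdout) with the most recent races held out."""
    ordered = sorted(races, key=lambda r: r["date"])
    n_holdout = max(1, round(len(ordered) * holdout_fraction))
    cut = len(ordered) - n_holdout
    return ordered[:cut], ordered[cut:]


def hit_rate(races):
    """Fraction of qualifying bets that won."""
    return sum(1 for r in races if r["won"]) / len(races)


# Hypothetical qualifying bets; ISO date strings sort correctly as text.
bets = [{"date": f"2014-06-{day:02d}", "won": day % 3 == 0} for day in range(1, 11)]
development, holdout = split_chronologically(bets)

# Devise and tweak rules on `development` only, then check once on `holdout`.
print(len(development), round(hit_rate(development), 2))
print(len(holdout), round(hit_rate(holdout), 2))
```

The key discipline is that the holdout set is looked at once, after the rules are frozen; re-tweaking after peeking turns it into more back fitting.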

traynor
07-08-2014, 04:16 PM
I would suggest paying a LOT of attention to Dave's suggestion. I would also suggest adding a ratio component. The higher the ratio of winners to sample size, the more likely it is that the scenario could be predictive.

30 winners in a 300 race sample is one thing. 30 winners in a 3000 race sample is quite another. Some claim to make exacta wagers on the basis of 1% probability. That may make sense, if one has a database of 100,000 or so races to generate that figure. Wagering on something with a frequency of a few percentage points in anything less than several thousand races--AND that has not been thoroughly tested to assure normal distribution on other sets of races--is probably not a good idea.

Small(ish) samples can be useful, if the strike rate/sample size is decent. 10% is low end, more is better. Again, the initial sample should be replicated on other groups of races before giving it serious (as in betting real money) consideration.

traynor
07-08-2014, 04:28 PM
A word of caution. Often an increase in ROI or win % over many races can be artificially obtained by "back fitting". Confidence in a method, and in the odds of a method or system going forward, should also be verified by taking what appears to be a dynamite system and testing it on a set of races not used to devise that system in the first place.

Absolutely. If a scenario only works on a specific chunk of races and has to be tweaked frequently "because racing changes," one should be extremely skeptical. "Normal distribution" implies that if something happens 10 times in 100 races, testing it on another (randomly selected) sample of 100 races should produce (approximately) the same number of hits. If it doesn't it is likely only applicable to the specific sample of races "studied."
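How close is "approximately"? A hedged sketch: a two-proportion z-score comparing the hit rate in the original sample with the hit rate in a fresh one. This is a standard textbook test swapped in here for illustration, not something spelled out in the posts, and the function name is made up.

```python
import math


def replication_z(hits_a, n_a, hits_b, n_b):
    """Z-score for the difference between two hit rates (pooled standard error).
    Values beyond roughly +/-2 suggest the samples do not agree."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        return 0.0
    return (hits_a / n_a - hits_b / n_b) / standard_error


# 10 hits in 100 races originally, then 6 hits in a fresh 100: within chance.
print(round(replication_z(10, 100, 6, 100), 2))
# 10 hits in 100 originally, then 1 hit in a fresh 100: the samples disagree.
print(round(replication_z(10, 100, 1, 100), 2))
```

With a 10% hit rate the count in 100 races swings by about 3 either way by chance alone, so 6 or 13 hits in the second sample is not a failure to replicate.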

hcap
07-08-2014, 05:10 PM
Absolutely. If a scenario only works on a specific chunk of races and has to be tweaked frequently "because racing changes," one should be extremely skeptical. "Normal distribution" implies that if something happens 10 times in 100 races, testing it on another (randomly selected) sample of 100 races should produce (approximately) the same number of hits. If it doesn't it is likely only applicable to the specific sample of races "studied."

Otherwise known as "back fitting."

Many "systems" sold, if not just plain hype, suffer from the system creator/seller starting with a few across the board rules and as the testing continues the system seller/creator will see that perhaps "just" a slight modification to the basic rules would have gotten a winner missed by the original qualifications. (Usually a $30 payoff :lol: )

Also a minor twist would be.......

"If I only eliminated that horse I would have avoided another loser" and a new elimination rule is born

A few dozen rounds of this maybe well-intentioned fudge, and voilà, an artificially naive manipulation of the overall bottom line gets done, and often may encourage the creator to daydream about retiring early :lol: :lol:

Unfortunately the paper and pencil sellers are not the only ones. Fast computers sometimes provide sophisticated variations on "if I only zigged instead of zagged" at speeds 1000's of times faster than the manual back fitting of our pencil and paper fudging system developer.

RPM, a long-time system mill, seems to follow this practice :cool:

thaskalos
07-08-2014, 06:58 PM
Many "systems" sold, if not just plain hype, suffer from the system creator/seller starting with a few across the board rules and as the testing continues the system seller/creator will see that perhaps "just" a slight modification to the basic rules would have gotten a winner missed by the original qualifications. (Usually a $30 payoff :lol: )

Also a minor twist would be.......

"If I only eliminated that horse I would have avoided another loser" and a new elimination rule is born

A few dozen rounds of this maybe well-intentioned fudge, and voilà, an artificially naive manipulation of the overall bottom line gets done, and often may encourage the creator to daydream about retiring early :lol: :lol:

Unfortunately the paper and pencil sellers are not the only ones. Fast computers sometimes provide sophisticated variations on "if I only zigged instead of zagged" at speeds 1000's of times faster than the manual back fitting of our pencil and paper fudging system developer.

RPM, a long-time system mill, seems to follow this practice :cool:

Not to mention the PHILLIPS RACING NEWSLETTER. :ThmbDown:

This rag has supposedly researched and tested HUNDREDS of handicapping systems...and they've identified over a hundred that they've deemed to be "highly profitable".

And about 98% of the players STILL lose in this game.

traynor
07-08-2014, 08:40 PM
Not to mention the PHILLIPS RACING NEWSLETTER. :ThmbDown:

This rag has supposedly researched and tested HUNDREDS of handicapping systems...and they've identified over a hundred that they've deemed to be "highly profitable".

And about 98% of the players STILL lose in this game.

I agree completely about PRN "systems." However, they are no worse than many CALDs (computer-assisted losing devices). Many seem to believe that numbers generated by a computer are somehow of a higher order of reality than everyday numbers. That is, the ordinary constraints that would apply to numbers generated on any other topic are believed to be held in abeyance when crunching horse race numbers.

As in, "Never mind all that statistics nonsense, this is how it works in the real world." Amazing.

thaskalos
07-08-2014, 08:48 PM
I agree completely about PRN "systems." However, they are no worse than many CALDs (computer-assisted losing devices). Many seem to believe that numbers generated by a computer are somehow of a higher order of reality than everyday numbers. That is, the ordinary constraints that would apply to numbers generated on any other topic are believed to be held in abeyance when crunching horse race numbers.

As in, "Never mind all that statistics nonsense, this is how it works in the real world." Amazing.
The only difference that I can see between the computer-assisted systems and the pencil-and-paper ones is that the computer-assisted systems are more expensive. I guess the vendors of these systems think that the higher price-tag gives them added credibility. :rolleyes:

Tom
07-08-2014, 09:18 PM
This rag has supposedly researched and tested HUNDREDS of handicapping systems...and they've identified over a hundred that they've deemed to be "highly profitable".

Profitable to the guy using it or the guy selling it? :D

BUT, I do have to give them credit for one thing - I first read in PN about this guy who was winning lots of races and had some computer program he was selling. It was Doc Sartin.

thaskalos
07-08-2014, 09:26 PM
Profitable to the guy using it or the guy selling it? :D

BUT, I do have to give them credit for one thing - I first read in PN about this guy who was winning lots of races and had some computer program he was selling. It was Doc Sartin.

Well then...I guess I owe the rag an apology. 1 out of 500 ain't bad. :)

clemkadiddle
07-08-2014, 09:32 PM
I can't comprehend using a prescribed sample size. I know a lot of die-hards...in their efforts to find ways to win...resort to accumulating a body of data and mining it for stats.

Personally, I started this attempt several years ago but found it to be a waste of time.

What I did do...was start developing a "tool" to measure the amount of work being performed by the horse during the running of the race. After 8 years of throwing "mathematical darts" at the situation, I seemed to have developed a plausible model that equates efforts occurring in 5 furlong races all the way to 12 furlong races.

I used Excel (with Excel's VBA language to write macros) to first calculate the track variant from both DRF (Beyer) and BRIS speed figures. I deal strictly in feet-per-second and find flaws with "adding a fifth of second for each length" and the theory of "parallel time" on which speed figures are based. (They are helpful though, with regard to track variant.)

Still using Excel, I kept throwing ideas at the situation. What I can tell you is that:

1. Parallel time charts are derived from the base-10 logarithms of the average feet per second, where the values of the logarithms differ by .0062 per furlong. Separate charts must be kept based on the number of turns navigated.

2. Horses expend energy exponentially; the faster they run, their energy becomes consumed at an increasingly greater rate. This was the reason behind parallel time and speed ratings, but the fact is that the final time and speed rating is the end result of the pace; that is why it is so important to derive one's own pace calculation tool.
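Point 1 can be sketched as follows, with loud caveats: the post only says the log10 values differ by .0062 per furlong, so the direction (average speed dropping as distance grows) is an assumption here, and the function name and the 1:10 reference time are made up. A furlong is 660 feet. It also only applies between distances with the same number of turns, per the post.

```python
import math

FEET_PER_FURLONG = 660.0


def parallel_time(ref_furlongs, ref_seconds, target_furlongs, log_step=0.0062):
    """Equivalent final time at target_furlongs, given a reference time,
    assuming log10(average fps) falls by log_step per added furlong."""
    ref_fps = ref_furlongs * FEET_PER_FURLONG / ref_seconds
    log_fps = math.log10(ref_fps) - log_step * (target_furlongs - ref_furlongs)
    target_fps = 10 ** log_fps
    return target_furlongs * FEET_PER_FURLONG / target_fps


# Hypothetical reference: 6 furlongs in 70.0 seconds (1:10).
for furlongs in (5, 6, 7, 8):
    print(furlongs, round(parallel_time(6, 70.0, furlongs), 2))
```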

I apologize for not divulging the entire knowledge base that I acquired (because that was 8 years of research and I doubt if anyone in my position would offer that up for free), but what I did provide should give you all a head start.

My background: Senior Programmer/Analyst for a global manufacturing company that provides components to support Oil and Gas, Transportation, and Power Generation. Thoroughbred handicapping has always been a hobby, but I wasn't interested in developing a "tool" until I actually questioned myself regarding my background, interest, having Excel available as the platform, and an opportunity to follow through...happy I did...

It just goes to show that good things do not come easy. I know you stat folks and data miners feel the same way.

Good luck to everyone.

Capper Al
07-09-2014, 10:17 AM
Speed is easy in a spreadsheet or in a program. Try making class figs like BRIS' RR and CR.

whodoyoulike
07-09-2014, 06:14 PM
I can't comprehend using a prescribed sample size. I know a lot of die-hards...in their efforts to find ways to win...resort to accumulating a body of data and mining it for stats.

Personally, I started this attempt several years ago but found it to be a waste of time.

I guess we disagree right off.

What I did do...was start developing a "tool" to measure the amount of work being performed by the horse during the running of the race. After 8 years of throwing "mathematical darts" at the situation, I seemed to have developed a plausible model that equates efforts occurring in 5 furlong races all the way to 12 furlong races.

Is this similar to a pace calculation?

... I deal strictly in feet-per-second and find flaws with "adding a fifth of second for each length" and the theory of "parallel time" on which speed figures are based. (They are helpful though, with regard to track variant.)

I agree there is a flaw using "adding a fifth of second for each length" but I calculated this using a sample of Trakus info when they started publishing it at Dmr and I believe GP several years ago. I found it wasn't that far off. But, I use my findings.

... 2. Horses expend energy exponentially; the faster they run, their energy becomes consumed at an increasingly greater rate. This was the reason behind parallel time and speed ratings, but the fact is that the final time and speed rating is the end result of the pace; that is why it is so important to derive one's own pace calculation tool.

This is kind of my idea of pace handicapping.

I apologize for not divulging the entire knowledge base that I acquired (because that was 8 years of research and I doubt if anyone in my position would offer that up for free), but what I did provide should give you all a head start.

I understand your reasons to not provide details, others may not but, I wouldn't worry about it. I'm hoping you can contribute your insights and experiences. I think we're all attempting to better understand this thing we call horse handicapping.

Btw, I'm not a spokesperson for this site. I just enjoy discussing handicapping ideas.

And, good luck to you.

clemkadiddle
07-09-2014, 09:16 PM
Thanks for your interest.

I do have several books and read them over and over again. One thing that I never wanted to do was read anything by Sartin or Brohamer. My reasons were simple:

1. I did not want to inherit someone else's mindset in seeding my thought processes. I wanted to let racing portray itself for what it was mathematically...without being influenced by someone else's opinion so that I could analyze it from a pure beginning.

2. If Sartin or Brohamer were really winning at the races with their methods, they would not be writing and publishing books.

Point #2 goes for most authors. However, my mentor...who I considered a genuine professional handicapper...used to say buy whatever book you can find because if there is one thing in that book that can save you the price of a bet it is worth the price of the book.

Some of the best books are the oldest. I don't have anything that is newer than 30 years old.

Here are some of the best works:

Ray Taulbot's "Thoroughbred Horseracing 'Playing for Profit'"
Bob Heyburn's "Fast and Fit Horses"
William Scott's "Investing at the Racetrack" and "How Will Your Horse Run Today?"
Andy Beyer's "Picking Winners" and "The Winning Horseplayer"

Get to know parallel time charts. Basically, find a way to reverse-engineer the final time where the 100 point score would have been awarded and look up the final time for that distance on a parallel time chart. Figure out the average feet-per-second for both. Divide the 100 point FPS by the parallel time chart's FPS and that gives you the track variant as a ratio. You can then use that inside the race on the fractional times (and corresponding FPS at each call) to derive some meaningful pace figures in terms of FPS. Don't resort to figuring averages as you analyze the race; try to pull each segment out separately so that you can truly see the horse speeding up and slowing down.

Don't worry about what was happening on the lead; concentrate on what the specific horse's performance is telling you.

The real magic occurs over time, as you analyze this series of numbers. You will see what I am talking about eventually. Again, it is very important to isolate each segment of the race.
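A bare-bones sketch of the arithmetic described above: average feet-per-second, the variant as a ratio of two fps values, and fps for each separate segment of the race. The function names and the sample times are invented for illustration, and how the ratio is then applied to the segment figures is not spelled out in the post, so that step is left out.

```python
FEET_PER_FURLONG = 660.0


def avg_fps(furlongs, seconds):
    """Average feet per second over a distance."""
    return furlongs * FEET_PER_FURLONG / seconds


def track_variant(furlongs, hundred_point_seconds, chart_seconds):
    """Variant as a ratio: fps of the 100-point time over fps of the chart time."""
    return avg_fps(furlongs, hundred_point_seconds) / avg_fps(furlongs, chart_seconds)


def segment_fps(call_furlongs, call_seconds):
    """Fps for each segment between calls, from cumulative distances and times.
    Each segment is isolated rather than averaged from the start."""
    speeds = []
    prev_distance = prev_time = 0.0
    for distance, time in zip(call_furlongs, call_seconds):
        speeds.append((distance - prev_distance) * FEET_PER_FURLONG / (time - prev_time))
        prev_distance, prev_time = distance, time
    return speeds


# Hypothetical 6-furlong race: calls at 2f, 4f and the finish.
print(round(track_variant(6, 70.0, 70.4), 4))
print([round(s, 2) for s in segment_fps([2, 4, 6], [22.4, 45.6, 70.8])])
```

In the made-up example the three segment speeds fall from about 59 to about 52 fps, which is the speeding up and slowing down that an overall average would hide.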

clemkadiddle
07-09-2014, 09:19 PM
BRIS figs dissolve in averages. One needs to isolate each specific segment of the race to see what is truly going on.

I am aware of these figures, but they are worthless. The only real figure is the final "speed" figure which allows me to calculate the track variant.

Both BRIS and DRF would love to have the model I developed...and it ain't in any software other than my little Excel spreadsheet.

raybo
07-09-2014, 09:39 PM
BRIS figs dissolve in averages. One needs to isolate each specific segment of the race to see what is truly going on.

I am aware of these figures, but they are worthless. The only real figure is the final "speed" figure which allows me to calculate the track variant.

Both BRIS and DRF would love to have the model I developed...and it ain't in any software other than my little Excel spreadsheet.

I do the same thing in Excel, segmentally, but don't have to resort to reverse engineering any figures. And, my segmental fps calcs are available, for free, in the AllData Project workbook. This is really no big secret. By the way, I haven't used 1/5 second, or any other static multiplier, for a beaten length in years, it's simply incorrect.

traynor
07-09-2014, 10:27 PM
Thanks for your interest.

I do have several books and read them over and over again. One thing that I never wanted to do was read anything by Sartin or Brohamer. My reasons were simple:

1. I did not want to inherit someone else's mindset in seeding my thought processes. I wanted to let racing portray itself for what it was mathematically...without being influenced by someone else's opinion so that I could analyze it from a pure beginning.

2. If Sartin or Brohamer were really winning at the races with their methods, they would not be writing and publishing books.

Point #2 goes for most authors. However, my mentor...who I considered a genuine professional handicapper...used to say buy whatever book you can find because if there is one thing in that book that can save you the price of a bet it is worth the price of the book.

Some of the best books are the oldest. I don't have anything that is newer than 30 years old.

Here are some of the best works:

Ray Taulbot's "Thoroughbred Horseracing 'Playing for Profit'"
Bob Heyburn's "Fast and Fit Horses"
William Scott's "Investing at the Racetrack" and "How Will Your Horse Run Today?"
Andy Beyer's "Picking Winners" and "The Winning Horseplayer"

Get to know parallel time charts. Basically, find a way to reverse-engineer the final time where the 100 point score would have been awarded and look up the final time for that distance on a parallel time chart. Figure out the average feet-per-second for both. Divide the 100 point FPS by the parallel time chart's FPS and that gives you the track variant as a ratio. You can then use that inside the race on the fractional times (and corresponding FPS at each call) to derive some meaningful pace figures in terms of FPS. Don't resort to figuring averages as you analyze the race; try to pull each segment out separately so that you can truly see the horse speeding up and slowing down.

Don't worry about what was happening on the lead; concentrate on what the specific horse's performance is telling you.

The real magic occurs over time, as you analyze this series of numbers. You will see what I am talking about eventually. Again, it is very important to isolate each segment of the race.

Do the same with the fps of the horse divided by the fps of the leader at each call. What the Sartin advocates called Pace of Horse/Pace of Race (POH/POR). It generates a percentile relationship of each horse's speed at each segment of the race, compared to the leader. It is not rocket science, and no dogma is attached. It is just another set of numbers that one can use to interpret what actually happened in the race.

For one thing, it will point out the situations in which "early pace" and "late pace" are misinterpreted when only comparing fps values between entries, ignoring what they were doing in relation to the leader while earning those values. Basic pace analysis.
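The POH/POR idea reduces to one division per segment. The sketch below assumes you already have per-segment fps for the horse and for the leader (how the horse's own fps is derived from beaten lengths is a separate question); the function name and the numbers are illustrative only.

```python
def poh_por(horse_fps, leader_fps):
    """Pace of Horse / Pace of Race for each segment, as a percentage.
    100 means the horse matched the leader's speed in that segment."""
    return [100.0 * horse / leader for horse, leader in zip(horse_fps, leader_fps)]


# Hypothetical three-segment race: the horse loses ground early,
# holds position in the middle, and gains on the leader late.
horse = [57.0, 56.0, 53.0]
leader = [58.0, 56.0, 52.0]
print([round(pct, 1) for pct in poh_por(horse, leader)])
```

A late figure above 100 combined with a modest raw fps is the kind of case where comparing raw "late pace" numbers between entries would mislead.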

What did/do you really think of Heyburn?

I think Jay Hovdey was the first I heard of using the percentile for variants, way back. Works well. Especially when weighted by race segment.

Dave Schwartz
07-10-2014, 12:05 AM
Dave,

This isn't how I remember you on this topic. Wasn't a 1000 races too small before?


I am not saying that 1,000 races is "enough." He asked about Z-tables starting at a sample of 30. I was simply saying that 30 winners in a category are required to get a z-score on that category.

Think of it like this: I have 30 bets at 30/1. If I have 2 winners out of 30 should I be excited? Probably not. However, if I had 30 winners out of 450 bets that is time for some degree of excitement.

30 starts in that category is pretty worthless.
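The two cases above can be put on the same scale with a z-score against what chance alone would produce. As a simplifying assumption (not something stated in the post), the sketch treats a 30/1 shot as having a 1-in-31 chance and ignores takeout; the function name is made up.

```python
import math


def win_z_score(winners, bets, expected_win_prob):
    """How many standard deviations the winner count sits above expectation,
    using the normal approximation to the binomial."""
    expected = bets * expected_win_prob
    sd = math.sqrt(bets * expected_win_prob * (1 - expected_win_prob))
    return (winners - expected) / sd


fair_prob = 1 / 31  # assumed fair chance for a 30/1 shot, takeout ignored

print(round(win_z_score(2, 30, fair_prob), 2))    # 2 winners in 30 bets
print(round(win_z_score(30, 450, fair_prob), 2))  # 30 winners in 450 bets
```

Two winners in 30 bets sits about one standard deviation above expectation, which is nothing special, while 30 winners in 450 bets sits around four. The normal approximation is also shaky when the expected winner count is near 1, which is another reason to want 30 winners rather than 30 starts.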

Billnewman
07-10-2014, 03:16 AM
BRIS figs dissolve in averages. One needs to isolate each specific segment of the race to see what is truly going on.

I am aware of these figures, but they are worthless. The only real figure is the final "speed" figure which allows me to calculate the track variant.

Both BRIS and DRF would love to have the model I developed...and it ain't in any software other than my little Excel spreadsheet.

I don't understand why you would say bris figures are worthless if you're using their variant? Making that daily variant is the single most difficult part of figure handicapping. Just because you're using a different set of formulas to spit that variant through you are basically using the same numbers as everyone else that uses bris. Unless you change the distance of a beaten length for deceleration as the race progresses. That is what I'd guess you are hinting at.

traynor
07-10-2014, 10:10 AM
BRIS figs dissolve in averages. One needs to isolate each specific segment of the race to see what is truly going on.

I am aware of these figures, but they are worthless. The only real figure is the final "speed" figure which allows me to calculate the track variant.

Both BRIS and DRF would love to have the model I developed...and it ain't in any software other than my little Excel spreadsheet.

I don't understand why you would say bris figures are worthless if you're using their variant? Making that daily variant is the single most difficult part of figure handicapping. Just because you're using a different set of formulas to spit that variant through you are basically using the same numbers as everyone else that uses bris. Unless you change the distance of a beaten length for deceleration as the race progresses. That is what I'd guess you are hinting at.

Only until one has refined his or her methodology, and eliminated the indecision that causes one to believe that "artistic interpretations" of the data are required. They are not.

DeltaLover
07-10-2014, 11:16 AM
Making that daily variant is the single most difficult part of figure handicapping.

I disagree.

The reason lies in the evaluation of a specific TV estimation algorithm..

Think about...

How are you going to decide that one algorithm is better than the other? Do you have any ideas?

Billnewman
07-10-2014, 12:59 PM
I guess I don't understand what you are doing. When I make a variant I don't use an algorithm. Mine is made by what I believe to be reality. It might change midway through a card if the jockeys say they are adding too much water, or if the wind dies out. I thought you use algorithms for trainer moves, like a drop of 2 class levels or a distance change.

clemkadiddle
07-10-2014, 08:42 PM
I don't use THEIR variant...I reverse-engineer based on their final speed figure and comparing it to standard parallel time charts.

clemkadiddle
07-10-2014, 08:53 PM
I trust my own work. When I have the same horse running different distances...or even the same distance for that matter in the current form cycle and the speed ratings vary...my calculations ironically put these efforts within a point.

Not that I am really calculating pace, BECAUSE I DON'T. I use pace to calculate THE AMOUNT OF WORK BEING PERFORMED IN THE COURSE OF THE RACE.

That is the only true measurement...because distance becomes taken out of the equation and the confusion that varying distance poses to handicapping also disappears.

For instance, how would anyone have liked to have had Danza in the Arkansas Derby? I didn't have him, I admit...but I used that race in my research and now I have a tool that puts him right in the midst. In fact, his prior race as a 2YO at 6.5 furlongs killed anything in that field. You can bet the next time an opportunity like this presents itself I will get my hooks into it.

I also used Chitu's 6 furlong effort prior to the Robert B. Lewis as part of my research, in order to validate my calculations on Danza.

At this point I would be tempted to divulge the foundation behind this algorithm. For the moment, think about the point where a horse starts to expend racing energy rather than gallop. That's the point where one needs to start their research.

Well...gotta go for now. I have several races to review for Saturday and the data entry task awaits.

whodoyoulike
07-10-2014, 09:09 PM
I trust my own work....
At this point I would be tempted to divulge the foundation behind this algorithm. For the moment, think about the point where a horse starts to expend racing energy rather than gallop. That's the point where one needs to start their research.

Well...gotta go for now. I have several races to review for Saturday and the data entry task awaits.

FYI, you'll notice that you start off with a 400 vCash amount. I understand that each time you divulge or provide helpful info, you're awarded an additional 25 vCash. If it happens to be incorrect, there's an 11 vCash reduction. Disregard my totals because I was testing the program.

coljesep
07-10-2014, 10:33 PM
Really appreciate the dialogue here!

Tom
07-10-2014, 10:35 PM
FYI, you'll notice that you start off with a 400 vCash amount. I understand that each time you divulge or provide helpful info, you're awarded an additional 25 vCash.

Yes, that is true. :blush:

raybo
07-10-2014, 11:10 PM
Dang, I guess I've been wasting my time here!

whodoyoulike
07-10-2014, 11:10 PM
Don't think I haven't noticed every time you post.

raybo
07-10-2014, 11:17 PM
I trust my own work. When I have the same horse running different distances...or even the same distance for that matter in the current form cycle and the speed ratings vary...my calculations ironically put these efforts within a point.

Not that I am really calculating pace, BECAUSE I DON'T. I use pace to calculate THE AMOUNT OF WORK BEING PERFORMED IN THE COURSE OF THE RACE.

That is the only true measurement...because distance becomes taken out of the equation and the confusion that varying distance poses to handicapping also disappears.

For instance, how would anyone have liked to have had Danza in the Arkansas Derby? I didn't have him, I admit...but I used that race in my research and now I have a tool that puts him right in the midst. In fact, his prior race as a 2YO at 6.5 furlongs killed anything in that field. You can bet the next time an opportunity like this presents itself I will get my hooks into it.

I also used Chitu's 6 furlong effort prior to the Robert B. Lewis as part of my research, in order to validate my calculations on Danza.

At this point I would be tempted to divulge the foundation behind this algorithm. For the moment, think about the point where a horse starts to expend racing energy rather than gallop. That's the point where one needs to start their research.

Well...gotta go for now. I have several races to review for Saturday and the data entry task awaits.

Hmmm - I thought horses start expending energy/doing work as soon as the gate opens, and that sometimes they start expending energy before the gate opens.

We wouldn't want you to divulge your hard work, so do yourself a favor and keep it a secret.

thaskalos
07-11-2014, 01:52 PM
I trust my own work. When I have the same horse running different distances...or even the same distance for that matter in the current form cycle and the speed ratings vary...my calculations ironically put these efforts within a point.

Not that I am really calculating pace, BECAUSE I DON'T. I use pace to calculate THE AMOUNT OF WORK BEING PERFORMED IN THE COURSE OF THE RACE.

That is the only true measurement...because distance becomes taken out of the equation and the confusion that varying distance poses to handicapping also disappears.

For instance, how would anyone have liked to have had Danza in the Arkansas Derby? I didn't have him, I admit...but I used that race in my research and now I have a tool that puts him right in the midst. In fact, his prior race as a 2YO at 6.5 furlongs killed anything in that field. You can bet the next time an opportunity like this presents itself I will get my hooks into it.

I also used Chitu's 6 furlong effort prior to the Robert B. Lewis as part of my research, in order to validate my calculations on Danza.

At this point I would be tempted to divulge the foundation behind this algorithm. For the moment, think about the point where a horse starts to expend racing energy rather than gallop. That's the point where one needs to start their research.

Well...gotta go for now. I have several races to review for Saturday and the data entry task awaits.

So...it is your contention that horses NEVER run "bad races"?

whodoyoulike
07-11-2014, 02:59 PM
I am not saying that 1,000 races is "enough." He asked about Z-tables starting at a sample of 30. I was simply saying that 30 winners in a category are required to get a z-score on that category.

Think of it like this: I have 30 bets at 30/1. If I have 2 winners out of 30 should I be excited? Probably not. However, if I had 30 winners out of 450 bets that is time for some degree of excitement.

30 starts in that category is pretty worthless.
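
One way to put numbers on the 2-for-30 versus 30-for-450 comparison is a one-sample z-score against the break-even hit rate. The sketch below is only an illustration: it assumes 30/1 means a break-even win probability of 1/31 (no takeout or breakage considered), and the function names `z_score` and `binom_tail` are made up for this example, not anything from the thread.

```python
import math


def z_score(wins, bets, p0):
    """Normal-approximation z-score of the observed hit rate against p0."""
    p_hat = wins / bets
    se = math.sqrt(p0 * (1 - p0) / bets)
    return (p_hat - p0) / se


def binom_tail(wins, bets, p0):
    """Exact probability of at least `wins` winners in `bets` tries at rate p0."""
    below = sum(
        math.comb(bets, k) * p0**k * (1 - p0) ** (bets - k)
        for k in range(wins)
    )
    return 1.0 - below


# Assumption: at 30/1 a bettor breaks even by winning 1 bet in 31.
BREAK_EVEN = 1 / 31

for wins, bets in [(2, 30), (30, 450)]:
    z = z_score(wins, bets, BREAK_EVEN)
    tail = binom_tail(wins, bets, BREAK_EVEN)
    print(f"{wins}/{bets}: hit rate {wins / bets:.3f}, z = {z:.2f}, "
          f"chance of doing this well by luck = {tail:.4f}")
```

Both samples have the same 6.7% hit rate, but the z-score comes out near 1.1 for the 30 bets and above 4 for the 450 bets, which is the point of the quoted post. With so few expected winners in the 30-bet case the normal approximation is rough, so the exact binomial tail is the more trustworthy of the two numbers there.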


I think you've changed the scope of the OP's question. I was responding to Actor's comment of at least 20 for an adequate sample size in post #3. But, continuing with your line of thinking, say you were looking at (30-1) odds winners for a:

6f race on the dirt for males 3up and the reason the (30-1) won was

A. Most of the field ran the 1/2 in 43 - 44 while the (30-1) ran it in 45.5 - 46.5

or

B. The (30-1) won because they were able to improve their form and run the 1/2 in 44.8 - 45.5

Which would you rather do, analyze approx. 30 representative races or 1000? How many of the 1000 races would you review before you were comfortable to make a decision? Would it be 750, 850 or 1000?

I realize Tom would have made up his mind after 3 but, I suspect it's because he's a degenerate horse player (seems like a lot of us are on here).

Tom
07-11-2014, 03:27 PM
For instance, how would anyone have liked to have had Danza in the Arkansas Derby?

I bet him.

castaway01
07-11-2014, 03:49 PM
Not so much short term, but being one of the first ones to use a methodology/angle. If I use a 1,000-race sample to verify profitability before betting, that makes me 1,000 races too late.

I'm with you and Tom on this. I like to play trainer angles, and you've got to move early and before everyone else is onto it. Once the angle becomes obvious then even if the trainer is still winning, there's usually not much money to be made. I respect the guys who are grinding out profits with huge databases to determine what works and doesn't, but I believe in striking while the angle is hot.

raybo
07-11-2014, 05:29 PM
I'm with you and Tom on this. I like to play trainer angles, and you've got to move early and before everyone else is onto it. Once the angle becomes obvious then even if the trainer is still winning, there's usually not much money to be made. I respect the guys who are grinding out profits with huge databases to determine what works and doesn't, but I believe in striking while the angle is hot.

IMO, large database players and small database players are operating in totally different worlds. The large database players are probably more likely to stay with a certain method longer than the small database player, who is constantly updating his smaller database with recent races and deleting older, less valid races, from the database.

TrifectaMike
07-11-2014, 08:00 PM
I am wondering as you tried to develop your own "system" so to speak, or things you look for inside each race... what do you consider a good enough sample size of races? 100? More?

Whatever sample size makes you happy plus 1

Mike

TrifectaMike
07-11-2014, 08:05 PM
I disagree.

The reason lies in the evaluation of a specific TV estimation algorithm.

Think about it...

How are you going to decide that one algorithm is better than the other? Do you have any ideas?

Great question DL... followed by the beauty of silence

Mike

clemkadiddle
07-11-2014, 09:06 PM
Here's a sampling. The first line "pace figures" are simply the FPS adjusted for variant and turned into a usable number. To get the FPS, add 80 and divide by 3. The "work" figures are where the analysis is made of how much energy is spent in the race. The running total is a collection of the "work" during the race; the grand total is the final figure. There's an additional algorithm that's run on this series. All I can say is that I only bet dirt races at 8 furlongs and over; the horse with the better late numbers gets the distance.

Gulfstream 2nd, 7-12-14

01. Chilean Boy
furlongs 2.0 2.0 2.0 2.0
pace figures 80 82 83 74
work/furlong 6.3 6.8 7.2 4.7
work subtotal 12.6 13.7 14.4 9.5
work running total 26.3 40.7 50.2

furlongs 2.0 2.0 2.0 2.3
pace figures 84 82 78 74
work/furlong 7.4 6.8 5.7 4.8
work subtotal 14.8 13.6 11.4 9.6
work running total 28.4 39.8 49.4
================================================
02. Wrol Up
furlongs 2.0 2.0 2.0 1.0
pace figures 94 90 77 65
work/furlong 10.9 9.6 5.5 2.5
work subtotal 21.7 19.1 11.0 2.5
work running total 40.9 51.8 54.3

furlongs 2.0 2.0 2.0 2.3
pace figures 83 82 77 71
work/furlong 7.2 7.0 5.6 3.8
work subtotal 14.4 14.0 11.2 7.7
work running total 28.4 39.5 47.2
================================================
03. Whiskey Tap
furlongs 2.0 2.0 2.0 2.0
pace figures 82 92 83 72
work/furlong 7.0 10.0 7.2 4.1
work subtotal 14.0 20.0 14.5 8.1
work running total 33.9 48.4 56.6

furlongs 2.0 2.0 2.0 2.5
pace figures 81 88 83 69
work/furlong 6.5 8.8 7.3 3.5
work subtotal 13.0 17.6 14.6 7.0
work running total 30.6 45.2 52.2
================================================
04. Red Hills
furlongs 2.0 2.0 2.0 2.5
pace figures 82 86 83 70
work/furlong 6.9 8.0 7.1 3.7
work subtotal 13.9 16.0 14.3 7.4
work running total 29.8 44.1 51.5

furlongs 2.0 2.0 2.0 2.0
pace figures 81 91 84 71
work/furlong 6.7 9.8 7.4 4.0
work subtotal 13.4 19.6 14.8 7.9
work running total 33.0 47.8 55.7
================================================
05. Feels Like Flying
furlongs 2.0 2.0 2.0 0.5
pace figures 94 91 75 63
work/furlong 10.7 9.8 4.8 2.1
work subtotal 21.4 19.6 9.6 1.0
work running total 41.0 50.6 51.7

furlongs 2.0 2.0 2.0 1.0
pace figures 95 91 78 64
work/furlong 11.2 9.8 5.7 2.4
work subtotal 22.4 19.7 11.4 2.4
work running total 42.1 53.5 55.9
================================================
06. Giacomo the Great
furlongs 2.0 2.0 2.0 1.0
pace figures 95 87 75 66
work/furlong 11.1 8.3 5.1 2.9
work subtotal 22.3 16.6 10.1 2.9
work running total 38.9 49.0 51.9

furlongs 2.0 2.0 2.0 2.0
pace figures 90 84 73 68
work/furlong 9.3 7.4 4.3 3.1
work subtotal 18.7 14.8 8.7 6.2
work running total 33.5 42.2 48.5
================================================
07. Global Question
furlongs 2.0 2.0 2.0 2.0
pace figures 85 87 79 71
work/furlong 7.8 8.5 6.0 4.0
work subtotal 15.7 17.0 12.1 8.0
work running total 32.7 44.8 52.8

furlongs 2.0 2.0 2.0 2.0
pace figures 80 86 80 75
work/furlong 6.4 8.0 6.2 4.8
work subtotal 12.7 15.9 12.4 9.7
work running total 28.7 41.0 50.7
================================================
08. Double Judge
furlongs 2.0 2.0 2.0 0.5
pace figures 99 95 75 60
work/furlong 12.6 11.1 4.8 1.6
work subtotal 25.3 22.3 9.7 0.8
work running total 47.6 57.2 58.0

furlongs 2.0 2.0 2.0 1.0
pace figures 98 92 80 64
work/furlong 12.3 10.2 6.4 2.3
work subtotal 24.6 20.5 12.7 2.3
work running total 45.1 57.8 60.1
================================================
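
The arithmetic behind the listing above can be sketched in a few lines. This is only a reading of the posted numbers, not the poster's actual program: the formula for work per furlong is not disclosed, so the sketch takes those values as given, and the function names are hypothetical. The FPS conversion is the "add 80 and divide by 3" rule from the post.

```python
from itertools import accumulate


def figure_to_fps(pace_figure):
    """Convert a posted pace figure back to feet per second (add 80, divide by 3)."""
    return (pace_figure + 80) / 3


def work_subtotals(furlongs, work_per_furlong):
    """Work for each segment: segment length times work per furlong."""
    return [round(f * w, 1) for f, w in zip(furlongs, work_per_furlong)]


def running_totals(subtotals):
    """Cumulative work through each segment; the last value is the grand total."""
    return [round(t, 1) for t in accumulate(subtotals)]


# Chilean Boy, first line of the listing (values copied from the post).
furlongs = [2.0, 2.0, 2.0, 2.0]
work_per_furlong = [6.3, 6.8, 7.2, 4.7]
posted_subtotals = [12.6, 13.7, 14.4, 9.5]

print(work_subtotals(furlongs, work_per_furlong))
print(running_totals(posted_subtotals))
print(round(figure_to_fps(80), 1))
```

Recomputed subtotals can differ from the posted ones by 0.1 because the posted work-per-furlong values are already rounded. The running totals of the posted subtotals end at 50.2, matching the listing, which shows the running total only from the second segment on.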

Fingal
07-13-2014, 01:38 PM
IMO, large database players and small database players are operating in totally different worlds. The large database players are probably more likely to stay with a certain method longer than the small database player, who is constantly updating his smaller database with recent races and deleting older, less valid races, from the database.

Personally I look at sample size as it relates to confidence. When I started back in the 80's I would have a year's worth of DRFs, then it progressed to only a meet or a month; currently on the desk next to me it's only a week. Now it's more of a confirmation of my methods.
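
For anyone who wants to see how sample size ties to confidence in plain numbers, here is a rough sketch of the 95% margin of error on a hit rate at a few sample sizes. It uses the textbook normal approximation; the 30% hit rate and the function name `margin_of_error` are made-up values for illustration only.

```python
import math


def margin_of_error(hit_rate, n, z=1.96):
    """Approximate 95% margin of error for a hit rate observed over n races."""
    return z * math.sqrt(hit_rate * (1 - hit_rate) / n)


# Hypothetical 30% hit rate checked at several sample sizes.
for n in (30, 100, 500, 1000):
    moe = margin_of_error(0.30, n)
    print(f"n = {n:4d}: 30% plus or minus {moe * 100:.1f} points")
```

The margin shrinks with the square root of the sample size, so going from 30 races to 1000 cuts it from about 16 points to about 3.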

traynor
07-13-2014, 03:16 PM
Personally I look at sample size as it relates to confidence. When I started back in the 80's I would have a year's worth of DRFs, then it progressed to only a meet or a month; currently on the desk next to me it's only a week. Now it's more of a confirmation of my methods.

I think it is really quite simple. If one is seeking Truth-With-a-Big-T, one needs a VERY large database, and will probably spend one's life (as MANY others have and continue to do) chasing rainbows. And very little time wagering (or, if wagering, making little or no profit).

If one is seeking profit, one uses whatever size "database" one has found to be profitable, using whatever methods one has found to be profitable. Take the money, keep the profit, enjoy life. It is that simple.

Personally, I eschew both databases that are too large and databases that are too small. Like Goldilocks' porridge, my databases are just right. However, that is only true for the specific methodology I use. YMMV.