Backfitting Revisited


bucktron
06-20-2008, 08:23 PM
Much has been said about backfitting over the years on this board. I have a question for the software users and developers who incorporate this method into their handicapping decisions. Has anyone ever developed a model with a win percentage greater than 67% when betting two horses per race? Minimum training set requirements:

1. 15000 plus races
2. Random distances, classes and surfaces
3. 20 or more tracks

What were the results of this model when analyzing a test set of the same size and composition? From everything I have read, the size and randomness of the training sample will not make any difference, and the model will be unprofitable when presented with a similar-size test set. I would just like to hear the results and insight from those who have tried it.
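For scale, here is a rough sketch of what a 67% hit rate with two horses per race has to pay to break even. It assumes equal flat stakes on both horses and that only one of them can win; the function names are made up for illustration and are not from any handicapping package.

```python
def break_even_payout(hit_rate, horses_per_race=2, stake=1.0):
    """Average gross return needed on the winning ticket to break even.

    Assumes a flat stake on each horse and at most one winner per race.
    """
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be in (0, 1]")
    cost_per_race = horses_per_race * stake
    return cost_per_race / hit_rate


def roi(hit_rate, avg_payout, horses_per_race=2, stake=1.0):
    """Return on investment per unit wagered under the same assumptions."""
    cost_per_race = horses_per_race * stake
    return hit_rate * avg_payout / cost_per_race - 1.0


payout = break_even_payout(0.67)   # gross return per 1.00 staked on the winner
odds_to_one = payout - 1.0         # the same figure expressed as odds-to-1
print(round(payout, 3), round(odds_to_one, 3))
```

Under those assumptions the winners have to average just under 2-1, so the win percentage by itself says nothing about profit without the average mutuel that goes with it.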

Hammerhead
06-20-2008, 09:46 PM
As far as I know, backfitting will fail. It may win for a time or two, but over the long run it is doomed to fail, as races, horses, temperaments, and track conditions will never be identical. I wish it were so simple.

BCOURTNEY
06-21-2008, 01:56 AM
Can we define backfitting? Not sure I understand what the OP means.

Overlay
06-21-2008, 02:14 AM
Backfitting refers to developing a handicapping method based on past race results that would have produced a given winning percentage or ROI if followed in those races. The assumption is that the future will duplicate the patterns that produced the past results. An example would be the type of elimination checklists that appear around the time of the Kentucky Derby each year, where the developer of the list tells you that if you had applied the checklist in the order given to each year's Derby field until only one horse was left, you would have had the Derby winner for the last (fill in the blank) years.

ranchwest
06-21-2008, 03:36 AM
I don't think it is reasonable to anticipate specific results ranges based on backfitting without it being a very sophisticated methodology. Whatever your past results, the future results will differ.

Each race is its own puzzle.

andicap
06-21-2008, 09:37 AM
If you've read "Bet with the Best 2" you'll see the issues with backfitting.

In Watchmaker's chapter he outlines a spot play system using stakes races that produced a 5% profit from 2002-2006. But in one year the play lost 12%. It was profitable in the other three years. But what if you had backfitted a system the year BEFORE it lost? You would have thrown in the towel BEFORE the system returned to profitability. After one year of profitability and one year of loss you would have been confused: does this work or not? So maybe in the profitable third year you're not using it. And on and on.

The best solution to successfully using backfitting to develop spot plays?
The ones who say they are doing well over at HTR say they have developed a mutual fund of spot plays, so that if one is not working another one would be.
Of course the key is using a method after testing it on a large enough sample size, etc. But the warning is, even after you've done that, it could still lose for a decent period of time before turning around. Or the initial results could have been what pollsters consider an "outlier" and never work consistently again.

sjk
06-21-2008, 10:16 AM
I believe that if you bet every horse in the past 10 years whose name begins with "A" that started the 8th race from the 8th post you would have made a profit.

The ones that started the 8th race from the 10 post did even better.

I would look for solid reasons why a backfitted method should have worked before I would put much trust in it.

Or just be on the lookout for those "A" horses in the 8th. (There are many such combinations)
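A rough way to see how easily such combinations turn up is to simulate bets that have no edge at all and then go looking for the "best" rule. The sketch below is an illustration only: the 260 rules (26 first letters times 10 posts), the 1-in-8 win chance, and the 17% takeout are all made-up assumptions, not real race data.

```python
import random

N_RULES = 260  # hypothetical: 26 first letters x 10 post positions


def simulate_bets(n_bets, seed, win_prob=0.125, takeout=0.17):
    """Return (rule_id, net_return) pairs for $1 bets with no real edge.

    Every bet has the same expected return of -takeout, whatever its rule.
    """
    rng = random.Random(seed)
    gross = (1.0 - takeout) / win_prob  # payout per $1 when the horse wins
    bets = []
    for _ in range(n_bets):
        rule_id = rng.randrange(N_RULES)   # which arbitrary rule the bet falls under
        won = rng.random() < win_prob
        bets.append((rule_id, (gross if won else 0.0) - 1.0))
    return bets


def roi_by_rule(bets):
    """Average net return per $1 bet for each rule (0.0 if a rule has no bets)."""
    totals = [0.0] * N_RULES
    counts = [0] * N_RULES
    for rule_id, net in bets:
        totals[rule_id] += net
        counts[rule_id] += 1
    return [totals[i] / counts[i] if counts[i] else 0.0 for i in range(N_RULES)]


train_roi = roi_by_rule(simulate_bets(30000, seed=1))  # the "past" used to pick a rule
test_roi = roi_by_rule(simulate_bets(30000, seed=2))   # fresh races the rule never saw

best = max(range(N_RULES), key=lambda i: train_roi[i])
print("best rule in the past sample:", best)
print("its ROI in the past sample:  ", round(train_roi[best], 3))
print("same rule on fresh races:    ", round(test_roi[best], 3))
```

Since every rule here is worthless by construction, whatever the best rule shows in the first sample is pure luck, and the second sample is the only honest read on it.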

rokitman
06-21-2008, 10:18 AM
Price decay is a continuous problem in some subsets, but forward reliability will be relative to the expertise of the researcher. It's not set in stone that plays born of DB mining will go to pooh. But it is certain if you are not extremely good at it.

Very bad news for the next new guy entering the thoroughbred DB mining biz: you are not very good at it. And you probably never will be. It's unlikely that you're smart enough. And even less likely that you will put in the necessary time.

misscashalot
06-21-2008, 11:07 AM
1. 15000 plus races
2. Random distances, classes and surfaces
3. 20 or more tracks


Your premise is wrong
If you mix all primary colors you get gray
The procedure here is to separate factors
because subsets have different results
therefore should be isolated.

example
2 turns races have fewer winning favorites than 1 turn
2 yr olds have higher winning favs than other age groups
Turf races have fewer winning favs than dirt
and there are more differences other than winning favs

and different tracks play differently due to quality of stock

singunner
06-21-2008, 12:58 PM
The idea of "back-fitting" is something every statistician in their right mind will tell you to avoid. We take specific precautions to AVOID this sort of thing. It's sometimes referred to as "shaping", because you make your results fit your past data. Past data can only be used to attempt to identify trends. These trends must then be tested FORWARD against future races with absolute vigor to avoid polluting your test data with your research data.
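A minimal sketch of keeping research data and test data apart is a split by date: find the rule on the earlier races only, then score it once on the later ones. The records, field names, and the post-position rule below are invented purely to show the mechanics.

```python
from datetime import date


def chronological_split(races, cutoff):
    """Races before the cutoff are research data; the rest are held back for testing."""
    train = [r for r in races if r["date"] < cutoff]
    test = [r for r in races if r["date"] >= cutoff]
    return train, test


def rule_roi(races, rule):
    """Average net return per $1 bet on the races the rule selects (None if no bets)."""
    bets = [r for r in races if rule(r)]
    if not bets:
        return None
    return sum(r["net"] for r in bets) / len(bets)


# Made-up records: "net" is the net return on a $1 win bet.
races = [
    {"date": date(2007, 3, 1), "post": 8, "net": 4.2},
    {"date": date(2007, 9, 15), "post": 3, "net": -1.0},
    {"date": date(2008, 1, 10), "post": 8, "net": -1.0},
    {"date": date(2008, 5, 2), "post": 8, "net": -1.0},
]

train, test = chronological_split(races, cutoff=date(2008, 1, 1))


def bet_post_8(race):
    """A hypothetical rule 'discovered' by looking at the research data only."""
    return race["post"] == 8


train_result = rule_roi(train, bet_post_8)
test_result = rule_roi(test, bet_post_8)
print(train_result, test_result)
```

The point of the cutoff is that nothing after it is looked at while the rule is being built, so the later figure is not polluted by the search.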

On the other hand, if you really want to know more about this, you need to ask drug companies, politicians and the media. They're the masters of deriving their desired results regardless of past data.

cj
06-21-2008, 01:41 PM
You can make a lot of money using databases. The key is to know how to write complex queries and use factors not available to the general public.

Tom
06-21-2008, 03:42 PM
I like sssssssshort term trends that a db can uncover.
Andy, at SA last winter, SP at SA was hitting over 33% and paying back an ROI of over 4! for a few weeks. I hopped on that train until it ran out of steam - for my short trip, I was rewarded nicely. The HTR robot is the second thing I do in the morning, after I pour a coffee. I look for tracks producing and ignore the rest.