Making your own handicapping system. - Horse Racing Forum - PaceAdvantage.Com

Capper Al · 10-25-2013, 06:48 AM

"The formulation of the problem is often more essential than its solution, which may be merely a matter of mathematical or experimental skill." Albert Einstein

Oldman (my handicapping partner of sorts) and I often discuss the madness to our methods. Oldman has a database and handicaps with that and does well. I haven't built my database yet. I have spent 5 or more years writing a C++ program that feeds large spreadsheets, one for maidens and another for non-maidens. Oldman and I approach the handicapping riddle from opposite ends of the do-it-yourself system-capper spectrum.

There have been many posts recommending the sensibility of record keeping and finding your own niche in the game. And yes, I totally agree. But the DIY handicapper doesn't have the resources of the commercial software vendors that sponsor PA, so individuals on their own have to choose their method of attack on how to tool up to play the game. Our time becomes our limited resource.

Having and using a database does give one the power of percentages and probabilities, answering the most valued racing question: what happens in this situation? But I grew up in the paper-and-pencil period of handicapping, before personal computers. I loved generating figures with a calculator for speed or pace or the dot system and many other systems. In doing this I came to believe that the riddle of handicapping was best solved in the alchemy of the formulas being used. Now I persist in this direction, as I'm about to rewrite my system instead of building my database. Why? Because I only have time to pursue one path. I believe that, from writing my original application, I now understand the game better and can build a better application. After my rewrite, I will build my database.

HUSKER55 · 10-25-2013, 10:26 AM

First off, you can buy data to build a database, so I think creating an application you are content with is the most important thing.

If you always feel "wanting" then something is wrong and you "know it". You might not know what, but you know something is amiss.

That is why I keep records of where my application hits and misses, and why.

One day....I will hit the big one!

good luck with your endeavors

traynor · 10-25-2013, 10:29 AM

Quote:

Originally Posted by Capper Al

Databases can be as misleading as they can be informative. That is, raw data is just that--data rather than information. Your approach may be best. Figure out what information you need first, then build the database to provide you with that information. That will avoid the all-too-common problem of continual "fishing expeditions" that turn up imaginary trends in the raw data that only exist when viewed in small clumps and small samples.
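
To put a number on the small-sample problem, here is a minimal sketch in Python. It is an illustration only, not anything from an actual handicapping app: the function name `wilson_interval` and the sample counts are made up, and the Wilson score interval is just one standard way to show how wide the uncertainty is around an observed win rate.

```python
import math


def wilson_interval(wins: int, starts: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for an observed win rate."""
    if starts <= 0:
        raise ValueError("starts must be positive")
    p = wins / starts
    denom = 1 + z * z / starts
    centre = (p + z * z / (2 * starts)) / denom
    half = z * math.sqrt(p * (1 - p) / starts + z * z / (4 * starts * starts)) / denom
    return centre - half, centre + half


# Hypothetical samples: the same 30% win rate, seen in 20 starts and in 2000 starts.
small = wilson_interval(6, 20)
large = wilson_interval(600, 2000)
print(f"20 starts:   {small[0]:.3f} to {small[1]:.3f}")
print(f"2000 starts: {large[0]:.3f} to {large[1]:.3f}")
```

With these made-up counts, the 20-start sample is consistent with a true win rate anywhere from roughly 15% to 52%, while the 2000-start sample narrows that to about 28% to 32%. A "trend" found in a small clump can easily be noise.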

I use a "database" but it is comprised of handicapped races--not raw data. Meaning I have automated the process of "handicapping" individual races and use the output of that application as input for the data mining apps. It works a LOT better than chasing rainbows in raw data. For one thing, because the handicapping app automatically cleans the data (including the elimination of outliers), the information that results from the data mining process is substantially more reliable, and rarely misleading. That is a lot more than can be said of apps massaging raw, uncleaned data.
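
As a rough illustration of what automatic outlier elimination could look like, here is a minimal Python sketch. The post does not say how the app actually cleans its data, so everything here is an assumption: the function name `drop_outliers`, the median-absolute-deviation rule, and the cutoff of 3 are all chosen just for the example.

```python
from statistics import median


def drop_outliers(figures: list[float], max_dev: float = 3.0) -> list[float]:
    """Keep figures within max_dev scaled MADs of the median (assumed rule)."""
    if len(figures) < 3:
        return list(figures)  # too few lines to judge what is an outlier
    mid = median(figures)
    mad = median(abs(f - mid) for f in figures)
    if mad == 0:
        return list(figures)  # all figures (nearly) identical, nothing to drop
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [f for f in figures if abs(f - mid) / scale <= max_dev]


# Hypothetical speed figures for one horse; the 41 is the kind of line to throw out.
cleaned = drop_outliers([82, 85, 84, 41, 86, 83])
print(cleaned)
```

The median-based rule is used here because a single bad race barely moves the median, whereas it would drag a plain average toward itself.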

DRIVEWAY · 10-25-2013, 10:31 AM

Do you have criteria to determine whether a race is playable? I'm not talking about identifying contenders and searching for odds value. I'm talking about answering, prior to any wagering, the question: "Is this a playable, beatable type of race?"

If you need handicappers to evaluate and comment on your specifications or testers to work your program and provide feedback, let me know. It's fun collaborating on these projects.

Good Luck in your endeavor.

Robert Goren · 10-25-2013, 11:35 AM

I am not a database handicapper, but something I have always been curious about is how you handle track bias in your data. Do you ignore it? Throw out races that have it? Have separate databases for the most common biases?

DeltaLover · 10-25-2013, 11:42 AM

Quote:

Originally Posted by Capper Al

The fact that you are using C++ to generate spreadsheets while your friend
prefers a database does not necessarily mean that you are approaching
handicapping from opposite ends. The difference lies in the methodology, not in
the mechanisms used.

I am not sure that a database gives one the power of percentages and
probabilities as you say here. This seems like a very primitive approach to the
process of computerized handicapping. Before you are in a position to
maintain any valid expectation of positive results, you need to address
challenges like improving data quality, defining a solid theoretical foundation
of meta handicapping, and getting familiar with applicable domains of artificial
intelligence, and then follow up with a long series of trial-and-error
experiments. Let me give you an example: you say that you love generating
figures. Have you ever thought that, instead of creating them based on your
experience, it might be possible to develop an algorithm that allows the
computer to derive a whole universe of indexes and use them for handicapping?
More than this, you also need to understand how to make a judgement call about
the quality and effectiveness of each index and figure, and how to use it more
effectively: generating a probability distribution based on it, a ranking
sequence, or anything else.

HUSKER55 · 10-25-2013, 01:19 PM

delta lover, I read your post and maybe I am in a rut and don't realize it. [not to worry...this isn't the first rodeo]

Can you give an example using speed or pace? My thinking is that it should be easy to understand what you mean from an example.

But alas, maybe I am wrong.

thanks

DeltaLover · 10-25-2013, 03:29 PM

Quote:

Originally Posted by HUSKER55

Hope the following is helpful:

Let's assume that we have a specific methodology to calculate some sort of
speed figure (similar to Beyer or Brawn). For this step we need the algorithm to
derive this figure. Think of Beyer's method as an example. Note that there are
infinitely many ways to model the past performance data into a figure, so the
computer can be used here to create the logic automatically.

Each individual figure is not of much value until we somehow combine the figures
into a rating (similar to BRIS Prime Power) that can describe each horse with
a single number. Again, as before, there are infinitely many ways to create the
rating; among other factors, we can consider recency, earnings, connections,
etc. to estimate the weight of each individual figure.
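
One of those infinitely many ways, sketched in Python purely as an illustration: weight each past figure by how recent it is. The function name `composite_rating`, the exponential decay, and the 45-day half-life are assumptions made for this example, not a description of how Prime Power or any real rating works.

```python
def composite_rating(figures: list[float], days_ago: list[int],
                     half_life: float = 45.0) -> float:
    """Recency-weighted average of a horse's past figures (illustrative only)."""
    if not figures or len(figures) != len(days_ago):
        raise ValueError("need one days_ago value per figure")
    # A figure earned half_life days ago counts half as much as one earned today.
    weights = [0.5 ** (d / half_life) for d in days_ago]
    return sum(w * f for w, f in zip(weights, figures)) / sum(weights)


# Hypothetical horse: a 90 earned today and an 80 earned 45 days ago.
rating = composite_rating([90, 80], [0, 45])
print(round(rating, 2))
```

Earnings, connections and the rest could enter as extra multipliers on the weights; the half-life itself is exactly the kind of parameter a computer search can tune instead of a human guess.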

After we construct this rating for each horse, the next step is to convert it to
some scale that will be easy to backtest. One obvious way is, for example, to
create a probability distribution based on the ratings. This process can again
be assisted by the computer, for example by applying a genetic program.
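
A minimal Python sketch of the rating-to-probability step. It uses a plain softmax rather than a genetic program, simply because it is the shortest thing that works; the function name `ratings_to_probs` and the `scale` parameter are assumptions for illustration, and `scale` is the sort of knob an automated search could tune.

```python
import math


def ratings_to_probs(ratings: list[float], scale: float = 10.0) -> list[float]:
    """Turn one race's ratings into win probabilities that sum to 1 (softmax)."""
    top = max(ratings)
    # Subtracting the top rating keeps exp() numerically safe without changing the result.
    weights = [math.exp((r - top) / scale) for r in ratings]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical four-horse field.
probs = ratings_to_probs([95.0, 90.0, 85.0, 70.0])
print([round(p, 3) for p in probs])
```

A smaller `scale` pushes more probability onto the top-rated horse; a larger one flattens the field.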

The last step is to take these probabilities and backtest them against
historical data to see how well they fit. For example, we might try to see how
well these probabilities predict the winner. The way to do so is another model,
which can be as simple as comparing them against the final odds. This
comparison will later allow us to compare two sets of figures and decide which
one is better than the other, or in other words, which has better fitness.
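
Here is one way such a fitness comparison might be sketched in Python. The scoring rule (average log loss on the actual winner), the helper names `implied_probs` and `mean_log_loss`, and the two tiny races are all assumptions made up for the example; a real backtest would need thousands of races.

```python
import math


def implied_probs(odds_to_one: list[float]) -> list[float]:
    """Convert final odds (x-to-1) to probabilities, normalised to remove the takeout."""
    raw = [1.0 / (o + 1.0) for o in odds_to_one]
    total = sum(raw)
    return [r / total for r in raw]


def mean_log_loss(races: list[tuple[list[float], int]]) -> float:
    """Average of -log(probability given to the actual winner); lower is better."""
    return -sum(math.log(probs[winner]) for probs, winner in races) / len(races)


# Made-up data: (probabilities for the field, index of the winner) for two races.
model_races = [([0.5, 0.3, 0.2], 0), ([0.2, 0.5, 0.3], 1)]
odds_races = [(implied_probs([1.0, 2.0, 3.0]), 0), (implied_probs([2.0, 1.0, 3.0]), 1)]

model_loss = mean_log_loss(model_races)
odds_loss = mean_log_loss(odds_races)
print(f"model: {model_loss:.4f}  public odds: {odds_loss:.4f}")
```

The same `mean_log_loss` can score two competing sets of figures against each other; whichever gives the lower loss over the same races has the better fitness in this sense.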

Note that this approach is not limited to the winning horse and speed figures
only. We can create, for example, another set of figures trying to predict who
will be on the lead at the first call of the race, or even which horse
will finish last.

Capper Al · 10-25-2013, 04:21 PM

Quote:

Originally Posted by traynor

Thanks, fellow developer. I appreciate this, especially coming from you. This is my belief exactly. Take this as a simple item to deal with: I have an algorithm that handles what to do if there isn't a good BRIS speed figure available for a PP line. Handling the data-cleaning situations alone is necessary work, on top of all the complications of handicapping theory.
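
For readers wondering what a missing-figure routine might look like, here is a deliberately simple Python sketch. It is not the algorithm described above, which the post does not spell out. The function name `fill_missing_figures` and the rule (substitute the average of the horse's usable figures) are assumptions for illustration only.

```python
from typing import Optional


def fill_missing_figures(figures: list[Optional[int]]) -> list[Optional[float]]:
    """Replace missing or zero figures with the mean of the usable ones (assumed rule)."""
    valid = [f for f in figures if f is not None and f > 0]
    if not valid:
        return [None] * len(figures)  # nothing to fall back on
    fallback = sum(valid) / len(valid)
    return [float(f) if f is not None and f > 0 else fallback for f in figures]


# Hypothetical PP lines: one with no figure at all, one with a zero placeholder.
filled = fill_missing_figures([84, None, 0, 78])
print(filled)
```

A real routine would likely look at class, distance and surface before borrowing a number, but even this much keeps a blank line from silently turning into a zero in later calculations.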

Capper Al · 10-25-2013, 04:24 PM

Quote:

Originally Posted by DeltaLover

You're right, we could and do have some similar approaches, but I don't look at raw data as much as he does for calculations.

Capper Al · 10-25-2013, 04:32 PM

Quote:

Originally Posted by HUSKER55

My confidence is in my approach. I have designed and developed several major applications for the Department of Defense. What I mostly know is that I have been profitable each year for the past 5 years, so something is working. Now, to be honest, had I spent my time flipping hamburgers at McDonald's instead of developing a system and playing horses, I would have earned 20-plus times more money.

Capper Al · 10-25-2013, 04:38 PM

Quote:

Originally Posted by Robert Goren

Track bias for today's race is almost impossible to deal with in computation; I do have a method for it but rarely use it. The hope is that over the long run it will wash out for databases. I do calculate bias from the PP race lines in my routine calculations. That's different.

HUSKER55 · 10-25-2013, 05:06 PM

NOW you sound like me. There are days that I swear the biggest problem is making the proper wager at the proper time.

Delta lover, thanks for your example.

Exotic1 · 10-25-2013, 05:22 PM

Quote:

Originally Posted by DeltaLover

Helpful to me. Thanks.

keenang · 10-25-2013, 07:59 PM

I remember years ago at a Dr. Sartin seminar he said no two races are ever, ever, ever run exactly the same. So I asked myself why keep a database, and to this day I think he was right.

Gene K.