Validating the Predictability of custom metrics


DeltaLover
10-01-2017, 10:42 AM
Validating speed figures, metrics and handicapping factors is by far the most important aspect of any handicapping approach or methodology.

In this notebook (https://github.com/deltalover/hoplato/blob/master/hoplato/pp/algorithmic_handicaper/build_model.ipynb) I am presenting one of the methods I use to judge the quality of my custom “numbers” and to decide whether I am getting enough predictability or need to improve them.

The linked documents describe the full approach without any hidden “proprietary” components so the reader can get a fully comprehensive view of what I am doing.

I would like to hear your criticism, comments and suggestions for improvement of the presented approach.

https://github.com/deltalover/hoplato/blob/master/hoplato/pp/algorithmic_handicaper/build_model.ipynb

Stevecsd
10-01-2017, 08:26 PM
Quite impressive.

I read that you are comparing pairs. How does this work with 9 horses in a race? A might beat B & might beat C, but C wins over A this time out. How does your algorithm account for that?

And I have seen many times where horse A beats horse B in a race, but the next time out horse B wins. Can it account for that?

I think the next step is to apply it to actual races to see if the 67% correct rate can win money.

:)

formula_2002
10-01-2017, 11:02 PM
Purpose of this exercise

At this point I only care about the predictive ability of the metrics and do not try to build a betting strategy which means that the crowd’s opinion as formed by the odds is not considered at all.

Let me know when you get to matching track odds to the "predictive ability".

Good luck in your endeavor.

DeltaLover
10-02-2017, 09:31 AM
Quite impressive.

I read that you are comparing pairs. How does this work with 9 horses in a race? A might beat B & might beat C, but C wins over A this time out. How does your algorithm account for that?

And I have seen many times where horse A beats horse B in a race, but the next time out horse B wins. Can it account for that?

I think the next step is to apply it to actual races to see if the 67% correct rate can win money.

:)

How to generalize the "pair" comparisons is another challenge, and of course a very important one for the creation of an algorithmic handicapper, but it is outside the topic of this exercise. What I am developing here is a back testing platform that will provide convincing evidence that a specific set of metrics has sufficient predictive value. The next logical step is to use this application to compare different sets of metrics, including other types of figures (like bris figures for example), and derive some useful conclusions about their effectiveness.
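
To make the idea of a pair back test concrete, here is a minimal sketch; it is not the code from the notebook. It assumes each race is a list of horses with a numeric figure and a finish position (the field names "figure" and "finish" and the toy data are made up for illustration), and it counts how often the horse with the higher figure finished ahead. A 9 horse race simply contributes 36 such pairs.

```python
from itertools import combinations

def pairwise_accuracy(races, figure_key="figure"):
    """Fraction of horse pairs in which the higher figure finished ahead."""
    correct = total = 0
    for race in races:
        # Every unordered pair of horses in the same race is one test case.
        for a, b in combinations(race, 2):
            if a[figure_key] == b[figure_key]:
                continue  # equal figures make no prediction, so skip the pair
            total += 1
            a_predicted = a[figure_key] > b[figure_key]
            a_ahead = a["finish"] < b["finish"]  # smaller finish position = better
            if a_predicted == a_ahead:
                correct += 1
    return correct / total if total else float("nan")

# Toy data, invented for illustration only.
races = [
    [{"figure": 90, "finish": 1}, {"figure": 85, "finish": 3}, {"figure": 80, "finish": 2}],
    [{"figure": 70, "finish": 2}, {"figure": 75, "finish": 1}],
]
accuracy = pairwise_accuracy(races)          # 3 of the 4 pairs are called correctly
pairs_in_nine_horse_race = len(list(combinations(range(9), 2)))
```

The same function can be pointed at any other numeric figure by changing figure_key, which is what makes it usable for comparing different sets of numbers.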

DeltaLover
10-02-2017, 09:33 AM
Purpose of this exercise

At this point I only care about the predictive ability of the metrics and do not try to build a betting strategy which means that the crowd’s opinion as formed by the odds is not considered at all.

Let me know when you get to matching track odds to the "predictive ability".

Good luck in your endeavor.

Your request is off topic; if we are going to have a constructive conversation, let's focus on what I am trying to accomplish here, which is clearly described in my previous postings.

formula_2002
10-02-2017, 12:51 PM
I look forward to the discourse

DeltaLover
10-02-2017, 05:27 PM
I look forward to the discourse

It seems to me that there will not be much of a "discourse" in this thread; people are more interested in theoretical discussions, expressing opinions, horse racing small talk and off-topic remarks.

formula_2002
10-02-2017, 06:01 PM
It seems to me that there will not be much of a "discourse" in this thread; people are more interested in theoretical discussions, expressing opinions, horse racing small talk and off-topic remarks.

You might get significant input if you would add a few $$$$ to your work. However, for one, I'm willing to follow along for a while.
Then again, it seems what you are addressing might be of more interest to horse OWNERS and AGENTS, not bettors. Those principals are interested in winning the race and collecting a purse, compared to the bettor, who wants to win the race for a tote board payout profit.
Consider offering your work for college study.

DeltaLover
10-02-2017, 06:15 PM
You might get significant input if you would add a few $$$$ to your work. However, for one, I'm willing to follow along for a while.
Then again, it seems what you are addressing might be of more interest to horse OWNERS and AGENTS, not bettors. Those principals are interested in winning the race and collecting a purse, compared to the bettor, who wants to win the race for a tote board payout profit.
Consider offering your work for college study.

I do not disagree that accurate figures are used by owners and agents; I also believe that by definition they are the most useful tool for a bettor.

Obviously I do not imply betting on the most frequent winner, but the development of a "contrarian" methodology must have as its starting point some kind of accurate figures and metrics that have a certain level of correlation to the way the public forms the betting markets, while mining for value belongs to a higher level in the pipeline.

steveb
10-04-2017, 05:43 PM
I am guessing that most people have not much clue about what they are looking at ......including me.

i am confident i have the knowhow better than most, but after a couple of minutes looking at your stuff, i just went back to doing whatever it was i was doing.

not everybody can read python code, especially those with no code writing experience.

if you truly wanted input you would make it easier for people to offer it.
not to mention that one might think you are going down the wrong track anyway.

how does one validate something that by its very nature can never be perfect?
all i do is try to minimise the error of the whole, but with the full knowledge any particular bit may be way out.

so if you think what i am writing here is double dutch, then perhaps that is what yours is to others!

DeltaLover
10-04-2017, 06:19 PM
I am guessing that most people have not much clue about what they are looking at ......including me.

i am confident i have the knowhow better than most, but after a couple of minutes looking at your stuff, i just went back to doing whatever it was i was doing.

not everybody can read python code, especially those with no code writing experience.

if you truly wanted input you would make it easier for people to offer it.
not to mention that one might think you are going down the wrong track anyway.

how does one validate something that by its very nature can never be perfect?
all i do is try to minimise the error of the whole, but with the full knowledge any particular bit may be way out.

so if you think what i am writing here is double dutch, then perhaps that is what yours is to others!

I certainly wanted input, but obviously I did not do a good job on the presentation layer; I thought that it would be easy to follow the documentation that I have put together, but it seems to me that people are not willing to read closely through it.

As far as validation goes, I do not think that lack of perfection invalidates it. What I am showing in this notebook is that the set of figures I am using can indeed be used for prediction and beats a random approach; of course I am not striving for perfection, just as a machine learning system does not when modeling any stochastic event.
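
As a rough illustration of what "beating a random approach" means for pair predictions: a coin flip gets 50% of the pairs right, so the observed hit rate can be compared against 0.5. The sketch below uses a normal approximation to the binomial; the counts (670 correct out of 1000 pairs) are hypothetical and chosen only to match a 67% rate, and the function names are my own for this example.

```python
import math

def z_score_vs_coin_flip(correct, total):
    """Standard errors by which the hit rate exceeds a 50% coin flip."""
    p_hat = correct / total
    standard_error = math.sqrt(0.5 * 0.5 / total)  # under the coin-flip hypothesis
    return (p_hat - 0.5) / standard_error

def one_sided_p_value(z):
    """Probability of a z at least this large if the figures were pure chance."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical counts, not results from the notebook.
z = z_score_vs_coin_flip(correct=670, total=1000)
p = one_sided_p_value(z)
```

One caveat on this sketch: pairs drawn from the same race are not independent of each other, so a test like this overstates the evidence and is better read as a sanity check than as proof.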

formula_2002
10-04-2017, 10:02 PM
I certainly wanted input, but obviously I did not do a good job on the presentation layer; I thought that it would be easy to follow the documentation that I have put together, but it seems to me that people are not willing to read closely through it.

As far as validation goes, I do not think that lack of perfection invalidates it. What I am showing in this notebook is that the set of figures I am using can indeed be used for prediction and beats a random approach; of course I am not striving for perfection, just as a machine learning system does not when modeling any stochastic event.

Let's just look at "calculate_pace_figures.py".
It would be helpful to me if you presented your idea in the form of a "conclusion", without using code.

I assume you want our comments on your ideas, not the code.

DeltaLover
10-05-2017, 10:49 AM
Let's just look at "calculate_pace_figures.py".
It would be helpful to me if you presented your idea in the form of a "conclusion", without using code.

I assume you want our comments on your ideas, not the code.

Understanding the details of the metrics is not necessary for the current exercise; what I am developing here is an application that can receive any figure that can be expressed as a number and decide whether it has some predictive value or not. A completely useless metric should be easy to spot, as removing it from the pattern will not affect the performance.
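
The "remove it and see if anything changes" check can be sketched as follows. This is only an illustration under simplifying assumptions, not the model in the notebook: horses are scored by the plain sum of the selected metrics, performance is the share of pairs called correctly, and the metric names ("speed", "pace", "const") and the single toy race are invented.

```python
from itertools import combinations

def pairwise_accuracy(races, metrics):
    """Share of pairs where the higher summed score finished ahead."""
    correct = total = 0
    for race in races:
        for a, b in combinations(race, 2):
            score_a = sum(a[m] for m in metrics)
            score_b = sum(b[m] for m in metrics)
            if score_a == score_b:
                continue  # no prediction when the scores tie
            total += 1
            a_ahead = a["finish"] < b["finish"]
            if (score_a > score_b) == a_ahead:
                correct += 1
    return correct / total if total else float("nan")

def ablation_report(races, metrics):
    """Accuracy lost when each metric is dropped from the full set."""
    baseline = pairwise_accuracy(races, metrics)
    drops = {}
    for m in metrics:
        remaining = [x for x in metrics if x != m]
        drops[m] = baseline - pairwise_accuracy(races, remaining)
    return baseline, drops

# One toy race, invented for illustration; "const" carries no information.
races = [[
    {"speed": 90, "pace": 50, "const": 1, "finish": 1},
    {"speed": 80, "pace": 55, "const": 1, "finish": 2},
    {"speed": 70, "pace": 40, "const": 1, "finish": 3},
]]
baseline, drops = ablation_report(races, ["speed", "pace", "const"])
```

In this toy race dropping "const" costs nothing while dropping "speed" does. Note that "pace" also shows no loss here, because "speed" alone already orders the field, so a zero drop can mean redundant as well as useless.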

To answer the question though, the pace figures implemented in the file you are mentioning measure how quickly the leader of the race ran to the second call. The scale ranges from 0 to 100 and follows the normal distribution (the figure is derived from a z-score); speed figures (as stated in the documentation) are calculated using a methodology similar to Beyer’s (for more details you can read the notebooks).
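
For readers who prefer to see the shape of such a figure, here is one possible way to turn a z-score into a 0 to 100 number. It is a sketch under assumptions and not the formula in calculate_pace_figures.py: the center of 50, the 15 points per standard deviation, the clipping at the ends and the sample times are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev

def pace_figures(leader_times, center=50.0, points_per_sd=15.0):
    """Map second-call times of race leaders onto a clipped 0-100 scale."""
    mu = mean(leader_times)
    sigma = pstdev(leader_times)
    figures = []
    for t in leader_times:
        # A faster (smaller) time gives a positive z-score, hence (mu - t).
        z = (mu - t) / sigma if sigma else 0.0
        raw = center + points_per_sd * z
        figures.append(max(0.0, min(100.0, raw)))  # keep the figure inside 0..100
    return figures

# Hypothetical second-call times in seconds for three comparable races.
figures = pace_figures([46.0, 47.0, 48.0])
```

With this convention an average pace scores 50 and a faster pace scores higher. In practice the mean and standard deviation would have to come from comparable races (same distance and surface), which this sketch leaves out.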

I do not think that knowledge of programming is necessary to understand the details of this program; I could easily hide most of the implementation details from the notebooks but have placed them for documentation reasons more than anything else.