
View Full Version : Computer-Based Head-to-Head Handicapping


InControlX
10-31-2012, 12:47 PM
Several PA Users have posted some interesting theoretical approaches to computer-based handicapping. I think the willingness to share concepts and ideas is positive, and that we can to a certain extent help each other out without giving up secrets. Sometimes, though, I think the shared concepts are too vague to be of practical use and a more defined suggestion would be better. To that aim I will attempt to illustrate a simplified short version of a specific computer analysis technique I call "Head to Head" with which I've had good success. This technique utilizes no new age thought processes nor any artificial intelligence algorithms, but is more of a cookbook approach for a "test, verify, and validate" framework than a mysterious black box for selection.

If there is interest I will go into further detail, and yes, there is a lot more detail.

Disclaimers: Nothing is for sale here, nor will be. No guarantees. You might pitch your laptop against the wall in frustration if you attempt this. This is not claimed to be the best method ever nor the total solution to handicapping. You don't have to join a cult or go to grad school to try it. This is a method of handicapping, not a wagering strategy.

First, several things are needed to start:

1. Two years of delimited Chart Files.

2. The same two years of home made or purchased delimited Past Performance Files.

3. Visual Basic (VB) and good familiarity with its use. VB is not hard to learn.

4. A relatively modern laptop or desktop computer with a 900+ GB hard drive.

5. Eight to Twelve key independent handicapping Boolean Parameters of YOUR OWN DESIGN.

6. A LOT of free time.

The method described has only become practical on commercially available laptop and desktop personal computers in the past eight or so years. Prior to that, the memory and speed requirements would bog down the machines. Perhaps future equipment advances will permit expansion of this process to larger arrays.

The basic Head to Head method is formulated around the concept that unlike Blackjack, Roulette, other games of chance and pure statistics, horse racing is a contest between competing entrant horses. I'm surprised how often this fact is ignored by approaches which import analysis methods from other studies. Playing cards don't compete to get to the top of the deck, have class levels, nor are (legally) manipulated by their owners.

Let's take a quick jump past 1 through 4 above, assume they are in hand or at least accessible, and delve into number 5.

If you've been at this game for a while I'm sure you have some favorite things to see in an entry's past performances which indicate a good performance is pending under the right conditions. The key in setting up a Head to Head run is to define eight to twelve of them in Boolean (true/false) format. I am not going to divulge the parameters I use, and I don't recommend you do either. What I suggest is that you plug YOUR favorites into the Head to Head and see what you discover. What has usually happened for me is finding that I have to refine my initial parameter list and start over. Although tedious, this eventually generates a proven handicapping approach after three to five trials.

There is no rule against using pre-filters. If your best filters fit one particular class, say dirt routes, just run those. Note, however, that pre-filters will cut into your sample counts and, if too restrictive, will cause problems later. It's all a trade-off.

You will note that this method does not pick spot plays, where a specific pattern is found which yields a high success rate, but rather finds races where the head-to-head competition stacks one entrant as a standout. In Head-to-Head analysis we automatically consider the strengths and weaknesses of all the entrants, not just the spot play pick. We also find the key combinations of Boolean Parameters that are best, without having to guess ahead.

The Boolean Parameters must reduce down to true/false determinations about the horse's past performances prior to the race. Of course, the more predictive the parameters are, the better your eventual results will be. However, if you focus too much on speed figures or obvious indicators for the majority of your parameters, you will likely end up with a "morning line favorite" picker with a fine winning percentage (40%+) but a poor ROI (80% or so). Try to use your more obscure handicapping edges which are not so obvious or well known. An example set of Boolean Parameters (definitely not my best, but not bad) is:

(1) 1 Has made early position gain at this class before at Sprint Distances.

(2) 2 Has made late position gain at this class before at Sprint Distances.

(3) 4 Has made early position gain at this class before at Route Distances.

(4) 8 Has made late position gain at this class before at Route Distances.

(5) 16 Has made early position gain at greater than this class before at Sprint Distances.

(6) 32 Has made late position gain at greater than this class before at Sprint Distances.

(7) 64 Has made early position gain at greater than this class before at Route Distances.

(8) 128 Has made late position gain at greater than this class before at Route Distances.

(9) 256 Has made early position gain at less than this class before at Sprint Distances.

(10) 512 Has made late position gain at less than this class before at Sprint Distances.

(11) 1024 Has made early position gain at less than this class before at Route Distances.

(12) 2048 Has made late position gain at less than this class before at Route Distances.

Each of the Boolean Parameters is assigned a value of consecutive powers of "2". These are shown after the index numbers above as 1, 2, 4, 8, ...2048.

The purpose in defining the Boolean Parameters is to differentiate the entrants into a fixed number of possible Preparation Groups consisting of all combinations. An entrant's individual Preparation Group is identified by the sum of values for TRUE-answered Boolean Parameters. In this example we have twelve Boolean Parameters, which will yield 2^12 = 4096 possible Preparation Groups. Note that each Boolean Parameter MUST be answerable with a clear true/false response. There can be no gray area in the question.
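The encoding above can be sketched in a few lines. Python is used here rather than VB (language choice comes up later in the thread); the function name and the answer list are hypothetical, not part of the original method's code.

```python
# Sketch: collapse twelve true/false answers into one Preparation Group
# number. Each parameter owns a power of 2, so every combination maps to
# a unique integer from 0 to 4095 (2^12 - 1).

def preparation_group(answers):
    """Sum 2**i for every parameter i answered True."""
    return sum(2 ** i for i, true_flag in enumerate(answers) if true_flag)

# A horse answering True on parameters (1) and (11) -> values 1 and 1024:
answers = [True] + [False] * 9 + [True, False]
print(preparation_group(answers))  # 1 + 1024 = 1025
```

Note that 1025 is exactly the PGrp shown for the top-ranked horse in the example race later in the post.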

Now a word or two about Type Discriminators is needed. We have some good evidence that not all races favor the same preparation. Therefore, we need to sub-categorize the race types by some logical division. For now, let's choose four types: dirt sprint, dirt route, turf sprint, and turf route. The artificial surfaces are included with "dirt", or you could add two more types for them.

The next step is to turn each Boolean Parameter into a Visual Basic program filter and, one by one, go through the first database year's charts with the VB program, keeping track of each Preparation Pattern Group's record against each other. A "competitive victory" is defined as the PGrp(X) entrant finishing in at least the top three positions and the PGrp(Y) entrant finishing behind the PGrp(X) entrant. (PGrp(X) and PGrp(Y) are two Preparation Pattern Groups.) The past performance files are opened to yield the data for testing the Boolean Parameters and determining each entrant's Preparation Pattern Group. In English, the result is a scorecard of each Preparation Pattern AGAINST each other Preparation Pattern over the year's database. Because we are counting head-to-head outcomes we derive many more data samples than single-pattern win/loss analyses, which yield only one sample per race. A single six-horse field in this method provides fifteen samples (each of the three top finishers compared against the five other entrants); a field of nine yields twenty-four. A second run tabulates the results into a head-to-head comparison array over the first database year which holds the winning ratios of PGrp(X) vs. PGrp(Y).

HTH(PGrp(X), PGrp(Y), Type) where X = 0 to 4095, and Y = 0 to 4095 and Type = 0 for dirt sprints, 1 for dirt routes, 2 for turf sprints, and 3 for turf routes.
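The scorecard pass can be sketched as follows (again in Python for brevity; `tally_race`, `wins`, and `meetings` are hypothetical names, not the original VB). Each top-three finisher is compared against every other entrant: a win if the other finished behind it, a loss if ahead, which reproduces the sample counts quoted above.

```python
# Sketch of the scorecard pass over one race's chart. A six-horse field
# yields 3 * 5 = 15 samples, a nine-horse field 3 * 8 = 24, as in the text.
from collections import defaultdict

def tally_race(finish_groups, wins, meetings):
    """finish_groups: Preparation Group numbers in finishing order."""
    for pos, pgrp_x in enumerate(finish_groups[:3]):      # top three only
        for pos_y, pgrp_y in enumerate(finish_groups):
            if pos_y == pos:
                continue
            meetings[(pgrp_x, pgrp_y)] += 1
            if pos_y > pos:                               # beaten -> victory
                wins[(pgrp_x, pgrp_y)] += 1

wins, meetings = defaultdict(int), defaultdict(int)
tally_race([5, 9, 2, 7, 3, 0], wins, meetings)            # made-up groups
print(sum(meetings.values()))  # 15 samples from a six-horse field
```

The ratio HTH(X, Y) would then be wins divided by meetings for each (X, Y) pair, per race type.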

A third program run uses the head-to-head comparison array derived above in an attempt to predict race outcomes in the second year's database of charts. This essentially repeats the second program run but adds a ranking determination, sorting each race by HTH matrix values to yield a Race Prediction Matrix on new data. We use the second year's charts to confirm or disprove predictability: if the first year's patterns result in a poor winning percentage and ROI when applied to the second year's data, we can hardly have confidence of success in future application to real-time entries. The Race Prediction Matrix is easier to understand by example. The inserted figure is the Race Prediction Matrix for Belmont Race 1 on October 26, 2012. In the matrix, the head-to-head winning percentage ratios are entered for each entrant vs. each other for the entered race type, in columns vs. rows alignment. These are the HTH array values previously determined. Each matrix column is summed, and that sum divided by the total number of entrants minus one yields the Average Competitive Ratio for each entrant, printed beneath the matrix. It is the Average Competitive Ratio that is used to rank the entries.
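The column-sum ranking step can be sketched like this (Python sketch; `hth` stands in for a lookup into the array built in the previous step, and the toy ratio table is invented purely for illustration):

```python
# Sketch: rank a race by Average Competitive Ratio. hth(x, y) is assumed
# to return the database ratio of PGrp x beating PGrp y; each entrant's
# column is summed and divided by (field size - 1).

def average_competitive_ratios(entrant_groups, hth):
    """Column sum of head-to-head ratios, averaged over the opposition."""
    n = len(entrant_groups)
    return [
        sum(hth(x, y) for j, y in enumerate(entrant_groups) if j != i) / (n - 1)
        for i, x in enumerate(entrant_groups)
    ]

# Toy ratio table (hypothetical): the lower group number wins 60% of the time.
def toy_hth(x, y):
    return 0.5 if x == y else (0.6 if x < y else 0.4)

acr = average_competitive_ratios([3, 1, 2], toy_hth)
ranking = sorted(range(3), key=lambda i: -acr[i])  # entrant in group 1 first
```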


The Average Competitive Ratios are then used to rank the entrants' predicted finishing positions from first to last:

Race 1 BEL 20121026 Predictive Rank

Dirt 1 M Purse 30,000

Fillies and Mares 3 Year Olds And Up CLAIMING ( $15,000 )

1 #6 pp 6 [0.739] PGrp= 1025 Coast of Sangria M/L= 2-1

2 #2 pp 2 [0.535] PGrp= 2060 Glynisthemenace M/L= 3-1

3 #1 pp 1 [0.498] PGrp= 3 Miss Brass Bonanza M/L= 20-1

4 #7 pp 7 [0.48] PGrp= 2176 Miss Libby M/L= 12-1

5 #8 pp 8 [0.474] PGrp= 2056 Destination Moon M/L= 15-1

6 #4 pp 4 [0.457] PGrp= 1 File Gumbo M/L= 8-5

7 #3 pp 3 [0.408] PGrp= 0 Katy's Office Girl M/L= 12-1

8 #5 pp 5 [0.408] PGrp= 0 So Much Heart M/L= 30-1

This race was chosen as an example because the first place pick's Average Competitive Ratio (0.739 for #6) is much greater than the second place pick's (0.535 for #2), indicating a considerable advantage (GAP = 0.739 - 0.535 = 0.204). Although in initial test runs I include all rankings, I later subdivide the results according to the GAP to establish a practical limit. An increasing winning percentage with GAP is a good sign that you're on to something.

A few other observations:

- In the matrix, the opposing rows and column entries (i.e., 3 vs. 4 and 4 vs. 3) should add up to 1.000, because if PGrp(X) beats PGrp(Y) 0.600 or 60% of the time, PGrp(Y) must beat PGrp(X) 0.400 or 40% of the time.

- The last two rated entries in the example race have no "true" Boolean Parameters and thus are in Preparation Group Zero. I use a race filter which skips any race that has a Group Zero entrant with an M/L of less than 5:1, and allows only one below 10:1.

- The Normalized Predictive Odds includes a few other factors than just Average Competitive Ratio, too messy to include now.

- A large GAP between 2nd and 3rd picks, and 3rd and 4th picks, and so on can be tested and used for exotics in more elaborate wagering.

After test application for tens of thousands of races a good evaluation is determined on the predictive quality of the original Boolean Parameters. If the results are not good, adjust and try again. I usually have a laptop or two running continuously over parameter or filter iterations.

It's a good idea to continuously monitor the success rate of selection gap picks. I perform this on a monthly basis. It's also prudent to refine and optimize pre-filters and post-filters around a selected set of Boolean Parameters. In other words, it never ends.

ICX

DeltaLover
10-31-2012, 02:20 PM
Very interesting....

You are packing a lot into just a single post, covering way too many topics...

I think we need to go a bit slower taking them one by one.....

Let's start:

Of course, the more predictive the parameters are the better your eventual results will be..

Please clarify..

How do you measure how predictive a parameter is?

What you call a Boolean parameter I would define as a predicate function (with no side effects) returning True or False, accepting a single argument which must be a starter. This function implements a decision tree whose leaves are boolean values.

Each starter has access to its own primitive data covering ratings, rankings and figures, and also has access to all its competitors and their summarized rankings.

Based on this definition, the universe of all possible factors is infinite.

Your statement about how predictive a parameter is, is fundamental for our approach, so I suggest we spend some time and effort clarifying exactly this...

What are your thoughts?

PICSIX
10-31-2012, 02:29 PM
Are the "predictive ranks" basically a power rating assigned to each entrant?

DeltaLover
10-31-2012, 02:54 PM
My understanding is that here we are talking about the predictiveness of a specific boolean factor. Having a factor f, its predictiveness will be a number expressing its quality, so we will be able to compare it against another one.

InControlX
10-31-2012, 03:12 PM

Good questions and comments, DeltaLover.

I agree it's a lot for one post, but if I had left too many gaps I thought it wouldn't tie together and the purpose would be lost.

I call the initial determinations Boolean simply because they are dimensioned as Visual Basic Boolean values, i.e., only true/false, so that they can correspond to a final unique binary sum of 2's power values.

The validation of a candidate predictive parameter is only realized by following the whole process through and seeing winning percentage/ROI improvement, although some obvious conclusions can be drawn for shortcuts. I've struggled on selections of these over the years, especially with class determinations and back time limits on qualifying races. I need improvements on several of them, but the neat thing about this approach is that it "automatically" identifies the key interrelationships between the Boolean Parameters. In essence, the results tell you how they combine for advantage (or not).

ICX

InControlX
10-31-2012, 03:14 PM
Are the "predictive ranks" basically a power rating assigned to each entrant?

Correct, with the power rating being the average of the database head-to-head advantage ratios of each entry vs. each of the others.

ICX

DeltaLover
10-31-2012, 04:13 PM
I call the initial determinations Boolean simply because they are dimensioned as Visual Basic Boolean values, i.e., only true/false, so that they can correspond to a final unique binary sum of 2's power values

I am following a pretty similar approach. Representing all matching factors as a binary number makes it very easy to search for matches and patterns, both in a traditional relational database or just by using a program written in a more imperative environment.

VB, or any other language of the .NET environment, MySQL, SQL Server and most similar technologies present some restriction as far as the actual number of bits that can represent each individual factor. Since the largest natively supported integer type in these is the long integer, handling more than 64 factors cannot easily be accomplished, and you have to add some complexity to allow your application to handle it... This is just one of the reasons I find PYTHON to be a great solution for any research and development project... It handles integers of any size in exactly the same way, without any need to cast or reallocate...
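A quick illustration of the point about arbitrary-precision integers (the factor indices below are made up; the only claim is that Python ints have no 64-bit ceiling):

```python
# A mask of 100+ boolean factors is still a single Python int; there is
# no 64-bit ceiling as with a VB/SQL long integer.
mask = 0
for factor_index in (0, 63, 99):   # set bits well past the 64-bit boundary
    mask |= 1 << factor_index

print(mask.bit_length())           # 100
print(bool(mask & (1 << 63)))      # True: factor 63 is set
print(bool(mask & (1 << 1)))       # False: factor 1 is not
```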

The validation of a candidate predictive parameter is only realized by following the whole process through and seeing winning percentage/ROI improvement, although some obvious conclusions can be drawn for shortcuts. I've struggled on selections of these over the years, especially with class determinations and back time limits on qualifying races

You have various alternatives that you can use to quantify the effectiveness of a factor: its winning percentage, impact value, weighted impact value, ROI and final PNL are some of them. I think though that the better approach would take into consideration its frequency and its final ROI. The reason I am adding its frequency, in other words how often it occurs, has to do with avoiding overfitting, which could be a serious issue as the factor granularity increases...

Capper Al
10-31-2012, 04:30 PM
Interesting approach.

InControlX
10-31-2012, 04:31 PM

Python looks very adaptive and more integer-size tolerant, but my problem with changing from VB is that I've got about fifty custom applications in VB I need for work, and I need to tweak them often. I've migrated platforms in the past and it's always been painful keeping the formats straight. That said, this method is certainly not restricted to VB.

I generally sort a new set of initial parameters (or modified previous ones) into a big results array including descriptor code elements of surface, purse, distance, race condition, etc., which I can use as post-filters. My rule of thumb has been a minimum filtered quantity of 1000/year. If I filter the sampling much below 1K the error margin creeps up and, as you mention, the granularity produces over-optimistic results.

ICX

eurocapper
11-01-2012, 07:30 AM
I'm afraid that, to me, this has some aspect of looking for the ultimate truth in horse racing instead of what is profitable for the time being. I believe research-oriented persons are prone to this; personally I would focus on value or longshot analysis.

vegasone
11-01-2012, 09:12 AM
This looks to me like genetic algorithms.

InControlX
11-01-2012, 09:36 AM
I'm afraid that, to me, this has some aspect of looking for the ultimate truth in horse racing instead of what is profitable for the time being. I believe research-oriented persons are prone to this; personally I would focus on value or longshot analysis.

You get out a ranking based upon the quality of your initial parameters. If you use common inputs you get common outputs. The trick is to start with some parameters that define an advantage not widely applied by others.

ICX

InControlX
11-01-2012, 09:37 AM
This looks to me like genetic algorithms.

Assuming you mean generic, you are correct. There is no unusual math involved in the method.

ICX

DeltaLover
11-01-2012, 11:24 AM


It is not very clear what you mean when you say to start with some parameters defining an advantage not widely applied by others.

The most primitive level of your data will be common to anyone and published in the public domain. Going a level higher, we have a normalized form of this common data that can be presented using many different methodologies: Ragozin, Thoro-Graph, Beyer, Equibase or CJ figures are all measuring the same dimension using very common data (there can be some additional info from some of them, as Ragozin or Brown claim), but these 'numbers' will all be correlated enough to be considered at least similar. This level can be composed into even higher-level 'ratings', like BRIS Prime Power for example, serving as an index that can describe a starter with a single-dimensional number as opposed to an array of speed figures for each past start. Then we can add another layer expressing a quality of the event itself, for example an index describing how 'much' speed or stamina is present.

Based on this, I can see the advantage coming not from the parameters themselves but from another synthetic level combining these layers of data with their perception by the public (as expressed in the pools), always searching for inefficiencies that leave some space for price correction.

DeltaLover
11-01-2012, 11:25 AM
This looks to me like genetic algorithms.

Can you please explain better what looks like a genetic algorithm?

vegasone
11-01-2012, 11:31 AM
Genetic. The same concept.
In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, is evolved toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached

DeltaLover
11-01-2012, 11:46 AM

Cool.... But I didn't ask for a definition of a GA, rather where you see the similarities with what we discuss here...

Since you mentioned GAs, I have to comment that for our domain here a similar approach called Genetic Programming might be more applicable, since each chromosome is then represented by an expression tree rather than a sequence of bits, which might discover more complicated and fitter solutions...

Such a conversation might be off topic for this specific thread, but we can start a new one discussing exclusively GA and GP, although I think it is better to concentrate on one topic at a time until it is sufficiently exposed and then move to the next....

podonne
11-01-2012, 01:26 PM
Interesting post. Nice to see discussion like this. I want to consider this in some depth, but just off the top of my head I would pay close attention to sample sizes for your groups.

You mention having 12 binary factors (or classifiers) and 4 groups of races, so you really have 2^12*4 different groups when looking at all races, or 16,384. If you could ensure that every group had 100 entries, at an average of 8 entries per race you need 204.8K races in your database.
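That sample-size arithmetic checks out directly (a hypothetical back-of-envelope check, using the figures from the paragraph above):

```python
# 12 binary factors and 4 race types give 2**12 * 4 group/type
# combinations; at 100 entries per group and an average of 8 entries
# per race, the required race count follows.
groups = 2 ** 12 * 4
entries_needed = groups * 100
races_needed = entries_needed // 8
print(groups, races_needed)  # 16384 204800
```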

But of course you can't ensure the horses will be evenly divided. Entries matching a large number of factors will be rare, perhaps impossible to have all of them at the same time, which means some horses will fit into a group that you have very little data on. So you have to account for the uncertainty that it creates.

Also, it's not necessary to use bit-wise mathematics with modern programming languages and database software (including VB). A unique list of strings of 1s and 0s, or a database table with a primary key and binary columns, will both accomplish the same task but be more understandable to you (and to people trying to understand your approach) without sacrificing performance.

Just some thoughts, not criticism. I run into these problems all the time. :)

podonne
11-01-2012, 01:31 PM

You probably don't need a genetic algorithm for your approach here, even though it sounds similar. GAs are best when exploring very large search spaces, but with 12 factors (2^12 combinations), your search space is not too large to explore exhaustively.

If you had 24 factors (2^24 combinations), then yeah, you'd need a more efficient method of exploring what all those different combinations do, besides "trying them all one at a time", thus the need for a GA.

Also, GAs typically need an easily calculated fitness function to determine which candidate solution is the "best". Your approach, as I read it, does not lend itself to that.

DeltaLover
11-01-2012, 01:54 PM


What I have done in the past when taking the GA/GP approach was to create a high-level DSL introducing the operators I thought might be of interest, letting the GP evolve the fittest expression trees based on my fitness function(s). For this I used a custom LISP-like simple language, following an approach based on Koza's original work....

As you correctly point out, the required sample size increases dramatically as the chromosomes become more complex, tending to generate over-fitted results.

Capper Al
11-01-2012, 02:22 PM
Are we saying that the value of each attribute selected increases by a power of 2? Most handicapping factors' IVs are in a close range. Yes, pace is better than class, but is it possible that we can be saying 2048 times better? I just might be looking at this wrong. Please clarify.

Thanks

Red Knave
11-01-2012, 02:39 PM
I just might be looking at this wrong. Please clarify.

I think the OP is using powers of 2 to give each parameter a unique binary value.
If you think of a string of 1s and 0s, each position from right to left represents a power of 2. You can use these strings to compare against each other and do math and boolean stuff like And, Or, Xor etc.

podonne
11-01-2012, 02:41 PM
Are we saying that the value of each attribute selected increases by the power of 2? Most handicapping factors IVs are in a close range. Yes Pace is better than class, but is it possible that we can be saying 2048 times? I just might be looking at this wrong. Please clarify.

Thanks

It's not as complicated as that. He's just saying that he's using 12 binary factors to create 2^12 distinct groups. So every horse fits in one of 2^12 groups, and comparing all the groups against all the other groups gets you a sense for how a future horse in one group will fare against a future horse in another group. Aggregate that enough times over enough horses and you get a sense for which groups of horses will beat which other groups of horses in a race.

It's an abstraction method: the race becomes not about the horses in the race, but a race between different groups. You try to figure out which group will win and then bet the horse in the race that is in that group.

podonne
11-01-2012, 02:43 PM

What do you do when two horses in a race are in the same group? Granted it might not happen very often with 2^12 groups, but it occasionally will.

DeltaLover
11-01-2012, 03:21 PM


As said, having N binary factors introduces 2^N possible masks.

Most of these masks will fall into one or both of these two cases:

- Have very little data or no data at all

In my systems I impose a minimum number of winners for
every mask in order to be considered. Usually it is 10 or
20.

This restriction plays a key role in my algorithm for
the creation of the universe of all masks:

I start with a group consisting of only the individual factors
(for our thread that is 12; in my case over 100) and throw away
masks with fewer than 10 winners.

Then I create all the possible pairs, doing the same.
I continue, each time extending the length of the combination
by 1, until there is no change at all.

- Although having enough data, do not present any value

After having processed every mask, those
that fall within 1 standard deviation of the mean ROI
are thrown away. I have done several experiments with
weight assigning, but for the moment I just assign 2, 1, -1, -2,
expressing the number of stddevs.
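The grow-by-one loop described above reads like an apriori-style search, and can be sketched in a few lines. This is only an illustration of the pruning idea: `winners_for` is a hypothetical lookup into one's own results database, and `MIN_WINNERS` is the stated threshold of 10.

```python
# Hedged sketch of the mask-growing loop: start from single factors, extend
# surviving combinations by one factor at a time, and prune any combination
# whose historical winner count falls below the minimum.

MIN_WINNERS = 10

def build_mask_universe(factor_ids, winners_for):
    """Return all factor combinations that clear the minimum-winners cut."""
    # Level 1: single factors that survive on their own.
    survivors = {frozenset([f]) for f in factor_ids
                 if winners_for(frozenset([f])) >= MIN_WINNERS}
    universe = set(survivors)
    while survivors:
        # Extend each surviving mask by one surviving single factor.
        singles = {next(iter(s)) for s in universe if len(s) == 1}
        candidates = {mask | {f} for mask in survivors for f in singles
                      if f not in mask}
        survivors = {m for m in candidates
                     if m not in universe and winners_for(m) >= MIN_WINNERS}
        universe |= survivors
    return universe
```

Because any combination containing a pruned sub-mask can never be generated, the 2^N space collapses to something tractable even with 100+ factors.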

podonne
11-01-2012, 03:57 PM
As said, having N binary factors introduces 2^N possible masks.

Most of these masks will fall into one or both of these two cases:

- Have very little data or no data at all

In my systems I impose a minimum number of winners for
every mask in order to be considered. Usually it is 10 or
20.

This restriction plays a key role in my algorithm for
the creation of the universe of all masks:

I start with a group consisting of only the individual factors
(for our thread that is 12; in my case over 100) and throw away
masks with fewer than 10 winners.

Then I create all the possible pairs, doing the same.
I continue, each time extending the length of the combination
by 1, until there is no change at all.

- Although having enough data, do not present any value

After having processed every mask, those
that fall within 1 standard deviation of the mean ROI
are thrown away. I have done several experiments with
weight assigning, but for the moment I just assign 2, 1, -1, -2,
expressing the number of stddevs.

100 is a lot of factors! For those listening, that's 2^100 = 1,267,650,600,228,229,401,496,703,205,376 distinct groups/masks. More than all the horses who have ever run, ever! :)

In your mask-building process above (which is a bit different from the original post), do you ensure that every horse still falls into one and only one group? It gets more complicated if a horse can fit more than one group, or none at all, and then there's the question of horses landing in the same group.

Also, I would not use the standard deviation of the mask's ROI to remove masks. That's only appropriate when you use past behavior to predict future behavior. For partitioning a population into "meaningful" and "not meaningful" sub-groups you need something like a Chi-squared test, or a binomial test since these are binary factors.
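For what a binomial check might look like, here is a minimal stdlib sketch using the normal approximation; the win counts and base rate are made-up illustration numbers, not anything from the thread.

```python
# Sketch of a binomial significance check: does a mask's win count differ
# from what the overall base win rate would predict? Normal approximation
# to the binomial, pure stdlib.
import math

def binomial_z(wins, starts, base_rate):
    """Z-score of observed wins vs. expectation under base_rate."""
    expected = starts * base_rate
    sd = math.sqrt(starts * base_rate * (1 - base_rate))
    return (wins - expected) / sd

# A mask with 30 winners from 100 starters against a 10% base win rate:
z = binomial_z(30, 100, 0.10)
print(round(z, 2))  # about 6.67: far beyond 2 standard deviations
```

An exact two-sided test (e.g. `scipy.stats.binomtest`) is preferable for small samples, but the z-score version shows the idea without dependencies.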

Still not criticizing your work, just recognizing thought processes similar to paths I have gone down in the past without knowing the best way to approach a particular problem. This game is incredibly complex in the number of ways you can approach it.

Capper Al
11-01-2012, 04:17 PM
Won't the top rated horse by IV still be your top rated horse going across the matrix the vast majority of the time?

InControlX
11-01-2012, 04:17 PM
Several good observations have been posted on the number of determinant groups (binary parameters in my original post) and on sample size. It is correct that my posted example adds a multiplier of 4 to differentiate basic race type, so yes, the group count over all types of races is 4 x 2^12 = 16,384. A two-year DRF Chart database contains about 185,000 races, but since the head-to-head factoring goes down to the 3rd finisher, we gain about 18 comparison results per race, for a grand total of about 3.33 million increments, or a very rough average of 200 per prep pattern.

As pointed out, the distribution is a far sight from linear even with the most general of initial parameters. With VB it is usually necessary to limit the counts to 30K or so to avoid overflow in the common preparation-group comparisons (unless you have the luxury of DeltaLover's Python!), while some combinations will have tiny counts. I use a low-count handler routine which reverts first to a backup value of the overall win ratios, then to a default 0.500 if the counts are really small, say fewer than 10.

To answer the same-group power rating dilemma... it's a tie, so skip the race. I also skip any race where the top average entry has more than one of the default ratios explained above.
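The fallback chain described above is simple to sketch. Function and variable names here are mine, but the thresholds and defaults are the ones stated (observed ratio when the sample is big enough, overall win ratio otherwise, neutral 0.500 as the last resort):

```python
# Sketch of the low-count fallback for a head-to-head comparison ratio.
# Note on the counts: in an 8-horse field, finishers 1-3 each "beat" the
# horses behind them, giving 7 + 6 + 5 = 18 comparisons per race.

MIN_COUNT = 10

def h2h_ratio(wins_ab, total_ab, overall_ratio_a=None):
    """Probability-like rating that a group-A horse beats a group-B horse."""
    if total_ab >= MIN_COUNT:
        return wins_ab / total_ab       # enough data: use the observed ratio
    if overall_ratio_a is not None:
        return overall_ratio_a          # first fallback: overall win ratio
    return 0.500                        # final fallback: neutral default

print(h2h_ratio(14, 20))        # 0.7
print(h2h_ratio(3, 5, 0.55))    # 0.55 (sparse sample, use backup)
print(h2h_ratio(3, 5))          # 0.5  (no backup available)
```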

ICX

DeltaLover
11-01-2012, 04:21 PM
What do you do when two horses in a race are in the same group? Granted it might not happen very often with 2^12 groups, but it occasionally will.

This will happen for sure... A good example is first-time starters, which have very little data to be used...

You cannot do much in this case.

For the purposes of your model these two starters will be identical.

For example, in my system this model creates a rating of positive over negative votes. Each starter has an array of matching masks, each with its own weight (2, 1, -1, -2). The positive total divided by the negative total creates the rating.

Of course, if both starters have identical masks then their ratings will be the same and they cannot be distinguished further.

This could be a sign that we need more refined factors; adding more is not guaranteed to resolve the issue.

This is one of the reasons why we need multiple signals for a real-world strategy and cannot rely on a single one, but for now let's focus on boolean factors for the purpose of this discussion..
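The vote rating just described can be sketched directly. The weight values 2, 1, -1, -2 are the stated ones; the function name and the treatment of an all-positive starter are my own illustration choices.

```python
# Sketch of the vote rating: each starter carries the weights of its
# matching masks; the rating is the positive total over the negative total.

def vote_rating(mask_weights):
    """Ratio of positive vote weight to negative vote weight."""
    positive = sum(w for w in mask_weights if w > 0)
    negative = -sum(w for w in mask_weights if w < 0)
    if negative == 0:
        # No negative votes at all: treat as unbeatable (or neutral if empty).
        return float('inf') if positive else 0.0
    return positive / negative

print(vote_rating([2, 1, 1, -1]))   # 4 positive vs 1 negative -> 4.0
print(vote_rating([2, -1, -2]))     # 2 positive vs 3 negative -> ~0.67
```

Two starters with identical mask arrays get identical ratings, which is exactly the tie case discussed above.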

DeltaLover
11-01-2012, 04:26 PM
Won't the top rated horse by IV still be your top rated horse going across the matrix the vast majority of the time?

Not necessarily. It depends on what you are trying to model. If you are looking for winning frequency, then yes, IV might be a suitable approach. But if your model is looking for value, IV is not the way to go.

InControlX
11-01-2012, 04:30 PM
Are we saying that the value of each attribute selected increases by a power of 2? Most handicapping factors' IVs are in a close range. Yes, pace is better than class, but is it possible we are saying 2048 times better? I just might be looking at this wrong. Please clarify.

Thanks

Capper Al... No, the binary power number only means how many total combinations of different possible preparation patterns are used. Although some initial binary parameters will have a bigger win/loss or ROI effect than others, we don't need to know this going in. The results run will tell us whether our selections were good, i.e., whether finishing gaps correspond to wins/ROI.

Also, in the determination by final run gaps we find races where a real dog of an entry becomes the pick selection because, bad as its pattern is, the entry is superior to the remaining starters by a good margin. Even so, I still have trouble "pulling the trigger" in these races!

ICX

DeltaLover
11-01-2012, 05:12 PM
Also, I would not use the standard deviation of the mask's ROI to remove masks. That's only accurate when you use past behavior to monitor future behavior. For partitioning a population into "meaningful" and "not meaningful" sub-groups you need something like a Chi-squared test, or a binomial test since they are binary factors.

As I answered Al, I think it depends, and the correct answer comes more from a trial-and-error process than a clear analytical proof.

Yes, Chi-squared is a more accurate test than the one I described, and I am extensively using it in other models... It just happens that the one I described here uses this simpler method...

podonne
11-01-2012, 05:37 PM
As I answered Al, I think it depends, and the correct answer comes more from a trial-and-error process than a clear analytical proof.

Yes, Chi-squared is a more accurate test than the one I described, and I am extensively using it in other models... It just happens that the one I described here uses this simpler method...

Fair enough. I'm sure you are using it in the proper context.

Takes me back to the first time I tried something like that. It just seemed so simple: how do I know if a factor is different enough? Just look at a huge number of factors and calculate the ROI's standard deviation, so > 2 devs means I'm 95% sure it's different!

Then someone pointed out that measurements of > 2 deviations are expected in any random process; the real question is whether they happen more often than chance alone would predict (chance gives you about 5% of them). But that only tells you whether your sample distribution resembles a normal distribution; it doesn't give you a means of filtering out particular samples, especially without a time dimension of some kind.

Seductive in its simplicity, though...
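The point is easy to demonstrate by simulation: give every fake "factor" the same true win rate, so none of them has any edge, and roughly 5% of them will still land beyond 2 standard deviations. The trial counts and base rate below are arbitrary illustration values.

```python
# Simulation of the multiple-comparisons trap: random factors with no edge
# still produce >2 standard-deviation "discoveries" about 5% of the time.
import random

random.seed(42)
TRIALS, SAMPLES = 2000, 400
BASE_RATE = 0.10  # the true win rate for every fake factor

exceed = 0
for _ in range(TRIALS):
    wins = sum(random.random() < BASE_RATE for _ in range(SAMPLES))
    mean = SAMPLES * BASE_RATE
    sd = (SAMPLES * BASE_RATE * (1 - BASE_RATE)) ** 0.5
    if abs(wins - mean) > 2 * sd:
        exceed += 1

print(exceed / TRIALS)  # hovers near 0.05 even though no factor has any edge
```

So a mask sitting 2+ stddevs from the mean ROI is only interesting if such masks show up more often than this baseline rate.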

DeltaLover
11-01-2012, 05:52 PM
Then someone pointed out that measurements of > 2 deviations are expected in any random process; the real question is whether they happen more often than chance alone would predict (chance gives you about 5% of them).
:ThmbUp:

It took me several iterations to understand this concept.. Of course we can also use a population of totally random factors as a comparison baseline..

Magister Ludi
11-01-2012, 05:54 PM
Genetic Algorithms (GA) are frequently used with neural networks. Once a network has been successfully trained to its training parameters, its neurons are "mutated" to create another parent network. The trained network and the mutated network create a child through genetic crossover. The resulting child network is trained and tested. If the resulting network outperforms its parents, its neurons are "mutated" and it is "bred" to create another network. In other words, GA are used only after a network has been successfully trained to create a more accurate and robust solution.

DeltaLover
11-01-2012, 06:22 PM
Genetic Algorithms (GA) are frequently used with neural networks. Once a network has been successfully trained to its training parameters, its neurons are "mutated" to create another parent network. The trained network and the mutated network create a child through genetic crossover. The resulting child network is trained and tested. If the resulting network outperforms its parents, its neurons are "mutated" and it is "bred" to create another network. In other words, GA are used only after a network has been successfully trained to create a more accurate and robust solution.

Sure... In the past I have written modules to train a NN using a GA, where the weight of each neuron was assigned by the GA, avoiding backpropagation, sigmoid functions, etc... Presently, though, I no longer implement either one (NN or GA) myself, since I am using open-source libraries that make them an implementation detail of the whole platform, allowing me to shift my focus to the domain rather than to low-level details.

podonne
11-01-2012, 08:18 PM
Genetic Algorithms (GA) are frequently used with neural networks. Once a network has been successfully trained to its training parameters, its neurons are "mutated" to create another parent network. The trained network and the mutated network create a child through genetic crossover. The resulting child network is trained and tested. If the resulting network outperforms its parents, its neurons are "mutated" and it is "bred" to create another network. In other words, GA are used only after a network has been successfully trained to create a more accurate and robust solution.

True, but there are a ton more applications than just fine-tuning neural networks. It's really powerful when faced with a huge range of solutions: you develop a bunch of random solutions, pick the best, combine them to create a bunch more solutions, pick the best, and repeat.

Not sure you meant to be restrictive, just didn't want to leave the impression that that application was its only purpose.
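For readers who want to see that generate/select/recombine loop concretely, here is a toy sketch. The bit-counting fitness function is a stand-in, not a handicapping factor, and the population sizes are arbitrary.

```python
# Toy genetic algorithm: random population, keep the best half, recombine
# with one-point crossover plus a single-bit mutation, repeat.
# Stand-in fitness: maximize the number of 1-bits in a 16-bit chromosome.
import random

random.seed(7)
BITS, POP, GENERATIONS = 16, 30, 40

def fitness(chrom):
    return sum(chrom)

def crossover(a, b):
    cut = random.randrange(1, BITS)
    child = a[:cut] + b[cut:]
    child[random.randrange(BITS)] ^= 1  # one-bit mutation
    return child

population = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    elite = population[:POP // 2]       # pick the best...
    population = elite + [crossover(random.choice(elite), random.choice(elite))
                          for _ in range(POP - len(elite))]  # ...and breed

best = max(population, key=fitness)
print(fitness(best))  # climbs toward the optimum of 16
```

Swap in an ROI-based fitness function over a set of masks and the same loop becomes a factor-combination search, with all the overfitting caveats already raised in this thread.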

For fun reading along the lines of this thread take a read about Learning Classifier Systems (LCS). Genetic algorithms + reinforcement learning + masks. Fun stuff. http://en.wikipedia.org/wiki/Learning_classifier_system