Computer-Based Head-to-Head Handicapping - Page 2 - Horse Racing Forum - PaceAdvantage.Com

vegasone · 11-01-2012, 11:31 AM

Genetic. The same concept.
In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, is evolved toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.

DeltaLover · 11-01-2012, 11:46 AM

Quote:

Originally Posted by vegasone

Genetic. The same concept.
In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, is evolved toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.

Cool... But I didn't ask for a definition of a GA, rather where you see the similarities with what we are discussing here...

Since you mentioned GAs, I should add that in our domain a similar approach called Genetic Programming might be more applicable: each chromosome is represented by an expression tree rather than a sequence of bits, which might discover more complicated and fitter solutions...

Such a conversation might be off topic for this specific thread, but we can start a new one discussing GA and GP exclusively, although I think it is better to concentrate on one topic at a time until it is sufficiently explored and then move on to the next...

podonne · 11-01-2012, 01:26 PM

Interesting post. Nice to see discussion like this. I want to consider this in some depth, but just off the top of my head I would pay close attention to sample sizes for your groups.

You mention having 12 binary factors (or classifiers) and 4 groups of races, so you really have 2^12*4 different groups when looking at all races, or 16,384. If you could ensure that every group had 100 entries, at an average of 8 entries per race you need 204.8K races in your database.
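The arithmetic behind those figures can be checked in a few lines. This is only a sketch of the estimate above; the variable names are illustrative, and the field size of 8 is the assumed average quoted in the post.

```python
# Back-of-the-envelope sample-size estimate using the figures quoted above.
n_factors = 12          # binary factors
n_race_types = 4        # groups of races
min_per_group = 100     # desired entries per group
entries_per_race = 8    # assumed average field size

n_groups = 2 ** n_factors * n_race_types
entries_needed = n_groups * min_per_group
races_needed = entries_needed / entries_per_race

print(n_groups)        # 16384
print(entries_needed)  # 1638400
print(races_needed)    # 204800.0
```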

But of course you can't ensure the horses will be evenly divided. Entries matching a large number of factors will be rare, perhaps impossible to have all of them at the same time, which means some horses will fit into a group that you have very little data on. So you have to account for the uncertainty that it creates.

Also, it's not necessary to use bit-wise mathematics with modern programming languages and database software (including VB). A unique list of strings of 1s and 0s, or a database table with a primary key and binary columns, will both accomplish the same task but be more understandable to you (and to people trying to understand your approach) without sacrificing performance.

Just some thoughts, not criticism. I run into these problems all the time.

podonne · 11-01-2012, 01:31 PM

Quote:

Originally Posted by DeltaLover

Cool... But I didn't ask for a definition of a GA, rather where you see the similarities with what we are discussing here...

Since you mentioned GAs, I should add that in our domain a similar approach called Genetic Programming might be more applicable: each chromosome is represented by an expression tree rather than a sequence of bits, which might discover more complicated and fitter solutions...

Such a conversation might be off topic for this specific thread, but we can start a new one discussing GA and GP exclusively, although I think it is better to concentrate on one topic at a time until it is sufficiently explored and then move on to the next...

You probably don't need a genetic algorithm for your approach here, even though it sounds similar. GAs are best when exploring very large search spaces, but with 12 binary factors (2^12 = 4,096 combinations), your search space is not too large to explore comprehensively.

If you had 2^24 combinations, then yeah, you would need a more efficient method of exploring what all those different combinations do than "trying them all one at a time", hence the need for a GA.

Also, GAs typically need an easily calculated fitness function to determine which candidate is the "best". Your approach, as I read it, does not lend itself to that.

DeltaLover · 11-01-2012, 01:54 PM

Quote:

Originally Posted by podonne

You probably don't need a genetic algorithm for your approach here, even though it sounds similar. GAs are best when exploring very large search spaces, but with 12 binary factors (2^12 = 4,096 combinations), your search space is not too large to explore comprehensively.

If you had 2^24 combinations, then yeah, you would need a more efficient method of exploring what all those different combinations do than "trying them all one at a time", hence the need for a GA.

Also, GAs typically need an easily calculated fitness function to determine which candidate is the "best". Your approach, as I read it, does not lend itself to that.

What I have done in the past when taking the GA / GP approach was to create a high-level DSL introducing the operators I thought might be of interest, letting the GP evolve the fittest expression trees based on my fitness function(s). For this I used a simple custom LISP-like language, following an approach based on Koza's original work...

As you correctly point out, the required sample size increases dramatically as the chromosomes become more complex, which tends to generate overfitted results.
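For readers unfamiliar with expression trees, here is a minimal sketch of what a GP individual over boolean factors could look like. It is not the LISP-like DSL mentioned above (which is not shown in this thread); the tuple encoding, the `evaluate` function and the factor names are all hypothetical.

```python
import operator

# Binary operators available to the tree (an assumed, minimal set).
OPS = {"and": operator.and_, "or": operator.or_}

def evaluate(tree, factors):
    """Evaluate a tree that is a factor name, ("not", subtree), or (op, left, right)."""
    if isinstance(tree, str):
        return bool(factors[tree])
    if tree[0] == "not":
        return not evaluate(tree[1], factors)
    op, left, right = tree
    return OPS[op](evaluate(left, factors), evaluate(right, factors))

# Hypothetical rule: won last time out AND NOT coming off a long layoff.
rule = ("and", "won_last", ("not", "long_layoff"))
print(evaluate(rule, {"won_last": True, "long_layoff": False}))  # True
```

A GP run would mutate and recombine such trees and score each one with a fitness function; deeper trees match fewer starters, which is where the sample-size concern comes in.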

Capper Al · 11-01-2012, 02:22 PM

Are we saying that the value of each attribute selected increases by a power of 2? Most handicapping factors' IVs are in a close range. Yes, pace is better than class, but can we really be saying 2048 times better? I just might be looking at this wrong. Please clarify.

Thanks

Red Knave · 11-01-2012, 02:39 PM

Quote:

Originally Posted by Capper Al

I just might be looking at this wrong. Please clarify.

I think the OP is using powers of 2 to give each parameter a unique binary value.
If you think of a string of 1s and 0s, each position from right to left represents a power of 2. You can use these strings to compare against each other and do math and boolean stuff like And, Or, Xor etc.
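A small sketch of that encoding, assuming three made-up factor names (the original post's 12 factors are not listed on this page):

```python
# Hypothetical factor names, for illustration only.
FACTORS = ["won_last", "dropped_in_class", "layoff_over_45_days"]

def to_mask(flags):
    """Pack a dict of boolean factors into one integer; factor i uses bit 2**i."""
    mask = 0
    for i, name in enumerate(FACTORS):
        if flags.get(name, False):
            mask |= 1 << i
    return mask

horse_a = to_mask({"won_last": True, "layoff_over_45_days": True})  # 0b101
horse_b = to_mask({"won_last": True, "dropped_in_class": True})     # 0b011

shared = horse_a & horse_b   # factors both horses have
either = horse_a | horse_b   # factors at least one horse has
differ = horse_a ^ horse_b   # factors exactly one horse has

print(format(shared, "03b"), format(either, "03b"), format(differ, "03b"))  # 001 111 110
```

The powers of 2 are only labels that keep each factor in its own bit; they say nothing about how much a factor is worth.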

podonne · 11-01-2012, 02:41 PM

Quote:

Originally Posted by Capper Al

Are we saying that the value of each attribute selected increases by a power of 2? Most handicapping factors' IVs are in a close range. Yes, pace is better than class, but can we really be saying 2048 times better? I just might be looking at this wrong. Please clarify.

Thanks

It's not as complicated as that. He's just saying that he's using 12 binary factors to create 2^12 distinct groups. Every horse fits in one of those 2^12 groups, so comparing all the groups against all the other groups gives you a sense of how a future horse in one group will fare against a future horse in another group. Aggregate that enough times over enough horses and you get a sense of which groups of horses will beat which other groups of horses in a race.

It's an abstraction method: the race is no longer about the individual horses in it, but a race between different groups. You try to figure out which group will win and then bet the horse in the race that belongs to that group.
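One way to picture the aggregation is a tally of group-versus-group results. The sketch below is an assumption about how such a tally might be kept, not the original poster's code; `record_race` and `win_ratio` are hypothetical names, and group ids are simply the integer masks.

```python
from collections import defaultdict

# wins[(a, b)] counts how often a horse in group a finished ahead of one in group b.
wins = defaultdict(int)

def record_race(finish_order):
    """finish_order: group ids listed by finishing position, winner first."""
    for i, ahead in enumerate(finish_order):
        for behind in finish_order[i + 1:]:
            if ahead != behind:          # same-group pairs carry no information
                wins[(ahead, behind)] += 1

def win_ratio(a, b):
    """Share of recorded a-vs-b meetings won by group a, or None with no data."""
    total = wins[(a, b)] + wins[(b, a)]
    return wins[(a, b)] / total if total else None

record_race([5, 3, 9, 3])
record_race([3, 5])
print(win_ratio(5, 3))   # group 5 finished ahead of group 3 in 2 of 3 meetings
```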

podonne · 11-01-2012, 02:43 PM

Quote:

Originally Posted by DeltaLover

What I have done in the past when taking the GA / GP approach was to create a high-level DSL introducing the operators I thought might be of interest, letting the GP evolve the fittest expression trees based on my fitness function(s). For this I used a simple custom LISP-like language, following an approach based on Koza's original work...

As you correctly point out, the required sample size increases dramatically as the chromosomes become more complex, which tends to generate overfitted results.

What do you do when two horses in a race are in the same group? Granted it might not happen very often with 2^12 groups, but it occasionally will.

DeltaLover · 11-01-2012, 03:21 PM

Quote:

Originally Posted by podonne

It's not as complicated as that. He's just saying that he's using 12 binary factors to create 2^12 distinct groups. Every horse fits in one of those 2^12 groups, so comparing all the groups against all the other groups gives you a sense of how a future horse in one group will fare against a future horse in another group. Aggregate that enough times over enough horses and you get a sense of which groups of horses will beat which other groups of horses in a race.

It's an abstraction method: the race is no longer about the individual horses in it, but a race between different groups. You try to figure out which group will win and then bet the horse in the race that belongs to that group.

As said, having N binary factors introduces 2^N possible masks.

Most of these masks will fall into one or both of these two cases:

- They have very little data or no data at all

In my systems I impose a minimum number of winners for every mask in order for it to be considered. Usually it is 10 or 20.

This restriction plays a key role in my algorithm for creating the universe of all masks:

I start with a group consisting of only the individual factors (12 for our thread, over 100 in my case) and throw away masks with fewer than 10 winners.

Then I create all the possible pairs, doing the same. I continue, each time extending the length of the combination by 1, until there is no change at all.

- Although they have enough data, they do not present any value

After the process has been completed for every mask, those that fall within 1 standard deviation of the mean ROI are thrown away. I have done several experiments with weight assignment, but for the moment I just assign 2, 1, -1, -2, expressing the number of stddevs.
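The level-by-level construction described under the first case can be sketched roughly as follows. This is a reading of the description, not the actual system: a mask is modelled as a set of factors that must all be present, and `build_masks`, the data layout and the factor names are hypothetical.

```python
def build_masks(starters, min_winners=10):
    """starters: list of (factors, won), where factors is a set of factor names
    and won is a bool. Returns every factor combination matched by at least
    min_winners winners, grown one factor at a time."""
    winner_sets = [factors for factors, won in starters if won]

    def winners(mask):
        # Number of winners that carry every factor in the mask.
        return sum(1 for f in winner_sets if mask <= f)

    singles = {f for factors in winner_sets for f in factors}
    level = {frozenset([f]) for f in singles if winners(frozenset([f])) >= min_winners}
    kept_singles = {next(iter(m)) for m in level}
    universe = set(level)

    # Extend surviving masks by one factor until nothing new survives.
    while level:
        candidates = {m | {f} for m in level for f in kept_singles if f not in m}
        level = {m for m in candidates if winners(m) >= min_winners}
        universe |= level
    return universe
```

Because a longer combination can never match more winners than the shorter ones it contains, pruning at each level keeps the search far below the full 2^N masks.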

podonne · 11-01-2012, 03:57 PM

Quote:

Originally Posted by DeltaLover

As said, having N binary factors introduces 2^N possible masks.

Most of these masks will fall into one or both of these two cases:

- They have very little data or no data at all

In my systems I impose a minimum number of winners for every mask in order for it to be considered. Usually it is 10 or 20.

This restriction plays a key role in my algorithm for creating the universe of all masks:

I start with a group consisting of only the individual factors (12 for our thread, over 100 in my case) and throw away masks with fewer than 10 winners.

Then I create all the possible pairs, doing the same. I continue, each time extending the length of the combination by 1, until there is no change at all.

- Although they have enough data, they do not present any value

After the process has been completed for every mask, those that fall within 1 standard deviation of the mean ROI are thrown away. I have done several experiments with weight assignment, but for the moment I just assign 2, 1, -1, -2, expressing the number of stddevs.

100 is a lot of factors! For those listening, that's 2^100 = 1,267,650,600,228,229,401,496,703,205,376 distinct groups/masks. More than all the horses who have ever run, ever!
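Python integers are arbitrary precision, so the figure is easy to confirm:

```python
n_masks = 2 ** 100
print(f"{n_masks:,}")  # 1,267,650,600,228,229,401,496,703,205,376
```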

In your mask-building process above (which is a bit different from the original post), do you ensure that every horse still falls into one and precisely one group? It can be more complicated if a horse can fit more than one group, or not be in a group at all, and then dealing with horses in the same group.

Also, I would not use the standard deviation of the mask's ROI to remove masks. That's only accurate when you use past behavior to monitor future behavior. For partitioning a population into "meaningful" and "not meaningful" sub-groups you need something like a Chi-squared test, or a binomial test since they are binary factors.
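As a rough illustration of the binomial option, the sketch below computes an exact one-sided tail probability with only the standard library. Note that it tests win frequency against a baseline rate, not ROI; the function name `binomial_tail` and the example numbers are made up.

```python
from math import comb

def binomial_tail(wins, starts, base_rate):
    """P(X >= wins) for X ~ Binomial(starts, base_rate): a one-sided exact test."""
    return sum(
        comb(starts, k) * base_rate**k * (1 - base_rate) ** (starts - k)
        for k in range(wins, starts + 1)
    )

# Hypothetical mask: 20 winners from 100 starters, against an assumed
# baseline win rate of 1 in 8 (one winner per 8-horse field).
p_value = binomial_tail(20, 100, 1 / 8)
print(p_value)
```

A small p-value suggests the mask's win rate differs from the baseline by more than chance alone would explain. With thousands of masks tested at once, some will look significant by luck, so the threshold needs to be stricter than for a single test.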

Still not criticizing your work, just noticing thought processes similar to those I have gone down in the past without knowing the best way to approach a particular problem. This game is incredibly complex in the ways you can approach it.

Capper Al · 11-01-2012, 04:17 PM

Won't the top rated horse by IV still be your top rated horse going across the matrix the vast majority of the time?

InControlX · 11-01-2012, 04:17 PM

Several good observations have been posted on details of number of determinant groups (binary parameters in my original post), and sample size. It is correct that my posted example adds a multiplier of 4 to differentiate basic race type, so yes, the group count over all types of races is 16,384. A two-year DRF Chart database contains about 185,000 races, but since the head to head factoring goes down to the 3rd finisher we gain about 18 comparison results per race for a grand total of 3.33 million increments, or a very rough average of 200 per prep pattern.
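The "about 18" figure is consistent with an 8-horse field where each of the first three finishers is credited with beating everyone behind it (7 + 6 + 5). A quick check, assuming that reading and the average field size of 8 mentioned earlier in the thread:

```python
def comparisons(field_size, depth=3):
    """Head-to-head results when each of the first `depth` finishers is
    credited with beating every horse that finished behind it."""
    return sum(field_size - pos for pos in range(1, min(depth, field_size) + 1))

per_race = comparisons(8)            # 7 + 6 + 5 = 18
total = 185_000 * per_race           # 3,330,000 comparison results
per_pattern = total / 16_384         # roughly 203 per prep pattern

print(per_race, total, round(per_pattern))  # 18 3330000 203
```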

As pointed out, the distribution is far from uniform even with the most general of initial parameters. It is usually necessary with VB to limit the counts to 30K or so to avoid overflow in the common preparation group comparisons (unless you have the luxury of DeltaLover's Python!) while some combinations will have tiny counts. I use a low count handler routine which reverts to a backup value of the overall win ratios first, then a default 0.500 if the counts are really small, say fewer than 10.
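A low-count handler along those lines might look like the sketch below. The post gives only the "fewer than 10" cutoff, so the two-tier thresholds, the `low_count=30` value and the `overall_ratio` argument are assumptions made for illustration.

```python
def pair_ratio(wins_ab, wins_ba, overall_ratio=None, low_count=30, tiny_count=10):
    """Estimated chance that group A beats group B.

    overall_ratio: hypothetical backup estimate derived from the groups'
    overall win ratios, used when the pair itself has too few results.
    """
    n = wins_ab + wins_ba
    if n >= low_count:
        return wins_ab / n            # enough direct head-to-head data
    if n >= tiny_count and overall_ratio is not None:
        return overall_ratio          # fall back to the overall win ratios
    return 0.5                        # too little data: treat as a coin flip

print(pair_ratio(60, 40))                       # 0.6
print(pair_ratio(12, 3, overall_ratio=0.55))    # 0.55
print(pair_ratio(2, 1, overall_ratio=0.55))     # 0.5
```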

To answer the same-group power rating dilemma... it's a tie, skip the race. I also skip any race with the top average entry having more than one of the default ratios explained above.

ICX

DeltaLover · 11-01-2012, 04:21 PM

Quote:

Originally Posted by podonne

What do you do when two horses in a race are in the same group? Granted it might not happen very often with 2^12 groups, but it occasionally will.

This will happen for sure... A good example is first-time starters, which have very little data to be used...

You cannot do much in this case.

For the purposes of your model these two starters will be identical.

For example, in my system this model creates a rating of positive over negative votes. Each starter has an array of matching masks and each mask has its own weight (2, 1, -1, -2). The positive total divided by the negative total creates a rating.

Of course if both starters have identical masks then their ratings will be the same and cannot be distinguished further.
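A minimal sketch of that rating, under stated assumptions: the bucketing in `mask_weight` (1 to 2 standard deviations gives a weight of 1, 2 or more gives 2) is one reading of "expressing the number of stddevs", and returning None when a starter has no negative votes is an invented convention.

```python
def mask_weight(roi, mean_roi, std_roi):
    """Weight a mask by how many standard deviations its ROI sits from the mean.
    Assumed bucketing: under 1 stddev -> 0 (discarded), 1 to 2 -> +/-1, 2 or more -> +/-2."""
    z = (roi - mean_roi) / std_roi
    if abs(z) < 1:
        return 0
    magnitude = 2 if abs(z) >= 2 else 1
    return magnitude if z > 0 else -magnitude

def rating(weights):
    """Positive weight total divided by negative weight total.
    Returns None when there are no negative votes (an assumed convention)."""
    positive = sum(w for w in weights if w > 0)
    negative = -sum(w for w in weights if w < 0)
    return positive / negative if negative else None

starter_a = [2, 1, -1]
starter_b = [2, 1, -1]   # identical matching masks
print(rating(starter_a), rating(starter_b))   # 3.0 3.0
```

Two starters matching the same masks necessarily get the same rating, which is the tie described above.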

This could be a sign that we need more refined factors, although adding more is not guaranteed to resolve the issue.

This is one of the reasons why we need multiple signals for a real-world strategy and cannot rely on a single one, but for now let's focus on boolean factors for the purpose of this discussion.

DeltaLover · 11-01-2012, 04:26 PM

Quote:

Originally Posted by Capper Al

Won't the top rated horse by IV still be your top rated horse going across the matrix the vast majority of the time?

Not necessarily. It depends on what you are trying to model. If you are looking for winning frequency then yes, IV might be a suitable approach. But if your model is looking for value, IV is not the way to go.