Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 12-28-2012, 05:14 PM   #1
DeltaLover
Registered user
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
Handicapping Using Similarities And Dissimilarities

One handicapping approach I have been trying lately involves detecting, grouping and grading the similarities and dissimilarities within a given race.

Starting from a set of given attributes, let's say best and worst speed figure, familiarity with surface and distance, recency and class movement, I try to encode the process of classifying horses as average or outliers, expecting the extreme choices (favorites and long shots) to occupy the polar ends while the middle odds ranges occupy the middle of the distribution.
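
To make the average-versus-outlier step concrete, here is a minimal Python sketch. It assumes a single numeric attribute (a hypothetical best speed figure per horse), made-up values, and an arbitrary cutoff of one standard deviation from the field mean. It is only one possible reading of the idea, not an actual implementation of the method.

```python
from statistics import mean, pstdev

# Made-up best speed figures for a five-horse field (illustrative only).
field = {"A": 95, "B": 84, "C": 85, "D": 86, "E": 70}

def classify(figures, threshold=1.0):
    """Label each horse by how far its figure sits from the field mean (z-score)."""
    values = list(figures.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the field
    labels = {}
    for name, value in figures.items():
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        if z >= threshold:
            labels[name] = "high outlier"
        elif z <= -threshold:
            labels[name] = "low outlier"
        else:
            labels[name] = "average"
    return labels

labels = classify(field)
print(labels)
```

With several attributes, the same labeling could be applied to each one separately or to a combined score; how to combine them is an open design choice.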

Of course, in the real world things will not be so neat. Most of the time the distribution will not be so clear, signifying a discrepancy from the public's opinion. This discrepancy can be the outcome of bad data or a bad algorithm, or it can very well be a betting anomaly that should be closely examined, as it could present some value.

For example, if A and B both look the same but A is the odds-on favorite while B is listed as a 5-1 shot, it could mean either that we are missing something in our assessment or that something else is going on. Most likely, in the latter case horse B does not represent an overlay but a dead horse who does not receive any action despite its resemblance to the favorite, so we can safely eliminate it...
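
The A versus B situation can be sketched in a few lines of Python. Everything here is an assumption for illustration: the feature vectors are made up and already scaled to 0-1, the odds are hypothetical, implied probability is taken as 1 / (odds + 1) with takeout ignored, and both thresholds are arbitrary.

```python
from itertools import combinations
from math import sqrt

# Hypothetical field: scaled attribute vectors plus odds-to-one on the board.
horses = {
    "A": {"features": [0.9, 0.8, 0.7], "odds": 0.8},   # odds-on favorite
    "B": {"features": [0.9, 0.8, 0.6], "odds": 5.0},   # looks like A, but 5-1
    "C": {"features": [0.2, 0.3, 0.9], "odds": 12.0},
}

def distance(u, v):
    """Euclidean distance between two attribute vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def implied_prob(odds_to_one):
    """Rough win probability implied by the odds (takeout ignored)."""
    return 1.0 / (odds_to_one + 1.0)

def flag_pairs(horses, max_distance=0.15, min_prob_gap=0.25):
    """Return pairs that look alike on paper but are priced very differently."""
    flagged = []
    for (n1, h1), (n2, h2) in combinations(horses.items(), 2):
        d = distance(h1["features"], h2["features"])
        gap = abs(implied_prob(h1["odds"]) - implied_prob(h2["odds"]))
        if d <= max_distance and gap >= min_prob_gap:
            flagged.append((n1, n2, round(d, 3), round(gap, 3)))
    return flagged

flagged = flag_pairs(horses)
print(flagged)
```

A flagged pair says nothing by itself about which side is wrong; whether the longer-priced horse is an overlay or a dead horse is exactly the question that would need historical testing.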

Things will not always be so straightforward though, and this is exactly what has to be researched, up to the point where we are able to detect this type of anomaly even in very slight fluctuations. We even need to consider key differences that might be overestimated by the public, leading it down the wrong decision path...

I am still not sure if such an approach is really worth a significant time investment and I would like to hear your opinions...

Last edited by DeltaLover; 12-28-2012 at 05:17 PM.
Old 12-28-2012, 06:59 PM   #2
Robert Fischer
clean money
 
Join Date: Sep 2006
Location: Maryland
Posts: 23,559
It could be worthwhile simply for the experience of going through the process.

I could be wrong, but from what you are saying, it seems like you would need additional conditions or qualifiers (how valuable are the discrepancies in different subsets of races or odds ranges...) to find an advantage once you have developed a good set of factors to compare.

Also, depending on how I'm reading the attributes part, I'm not sure if you mean you are encoding attribute data from scratch, or if you plan on using easily accessible data for your project. If it's the former, that is a whole project of its own. You may want to use some ready-made factors like speed figures or Prime Power (or then again, the factors you choose for attributes could make or break the effort).
__________________
Preparation. Discipline. Patience. Decisiveness.

Last edited by Robert Fischer; 12-28-2012 at 07:00 PM.
Old 12-28-2012, 07:18 PM   #3
DeltaLover
Registered user
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
The ideal solution would be to input the raw data as it appears in the DRF file and let the algorithm create the factors and the ratings and select the normalization automatically.

One way of doing that is to create a DSL consisting of the necessary keywords, plus a genetic program that assembles them into expression trees...
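
As a rough illustration of what such expression trees might look like, here is a small Python sketch. The keyword set (`add`, `sub`, `mul`), the field names and the nested-tuple representation are all hypothetical choices. It only builds and evaluates random trees; the selection, crossover and mutation steps of a real genetic program are left out.

```python
import operator
import random

# Hypothetical DSL: a few operators and a few raw fields per horse.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["best_speed", "days_off", "class_move"]

def random_tree(rng, depth=2):
    """Build a random expression tree as nested (op, left, right) tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, horse):
    """Evaluate a tree against one horse's raw data (a dict of field values)."""
    if isinstance(tree, str):
        return horse[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, horse), evaluate(right, horse))

horse = {"best_speed": 90, "days_off": 30, "class_move": 2}  # made-up values
hand_built = ("sub", "best_speed", ("mul", "class_move", "days_off"))
print(evaluate(hand_built, horse))

candidate = random_tree(random.Random(0))
print(candidate, evaluate(candidate, horse))
```

A genetic program would score many such candidate factors against historical results and breed the better ones; the fitness measure is the hard part and is not shown here.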

Where I find the biggest challenge is the classification process... In other words, how will the algorithm decide that two or more horses look alike? K-means is the obvious solution, although the existence of binary factors makes it more complicated...
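
One common way around the binary-factor problem is to swap plain Euclidean distance for a Gower-style distance, which averages range-scaled differences for numeric factors with simple mismatch scores for yes/no factors. The sketch below is that substitute technique, not k-means itself; the factor names and values are made up.

```python
NUMERIC = ["best_speed", "days_off"]   # hypothetical numeric factors
BINARY = ["won_at_distance"]           # hypothetical yes/no factor

field = {
    "A": {"best_speed": 90, "days_off": 20, "won_at_distance": True},
    "B": {"best_speed": 85, "days_off": 50, "won_at_distance": True},
    "C": {"best_speed": 70, "days_off": 80, "won_at_distance": False},
}

def numeric_ranges(field):
    """Range (max - min) of each numeric factor across the field."""
    return {
        key: max(h[key] for h in field.values()) - min(h[key] for h in field.values())
        for key in NUMERIC
    }

def gower(x, y, ranges):
    """Gower-style distance in [0, 1]: mean of per-factor dissimilarities."""
    parts = [abs(x[k] - y[k]) / ranges[k] if ranges[k] else 0.0 for k in NUMERIC]
    parts += [0.0 if x[k] == y[k] else 1.0 for k in BINARY]
    return sum(parts) / len(parts)

ranges = numeric_ranges(field)
d_ab = gower(field["A"], field["B"], ranges)
d_ac = gower(field["A"], field["C"], ranges)
print(d_ab, d_ac)
```

Distances like these can be fed to a clustering method that works from a distance matrix, such as hierarchical clustering or k-medoids, instead of k-means.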
Old 12-28-2012, 09:46 PM   #4
Robert Goren
Racing Form Detective
 
Join Date: Jul 2007
Location: Lincoln, Ne but my heart is at Santa Anita
Posts: 16,316
Burton P. Fabricand wrote a book called Horse Sense back in the 1960s. It had a ton of rules for horses similar to the favorite. If you found a horse similar to the favorite, you bet the favorite. I tried to test the book once with pen and paper. As I remember, some of the rules contradicted each other and I gave up. I am pretty sure the book has been mentioned here several times over the years.
Anyway, good luck on your project. I think you may be on to something.
__________________
Some day in the not too distant future, horse players will be betting on computer generated races over the net. Race tracks will become casinos and shopping centers. And some crooner will be belting out "there used to be a race track here".
Old 12-28-2012, 09:55 PM   #5
DeltaLover
Registered user
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
Quote:
Originally Posted by Robert Goren
Burton P. Fabricand wrote a book called Horse Sense back in the 1960s. It had a ton of rules for horses similar to the favorite. If you found a horse similar to the favorite, you bet the favorite. I tried to test the book once with pen and paper. As I remember, some of the rules contradicted each other and I gave up. I am pretty sure the book has been mentioned here several times over the years.
Anyway, good luck on your project. I think you may be on to something.
Amazon does not seem to have it.

http://www.amazon.com/s/ref=nb_sb_no...=Horse%20Sense
Old 12-28-2012, 10:24 PM   #6
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,921
Fabricand's work was based upon what he called "The Principle of Maximum Confusion."

The idea was that if two horses are "similar" and one is bet down there must be a reason for it.

The book came with a very complex set of rules... some of which turned out to be contradictory.
Old 12-28-2012, 11:23 PM   #7
Johnny V
Registered User
 
Join Date: Jan 2010
Posts: 647
Fabricand wrote another book called The Science of Winning: A Random Walk Along the Road to Investment Riches sometime back in the '80s and followed it with a later updated edition. The book also contains a stock market and options section. I found it to be complex and hard to follow in some ways, but others may have had better luck with it.
Old 12-28-2012, 11:51 PM   #8
Robert Goren
Racing Form Detective
 
Join Date: Jul 2007
Location: Lincoln, Ne but my heart is at Santa Anita
Posts: 16,316
Quote:
Originally Posted by DeltaLover
Apparently, there are two editions of it: a 1965 edition, which I used to have, and a 1976 edition, which I never read. At the price I found for it on the net, I wish I still had it. I would sell it in a NY minute. It was unworkable for me, but somebody might find something useful in it. He takes a different approach than most writers, not only of that era but of today as well.

Last edited by Robert Goren; 12-28-2012 at 11:59 PM.
Old 12-29-2012, 12:14 AM   #9
Dave Schwartz
 
Join Date: Mar 2001
Location: Reno, NV
Posts: 16,921
I have both of his books in my collection.

Neither is worth the effort to read, IMHO.
Old 12-29-2012, 12:54 AM   #10
Robert Goren
Racing Form Detective
 
Join Date: Jul 2007
Location: Lincoln, Ne but my heart is at Santa Anita
Posts: 16,316
Quote:
Originally Posted by Dave Schwartz
I have both of his books in my collection.

Neither is worth the effort to read, IMHO.
You are probably right, Dave. The idea behind the book(s) might be a good one, but the way he tried to define it in practice did not work for me or for a couple of other people I knew who read the book. Of course, he did not have the kind of tools we have today to refine his original premise. I am not willing to dismiss his premise out of hand, although his methods of implementing it were bad.
Old 12-29-2012, 04:22 AM   #11
Capper Al
Registered User
 
Join Date: Dec 2005
Location: MI
Posts: 6,330
My first comparison is how my selections line up with the trackman's. I'm not concerned if I picked a horse 3rd and he picked it fourth. I am concerned if I picked a horse among my contenders and he didn't, or he picked it and I didn't. Either way, I start questioning my selections at this point. When the tote board lights up, I go through the whole process again with my picks versus the board. I understand the feeling. What is it that the trackman or the public sees that I don't? My results seem to be random. Sometimes I'm right and sometimes they are right. Yet this may be an opportunity to learn something as to why.
__________________


"The Law, in its majestic equality, forbids the rich, as well as the poor, to sleep under bridges, to beg in the streets, and to steal bread."

Anatole France


Old 12-29-2012, 11:53 AM   #12
bob60566
Vancouver Island
 
Join Date: Dec 2010
Posts: 1,747
Quote:
Originally Posted by Capper Al
My first comparison is how my selections line up with the trackman's. I'm not concerned if I picked a horse 3rd and he picked it fourth. I am concerned if I picked a horse among my contenders and he didn't, or he picked it and I didn't. Either way, I start questioning my selections at this point. When the tote board lights up, I go through the whole process again with my picks versus the board. I understand the feeling. What is it that the trackman or the public sees that I don't? My results seem to be random. Sometimes I'm right and sometimes they are right. Yet this may be an opportunity to learn something as to why.
I read this angle years ago: you use the trackman's selection when it is not among the consensus selections.
Old 12-29-2012, 02:36 PM   #13
lansdale
Registered User
 
Join Date: Jan 2006
Posts: 1,506
Bayesian techniques

Hi DL,

This sounds very much like the kind of basically Bayesian approaches to the game being used by Trifecta Mike and, to some degree, Jeff Platt. Similar work is being done in the poker world with much success. I don't know whether this is your purpose in separating fields into 'similar' and 'non-similar' horses (possibly to look for unbalanced races), but, as sheer speculation, it has long seemed to me that 'dissimilar' or more heterogeneous horses win more than their fair share of races compared with the more homogeneous ones, those in the same field that can be more easily ranked by a standard metric. I have absolutely no statistical backing for this, but I'd be curious to know if anything you find implies that there might be something to it.

Cheers,

lansdale

Old 12-31-2012, 01:51 PM   #14
DeltaLover
Registered user
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
The most important thing is to define what we mean by similarity.

Let's assume that we are looking for similarity based on common characteristics in the horses' running lines.

Of course we can start with very simple similarities and gradually refine them into more complex ones. For example, we can assume that horses coming off an X day layoff look similar, but we can add to the layoff the number of workouts, the finish position in their last races and any other factor we can think of.

Here, we have to answer an important question: how can we be sure that a specific algorithm for defining similarity is better or worse than another one? There are various approaches we can follow. Two of them are the following:

- Starters belonging to the same cluster should show neutral final results. This means that within their cluster they should show identical winning percentages. Based on this definition, it is not impossible to group together starters who at first look very different on paper but happen to have similar chances...

- Starters belonging to the same cluster should present similar ROI. Of course, with this definition we add another layer of indirection, that of the public's awareness of the similarity of these starters, which is something different from the previous definition.
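
Both yardsticks can be computed from the same kind of record. The sketch below assumes a hypothetical results list of (group label, won, payoff on a $2 win bet) tuples with made-up numbers. The group label could be a cluster, or a subset within a cluster whose figures are then compared with its siblings.

```python
from collections import defaultdict

# Hypothetical results: (group label, won?, payoff on a $2 win bet; 0 if lost).
results = [
    ("c1", True, 6.40), ("c1", False, 0.0), ("c1", False, 0.0), ("c1", True, 4.20),
    ("c2", False, 0.0), ("c2", True, 18.60), ("c2", False, 0.0), ("c2", False, 0.0),
]

def group_stats(results, stake=2.0):
    """Win percentage and flat-bet ROI for each group label."""
    totals = defaultdict(lambda: {"starts": 0, "wins": 0, "returned": 0.0})
    for group, won, payoff in results:
        t = totals[group]
        t["starts"] += 1
        t["wins"] += int(won)
        t["returned"] += payoff
    stats = {}
    for group, t in totals.items():
        wagered = stake * t["starts"]
        stats[group] = {
            "win_pct": t["wins"] / t["starts"],
            "roi": (t["returned"] - wagered) / wagered,
        }
    return stats

stats = group_stats(results)
print(stats)
```

With only a handful of starts per group, as in this toy data, differences like these mean very little; the sample sizes needed are a separate question.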

I think that the answer to this question is much harder than it looks at first glance. Think about it...

Someone comes up with a claim that, using some method, he can create a group of starters (let's say 3) that are similar, using their win rate as the definition of similarity. How can we verify that he is correct and that these 3 starters are indeed winning at the same rate? Can you see a valid approach? Note that final odds or morning line are never considered here... We just have three runners and their corresponding win rates...

What is the correct resolution to this problem?
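
One standard statistical tool for this kind of question, offered as a sketch rather than as the resolution, is a chi-square test of homogeneity on the win and loss counts. It assumes each of the three "starters" is really a type of starter observed over many races, and the counts below are made up. For three groups there are 2 degrees of freedom, where the p-value has the closed form exp(-chi2 / 2) and the 5% critical value is about 5.99.

```python
from math import exp

def homogeneity_chi2(groups):
    """Chi-square statistic for 'all groups share one win rate'.

    groups: list of (wins, starts) pairs. Returns (statistic, degrees of freedom).
    """
    total_wins = sum(w for w, _ in groups)
    total_starts = sum(n for _, n in groups)
    pooled = total_wins / total_starts  # win rate if the groups really are alike
    chi2 = 0.0
    for wins, starts in groups:
        expected_wins = starts * pooled
        expected_losses = starts * (1 - pooled)
        chi2 += (wins - expected_wins) ** 2 / expected_wins
        chi2 += ((starts - wins) - expected_losses) ** 2 / expected_losses
    return chi2, len(groups) - 1

# Made-up records for three starter types: (wins, starts).
alike = [(52, 300), (48, 300), (50, 300)]
unlike = [(90, 300), (45, 300), (45, 300)]

for label, groups in (("alike", alike), ("unlike", unlike)):
    chi2, df = homogeneity_chi2(groups)
    p_value = exp(-chi2 / 2)  # valid only because df == 2 here
    print(label, round(chi2, 3), df, round(p_value, 4))
```

A large p-value only means the data do not contradict equal win rates; it does not prove they are equal, and with small samples the test will rarely detect real differences.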
Old 12-31-2012, 02:42 PM   #15
baconswitchfarm
Registered User
 
Join Date: Jan 2008
Location: Kentucky
Posts: 1,069
Quote:
Originally Posted by Johnny V
Fabricand wrote another book called The Science of Winning: A Random Walk Along the Road to Investment Riches sometime back in the '80s and followed it with a later updated edition. The book also contains a stock market and options section. I found it to be complex and hard to follow in some ways, but others may have had better luck with it.

I don't know which edition I have. It is a green hardcover with gold lettering and looks old. It is a really slow, tedious read. I would say it helped me a small bit, just by looking at things differently. It is not a light read for when you are falling asleep in bed. It took some focus for me to commit to finishing it.