|
|
11-21-2012, 12:41 AM
|
#46
|
Registered User
Join Date: Jan 2005
Posts: 6,626
|
Quote:
Originally Posted by raybo
Your assumption is correct. We get a set of predictive attributes from the database, the program uses those attributes to eliminate horses as win contenders, and then a further elimination process is applied based on adjusted early fractional velocities.
The number of cards needed for a particular track varies with the average number of races per card, but approximately 240-250 recent races (roughly 24-25 cards) are kept in the database, constantly updated, of course. That number can vary for some tracks, depending on how the spread of pace pressure groupings works out and how many races of each type actually occur. Some pace pressure readings happen an insignificant number of times, as you can imagine, while others occur in significant numbers. There are 20 different groupings, and not all tracks have the same spread of pace pressure ratings, so some need a few more, or fewer, races in the database. Once we find the number where predictiveness is at an acceptable level, the program keeps the database there automatically.
|
I am impressed at the innovative way you view the data. Thank you for explaining.
|
|
|
11-21-2012, 07:25 AM
|
#47
|
EXCEL with SUPERFECTAS
Join Date: Mar 2004
Posts: 10,206
|
You're welcome. I think the fact that the program takes Randy Giles' work a step or two further makes it more valuable. While Randy's work tells you that a particular PPG gives an advantage to an early or a late horse, etc., this program tells you which of the specific running styles have won those races and at what percentage. It also tells you the same thing for those styles' early speed point ranges. Both of those are track specific and based on a recent time frame.
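For readers who want to see the tallying idea in code, here is a minimal Python sketch. It is not the program described above: the function name `style_win_shares`, the running-style codes, the sample results and the window size of 250 are all assumptions made for illustration. It only shows how a rolling set of recent races could be reduced to the share of wins by running style within each PPG.

```python
from collections import Counter, defaultdict, deque

def style_win_shares(races):
    """Share of wins by running style within each pace pressure group (PPG)."""
    wins = defaultdict(Counter)
    for ppg, winner_style in races:
        wins[ppg][winner_style] += 1
    return {
        ppg: {style: n / sum(counts.values()) for style, n in counts.items()}
        for ppg, counts in wins.items()
    }

# Rolling window: older races drop off automatically (the size is an assumption).
recent = deque(maxlen=250)

# Made-up results for one track: (PPG, running style of the winner).
for result in [(3, "E"), (3, "E"), (3, "P"), (3, "S"), (7, "S"), (7, "S")]:
    recent.append(result)

shares = style_win_shares(recent)
print(shares[3])  # {'E': 0.5, 'P': 0.25, 'S': 0.25}
```

The same tally could be repeated with early speed point ranges in place of running styles.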
|
|
|
11-21-2012, 07:41 AM
|
#48
|
Registered User
Join Date: Dec 2005
Location: MI
Posts: 6,330
|
Quote:
Originally Posted by Cratos
I disagree because each race is an independent event and the margin of victory in one event has nothing to do with the margin in another event. Additionally, the field of horses will probably be different in each event.
|
It is never the attribute (fact) itself, like DSLR; it is how it is used. Days has to be applied in conjunction with other factors or it is meaningless. The more dominant attributes like speed and pace are considered primary factors and may stand on their own as a single source (variable) for selecting horses, while Days Since Last Raced is a secondary attribute and has to be interpreted in a comprehensive manner together with other factors.
__________________
"The Law, in its majestic equality, forbids the rich, as well as the poor, to sleep under bridges, to beg in the streets, and to steal bread."
Anatole France
|
|
|
11-21-2012, 10:25 AM
|
#49
|
Vancouver Island
Join Date: Dec 2010
Posts: 1,747
|
Quote:
Originally Posted by Capper Al
It is never the attribute (fact) itself, like DSLR; it is how it is used. Days has to be applied in conjunction with other factors or it is meaningless. The more dominant attributes like speed and pace are considered primary factors and may stand on their own as a single source (variable) for selecting horses, while Days Since Last Raced is a secondary attribute and has to be interpreted in a comprehensive manner together with other factors.
|
In today's world there are only so many times you can administer apple juice in a cycle and make the horse effective.
|
|
|
11-21-2012, 10:53 AM
|
#50
|
Registered User
Join Date: Dec 2005
Location: MI
Posts: 6,330
|
Quote:
Originally Posted by bob60566
In today's world there are only so many times you can administer apple juice in a cycle and make the horse effective.
|
If you are referring to cheating then there isn't much one can do except hope there's a pattern that can be flagged.
|
|
|
11-21-2012, 11:33 AM
|
#51
|
Registered user
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
|
In my opinion the concept of recency is one of the most naively treated in handicapping literature and related software.
Recency analysis is a classical example of unsupervised learning where the input vector consists of the intervals between consecutive races and the output is a classification universe. It is a cluster analysis that can be optimized by various fitness functions: PNL, ROI or winning frequency maximizers or minimizers.
The objective of this classification is to derive group monikers such that each starter is assigned one, creating a recency shape for each race. The value of a classification schema can be evaluated by a selection method (GA, NN, linear regression or other) that is able to show profitability by utilizing it in some way...
|
|
|
11-21-2012, 01:11 PM
|
#52
|
Vancouver Island
Join Date: Dec 2010
Posts: 1,747
|
Quote:
Originally Posted by Capper Al
If you are referring to cheating then there isn't much one can do except hope there's a pattern that can be flagged.
|
Wrong thread. I should have posted this under Pattern Recognition.
|
|
|
11-21-2012, 07:18 PM
|
#53
|
Registered User
Join Date: Feb 2008
Posts: 1,591
|
Quote:
Originally Posted by DeltaLover
In my opinion the concept of recency is one of the most naively treated in handicapping literature and related software.
Recency analysis is a classical example of unsupervised learning where the input vector consists of the intervals between consecutive races and the output is a classification universe. It is a cluster analysis that can be optimized by various fitness functions: PNL, ROI or winning frequency maximizers or minimizers.
The objective of this classification is to derive group monikers such that each starter is assigned one, creating a recency shape for each race. The value of a classification schema can be evaluated by a selection method (GA, NN, linear regression or other) that is able to show profitability by utilizing it in some way...
|
You might discover a "bounce" classification.
Delta, you may want to give an example of what you have stated in paragraph 2. It is a good idea.
Mike (Dr Beav)
Last edited by TrifectaMike; 11-21-2012 at 07:19 PM.
|
|
|
11-21-2012, 11:07 PM
|
#54
|
Registered user
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
|
Sure, Doc.
I will try to present my approach:
Each starter has an array of day intervals between its starts:
s1: 25 15 53 212 ....
s1: d1, d2, ..., dn
For simplicity, let's consider only the days off before today's race and before the previous race.
In this case we have
classification = f(d1, d2)
Each interval adds one dimension, so in our example we are talking about two dimensions.
Each starter can be represented as a point on a two-dimensional plane (x, y).
To make the algorithm easier we might need some transformation logic for the days off,
for example:
T(d) = log(d) (or whatever)
Using the Euclidean distance between starters, we will be looking for clusters that show similar behavior,
for example winning frequency.
The objective of the algorithm is to find two-dimensional clusters that behave similarly.
For example, we might find a cluster c1 with the highest winning frequency and a cluster c2 with the lowest.
Each cluster is assigned an arbitrary label: C1, C2, C3, etc.
Then the whole race can be described as a composite of clusters, based on where each starter belongs:
C1
C1
C2
C3
C7
Now the race can be matched against similar races, from which we might be able to conclude (for example)
that this type of race is most frequently won by a C1 type.
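The steps above can be sketched in Python. This is only an illustration under stated assumptions, not the actual implementation: plain k-means is swapped in as the clustering method (the post does not name one), the starters and their results are made up, and the helper names (`transform`, `kmeans`, `cluster_win_rates`, `race_shape`) are hypothetical. Note that k-means groups starters by distance alone; the winning frequency of each cluster is measured afterwards.

```python
import math
from collections import Counter

def transform(days):
    # T(d) = log(d), the transformation suggested in the post
    return math.log(days)

def kmeans(points, k, iterations=50):
    # Deterministic farthest-first start: begin with the first point, then keep
    # adding the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each starter to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def cluster_win_rates(labels, won):
    starts, wins = Counter(), Counter()
    for label, w in zip(labels, won):
        starts[label] += 1
        wins[label] += int(w)
    return {c: wins[c] / starts[c] for c in starts}

def race_shape(labels):
    # Describe a race by the sorted cluster monikers of its starters.
    return tuple(sorted(f"C{lab + 1}" for lab in labels))

# Made-up starters: (d1 = days off before today, d2 = days off before last race, won)
starters = [
    (14, 21, True), (15, 18, False), (12, 25, True),
    (180, 30, False), (210, 25, False), (200, 40, False),
    (30, 200, False), (28, 190, True),
]
points = [(transform(d1), transform(d2)) for d1, d2, _ in starters]
labels = kmeans(points, k=3)
rates = cluster_win_rates(labels, [won for _, _, won in starters])
print(race_shape(labels[:5]), rates)
```

With real data, the number of clusters and the transformation would need to be tuned against whichever fitness function (winning frequency, ROI, PNL) is being optimized.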
|
|
|
11-22-2012, 04:08 AM
|
#55
|
Registered User
Join Date: Nov 2012
Posts: 26
|
Hi Delta
Do you use any other inputs to your cluster analysis? Trainer springs to mind.
|
|
|
11-22-2012, 07:16 AM
|
#56
|
Registered user
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
|
Trainer will add a categorical dimension to the space, which subsequently will narrow clusterization to this dimension only.
Although I am not currently doing it in any of my systems, we can analyze trainers after we have completed the recency clusterization. Based on it, we will be able to assign each trainer a distribution of classifications and rate him based on them.
For example:
Trainer: T1
C1 : T1-C1-Stats win% roi
C2 : T1-C2-Stats win% roi
C3 : T1-C3-Stats win% roi
C4 : T1-C4-Stats win% roi
Trainer: T2
C1 : T2-C1-Stats win% roi
C2 : T2-C2-Stats win% roi
C3 : T2-C3-Stats win% roi
C4 : T2-C4-Stats win% roi
etc
Now we can use each trainer's vector :
[ Ti-Ci .... ]
To perform another classification...
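A minimal Python sketch of the trainer table above, under stated assumptions: the records are made up, ROI is computed on a flat stake of 2.00 per start (the post does not define it), and the function names `trainer_cluster_stats` and `trainer_vector` are hypothetical.

```python
from collections import defaultdict

def trainer_cluster_stats(records, stake=2.0):
    """records: (trainer, cluster, payoff) per start; payoff is 0 for a loser."""
    totals = defaultdict(lambda: {"starts": 0, "wins": 0, "returned": 0.0})
    for trainer, cluster, payoff in records:
        cell = totals[(trainer, cluster)]
        cell["starts"] += 1
        if payoff > 0:
            cell["wins"] += 1
            cell["returned"] += payoff
    stats = {}
    for key, cell in totals.items():
        wagered = stake * cell["starts"]
        stats[key] = {
            "win_pct": cell["wins"] / cell["starts"],
            "roi": (cell["returned"] - wagered) / wagered,
        }
    return stats

def trainer_vector(stats, trainer, clusters):
    # One win% entry per cluster; a cluster with no starts is treated as 0.0,
    # which is an assumption made to keep the vectors the same length.
    return [stats.get((trainer, c), {"win_pct": 0.0})["win_pct"] for c in clusters]

# Made-up starts: (trainer, recency cluster of the starter, payoff on a 2.00 bet)
records = [
    ("T1", "C1", 6.40), ("T1", "C1", 0.0), ("T1", "C2", 0.0),
    ("T2", "C1", 0.0), ("T2", "C2", 5.00), ("T2", "C2", 0.0),
]
stats = trainer_cluster_stats(records)
t1 = trainer_vector(stats, "T1", ["C1", "C2"])
t2 = trainer_vector(stats, "T2", ["C1", "C2"])
print(t1, t2)
```

The resulting vectors have the same length for every trainer, so they could be fed into a second clustering pass like the one used for recency.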
|
|
|