Text mining the race conditions [Archive] - Horse Racing Forum - PaceAdvantage.Com

podonne

08-07-2012, 12:17 AM

Reading through the conditions in a race book reveals some very specific lists of eligibility and weight allowances, often more than can be expressed by a simple Alw2000N1X or by combinations of fields in a DRF single file format csv. Races since X allowed 2 lbs, registered owners, etc...

The only way to figure out whether certain horses fit "better" into these more complex eligibility requirements in a database would be to somehow code a search function which respected and properly weighted all of them. But that would mean reading the conditions and making changes by hand, as conditions like that would be very difficult for a computer program to parse.

Which makes me wonder, has anyone done this? Any notion of how much, or whether, such information would be worth betting on?

I could use something like Mechanical Turk to have people turn the complex conditions into standardized search parameters that, when dealing with large databases, would easily tell you which horses fit the conditions really well. But if it's been done, or this is common knowledge, or if the weight allowances don't give an advantage, then I'll save myself the coin.

Thanks,
Podonne

InControlX

08-07-2012, 07:08 PM

Podonne,

I think your concept is good, but when I tried setting up a "conditions advantage" back in 2005 I wasn't able to distinguish a wagering gain for an eligibility advantage, i.e., entry eligibility based upon terms like "Non-winners of a race over $25,000 purse since May 1". Entries having won for more in April were live, with 5-10% winning percentage hikes, but they were apparently too obvious as well, and their tote payoffs shrank even more than proportionately. I thought I had found an angle on claimers stepping up to allowance (and thus passing "except claiming" win restrictions), but it was only useful at MNR for a while.
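For anyone wanting to try this in a database, a condition like the one quoted above can be written as a small predicate over a horse's past races. This is only a rough sketch: the `PastRace` fields, the function name, and the sample history (including the year) are made up for illustration, and it assumes "since May 1" includes May 1 itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PastRace:
    run_on: date
    purse: int
    won: bool
    claiming: bool = False  # True if the race was a claiming race


def eligible_non_winner(history, purse_over, since, except_claiming=False):
    """True if the horse has NOT won a race with a purse over
    `purse_over` on or after `since` (hypothetical encoding)."""
    for race in history:
        if not race.won or race.run_on < since:
            continue  # losses and wins before the cutoff never disqualify
        if except_claiming and race.claiming:
            continue  # "except claiming": claiming wins are not counted
        if race.purse > purse_over:
            return False
    return True


# Made-up horse: won for $30,000 in April, then for $20,000 in May.
history = [
    PastRace(date(2005, 4, 20), 30000, True),
    PastRace(date(2005, 5, 15), 20000, True),
]

# The April win falls before the cutoff, so the horse still qualifies.
print(eligible_non_winner(history, 25000, date(2005, 5, 1)))
```

The horse in the sample is the kind of "April winner" described above: it has beaten better, yet still fits the condition.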

I found no correlation at all on weight advantage conditions.

ICX

podonne

08-07-2012, 08:16 PM

Thanks for your reply. That's what I was afraid of. I've often heard people say that the weight differential doesn't matter, so it was hard for me to think that allowing 2 lbs would give a substantial boost. But, then, why have the 2 lb allowance in the conditions at all?

Interesting that your April horses were so easily identified by the crowd, given how hard it is to systematize a condition like that. But, I guess on-track bettors can pick out these horses easily enough...

pondman

08-07-2012, 09:19 PM

"...or if the weight allowances don't give an advantage, then I'll save myself the coin."

It almost sounds as if you are asking the wrong question. The conditions are generally written to give more weight to the horse with the advantage. There are a number of conditions where going with the high weight is enough to give you a slight edge. If you were discussing Mountaineer, I think there is a slight advantage with the higher weight over time. I don't think it's enough to make it a primary factor.

stu

08-07-2012, 09:47 PM

The most important piece to watch is the parenthetical phrases within a race condition. Occasionally, due to program production mistakes, those phrases don't make it into the DRF or the program.

Example:

(Races where entered for $17,500 or less does not count in allowances or eligibility)
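If you have both the full condition text and the published version, checking for dropped parentheticals could look something like the sketch below. The function names and the sample condition sentence around the parenthetical are hypothetical, and it assumes parentheses are not nested.

```python
import re

# Matches one parenthetical phrase; assumes parentheses are not nested.
PAREN = re.compile(r"\(([^()]*)\)")


def parenthetical_phrases(condition_text):
    """Return the text inside each pair of parentheses."""
    return [phrase.strip() for phrase in PAREN.findall(condition_text)]


def missing_phrases(full_text, published_text):
    """Parentheticals in the full conditions that the published text lacks."""
    published = published_text.lower()
    return [p for p in parenthetical_phrases(full_text)
            if p.lower() not in published]


# Made-up condition text wrapped around the parenthetical quoted above.
full = ("For three year olds and upward which have never won two races. "
        "(Races where entered for $17,500 or less does not count in "
        "allowances or eligibility)")
published = "For three year olds and upward which have never won two races."

for phrase in missing_phrases(full, published):
    print("Missing from published conditions:", phrase)
```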

eqitec

08-08-2012, 02:56 PM

My software text mines and parses the race conditions into what I call micro-classes.

E.g. The 4th at Mth on Friday parses out to the following micro-class:

MthDFtCond ClaimingOlderMales5-5.9kn2l

All such parsed micro-classes are stored in a related Factor Analysis Points (FAP) file which contains unique settings for each micro-class. For the example above, the FAPs for that micro-class ignore past performances and breeding for wet track conditions. If the Ft in the micro-class changes to a wet track condition (Sl, Gd, My), then a different set of FAPs is applied to the changed micro-class.
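To make the idea concrete for readers, here is a rough sketch of a lookup keyed by micro-class. It is not the actual software: the way the string is split into fields is a guess, and the `MicroClass` type, the `FAP_TEMPLATES` table, and the settings inside it are placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MicroClass:
    # Field split is a guess at how the example string is laid out.
    track: str       # e.g. "Mth"
    surface: str     # e.g. "D"
    condition: str   # track condition: "Ft", "Sl", "Gd", "My"
    race_class: str  # e.g. "Cond ClaimingOlderMales5-5.9kn2l"

    def key(self):
        return f"{self.track}{self.surface}{self.condition}{self.race_class}"


# Placeholder settings, one template per micro-class key.
FAP_TEMPLATES = {
    "MthDFtCond ClaimingOlderMales5-5.9kn2l": {"use_wet_track_factors": False},
    "MthDSlCond ClaimingOlderMales5-5.9kn2l": {"use_wet_track_factors": True},
}


def load_faps(micro_class):
    """Return the template for this micro-class, or None if none is stored."""
    return FAP_TEMPLATES.get(micro_class.key())


fast = MicroClass("Mth", "D", "Ft", "Cond ClaimingOlderMales5-5.9kn2l")
sloppy = replace(fast, condition="Sl")  # track comes up sloppy

print(fast.key(), load_faps(fast))
print(sloppy.key(), load_faps(sloppy))
```

Changing one element produces a different key, so a different template is picked up without touching anything else.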

ranchwest

08-11-2012, 08:49 AM

How useful have you found this information to be?

eqitec

08-12-2012, 08:41 PM

I find it very useful in several ways. Here are two:
1. As per my previous post, I have 1000s of pre-set impact value templates configured based on the elements contained in the micro-classes. When any given micro-class appears in a race card, my software automatically loads the impact values (which it calls "Factor Analysis Points") for that specific micro-class. If the track condition changes at the last minute, or a race is taken off the grass, and/or the distance changes, entering those changes alters the micro-class, which then automatically loads the impact values template for the new micro-class. This allows me to be very quick about re-handicapping the race based on the changed elements in the micro-class.
2. I also text mine and parse downloaded chart files using the same micro-class algorithms as applied to the race conditions that come down with race card files. As an example, let's say one of today's races is the following micro-class:
TAMDFt6FCond ClaimingOlderMales5-6Kn1y
and I'm looking at a horse in this race which last ran in a
TAMDFt6FCond ClaimingOlderMales8-9kn3l
To answer the question whether or not this is a class change (up, down, or no change), I can quickly pull reports for both micro-classes at TAM and make performance comparisons to answer the class change question.
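As a rough first pass before pulling those reports, the price band and restriction can be read off the end of the two micro-class strings and compared. This is only a sketch: the regular expression assumes the string ends in a pattern like `5-6Kn1y`, the function names are made up, and comparing claiming prices is a crude stand-in for the performance comparison described above.

```python
import re

# Assumes the micro-class ends with "<low>-<high>k<restriction>", e.g. "5-6Kn1y".
TAIL = re.compile(r"(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)[kK](n\d[a-z])$")


def price_band(micro_class):
    """Return (low_price, high_price, restriction) parsed from the string."""
    match = TAIL.search(micro_class)
    if match is None:
        raise ValueError(f"no price band found in {micro_class!r}")
    low, high, restriction = match.groups()
    return float(low) * 1000, float(high) * 1000, restriction


def price_move(last_race, todays_race):
    """'up', 'down' or 'same', judged only by the low end of the price band."""
    last_low = price_band(last_race)[0]
    today_low = price_band(todays_race)[0]
    if today_low > last_low:
        return "up"
    if today_low < last_low:
        return "down"
    return "same"


today = "TAMDFt6FCond ClaimingOlderMales5-6Kn1y"
last = "TAMDFt6FCond ClaimingOlderMales8-9kn3l"

print(price_band(today))
print(price_band(last))
print(price_move(last, today))
```

A drop in price is not necessarily a drop in class, since the restriction changes too (n3l to n1y here), which is why the per-track reports still have the final say.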

The biggest challenge I have with this approach is with the races with multiple conditions, such as Opt. Clm/Allowances or races with "b" designations. I usually "lock and load" impact values based on one micro-class that applies to the majority of the horses in the field.

ranchwest

08-12-2012, 09:51 PM

Thanks, that was interesting.

The Bit

08-13-2012, 10:00 AM

Just a guess, but have you found that the n1y condition is close to the n2l condition class-wise, while the n3l condition is tougher than the n1y, and the n2y is tougher than the n3l?

eqitec

08-13-2012, 07:38 PM

For TAM, the n3l at 7F on fast tracks for older horses in the $8-10K micro-class is only negligibly tougher than the n1y's running for $5K.

But, I don't think generalities can be made that hold up from track to track. Much depends on the strengths and weaknesses of the horse populations which happen to be present at any given meet.

eqitec

08-27-2012, 08:24 PM

I'm adding to my reply to show text parsing from macro-class to micro-classes of three Optional Claiming races on the Saratoga card yesterday (1,8,9). See the attached.
In the case of the 1st race, there were 16 pre-entered. 2 AEs and 2 MTOs scratched, leaving 12 which went to the post.
Here's the breakdown of the 12 starters by which of the three micro-class eligibilities applied:
Micro-Class 1 - 1,2,3,7,10,12
Micro-class 2 - 5
Micro-class 3 - 4,6,9,10
The winner (2) and placer (12) were eligible via micro-class 1
The show horse (10) was eligible via micro-class 3. The $2 Tri was $3,377.
Now, the official race chart shows this race as merely for Clm$20k for Fillies and Mares, 3&up. There's no mention of the other two micro-class eligibilities. When this race gets published in PPs, it will merely show as fClm$20,000b.
Given the degree of complexity that racing secretaries have created, how can any player establish a horse's class status as a component of their handicapping effort? No one will know what the "b" means, and the official chart of the race, if one goes so far as to look it up, won't be of any help.
Is it really necessary for racing secretaries to cast their nets so widely to fill races? If so, then the data publishers should be compelled to keep up with the complexities the racing secretaries have created so the betting public is not so much in the dark.