MS Access as Handicapping Database - Page 8 - Horse Racing Forum - PaceAdvantage.Com

sjk · 09-24-2023, 01:42 PM

I would think you would try it with one variable first and then see if the second one adds value. Find the value of a length that best fits your predictions and then keep that value as you see if the ordinal provides additional value at its best fit.

classhandicapper · 09-25-2023, 09:20 AM

Quote:

Originally Posted by sjk

I would think you would try it with one variable first and then see if the second one adds value. Find the value of a length that best fits your predictions and then keep that value as you see if the ordinal provides additional value at its best fit.

That's sort of what I did with the database itself. I have the ability to test my automated ratings within the database. So if I add or tweak a factor I can look at the before and after to see if it improves the results.

I can't think of a way to do it with Excel other than duplicate what I already do within the database (trial and error).

When I tried just the Race Rating, Finish Position, and Beaten Lengths in Excel, it was saying the finish position and beaten lengths are important but the race rating was making a negative contribution. That's clearly wrong. It doesn't understand that the factors are related in a way where the Finish Position and Beaten Lengths should be subtracted from the Race Rating.

Trial and error may be my best bet when it comes to this unless there's a function or technique I'm unfamiliar with (which is very possible).

sjk · 09-25-2023, 02:18 PM

What I would try:

I imagine you are trying to estimate the speed rating next out, so I would run a regression where that is the dependent variable and you look for the alpha that best fits (race rating)-(alpha)*(beaten lengths)*(distance factor), like you would to make a speed figure.

The finish position is so collinear with the beaten lengths that I would not trust the significance of adding it in.
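For anyone who wants to try the single-alpha fit outside Excel, here is a minimal Python sketch of the idea described above. It is only an illustration: the function name `fit_alpha`, the synthetic data, and the choice to hold the race rating's coefficient at exactly 1 are all assumptions, not anything taken from the posters' databases.

```python
import numpy as np

def fit_alpha(race_rating, beaten_lengths, distance_factor, next_rating):
    """Least-squares alpha for the model
    next_rating ~ race_rating - alpha * beaten_lengths * distance_factor,
    with the race rating's coefficient fixed at 1 (an assumption)."""
    x = np.asarray(beaten_lengths, float) * np.asarray(distance_factor, float)
    resid = np.asarray(next_rating, float) - np.asarray(race_rating, float)
    # Minimising sum((resid + alpha * x)**2) gives the closed form below.
    return -np.dot(x, resid) / np.dot(x, x)

# Made-up data purely for illustration (not real race results).
rng = np.random.default_rng(0)
n = 500
race_rating = rng.normal(80, 8, n)
beaten_lengths = rng.uniform(0, 15, n)
distance_factor = rng.choice([1.0, 0.8, 0.6], n)  # hypothetical per-distance scaling
true_alpha = 2.0
next_rating = (race_rating
               - true_alpha * beaten_lengths * distance_factor
               + rng.normal(0, 3, n))

alpha_hat = fit_alpha(race_rating, beaten_lengths, distance_factor, next_rating)
print(f"estimated value of a length: {alpha_hat:.2f}")
```

Because the race rating is forced to enter with a coefficient of 1 and the beaten lengths with a minus sign, the fit cannot hand the race rating a negative weight the way an unconstrained multiple regression on correlated columns can.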

formula_2002 · 09-25-2023, 02:58 PM

Quote:

Originally Posted by classhandicapper

That's sort of what I did with the database itself. I have the ability to test my automated ratings within the database. So if I add or tweak a factor I can look at the before and after to see if it improves the results.

I can't think of a way to do it with Excel other than duplicate what I already do within the database (trial and error).

When I tried just the Race Rating, Finish Position, and Beaten Lengths in Excel, it was saying the finish position and beaten lengths are important but the race rating was making a negative contribution. That's clearly wrong. It doesn't understand that the factors are related in a way where the Finish Position and Beaten Lengths should be subtracted from the Race Rating.

Trial and error may be my best bet when it comes to this unless there's a function or technique I'm unfamiliar with (which is very possible).

Just a few words, without looking at regression models, on what I call “My Relationship Model”.
Consider finishing position, beaten lengths and, may I add, final track odds.
Using the next-to-last race, observe the values for the first 6 columns, F through K (see attached; I entered some fictional values).
Then use the next 2 columns for various “what if” conditions, giving rise to the final column, “Key Figure”.
Now test the “Key Figure” against the last race results.
You can change the relationships between the last three columns until you meet with some positive success.
Just for the record, I would use a sample database of 500 to 1000 races and then test against about 1000 out-of-sample races.
Of course, once you meet with success, the next hurdle is figuring out how to determine the odds of a “live” race. That is a horse of a different color, but may be doable when tested.

Good data-basing analysis.

classhandicapper · 10-20-2023, 01:22 PM

I started doing some regression on Turf racing. The main difference I am seeing compared to dirt so far is that speed figures count for less and consistency counts for more (especially Wins). That's sort of what I already knew from years of trial and error handicapping, but it's always nice when the math verifies what you think. The one thing I am missing in my database is a good "late" rating.

ARAZI91 · 12-24-2023, 03:56 PM

There are many weighting-type systems, but because racing has a myriad of conditions (surface, distance, ability, different jockey/trainer pools, etc.), have you ever tried entropy-type weighting or anything from the family of algorithms known as MCDM/A (Multi-Criteria Decision Making/Analysis)? TOPSIS springs to mind, as the distribution and dispersion of numerical values has a big influence. I have found these types of weighting systems better than a regression-type global model, which tends to give "fixed" weights.

Topsis - see pic and download this https://www.researchgate.net/publica...ling_solutions

Shannon Entropy Weights 1 and 2 - see pics

Lots of decent Youtube tutorials on MCDM/A type methods for weighting and decision making and analysis - used a lot in various large industries but most of them have excellent utility for horseracing and sports - it's an untapped field.
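As a rough illustration of the two methods named above, the sketch below computes Shannon entropy weights for a small field and then ranks the field with TOPSIS. The three criteria, the numbers, and the function names are hypothetical. It assumes every value is positive, every column has some spread, and that the weights are recomputed from each race's own field, which is one reading of why they are not "fixed".

```python
import numpy as np

def entropy_weights(X):
    """Shannon entropy weights: columns with more dispersion get more weight."""
    X = np.asarray(X, float)
    m = X.shape[0]
    P = X / X.sum(axis=0)                       # column-wise proportions
    logP = np.log(P, where=P > 0, out=np.zeros_like(P))  # treat 0*log(0) as 0
    E = -(P * logP).sum(axis=0) / np.log(m)     # entropy per criterion, in [0, 1]
    d = 1.0 - E                                 # degree of diversification
    return d / d.sum()

def topsis(X, weights, benefit):
    """TOPSIS closeness scores in [0, 1]; higher is closer to the ideal."""
    X = np.asarray(X, float)
    R = X / np.sqrt((X ** 2).sum(axis=0))       # vector normalisation
    V = R * np.asarray(weights, float)          # weighted normalised matrix
    benefit = np.asarray(benefit, bool)
    best = np.where(benefit, V.max(axis=0), V.min(axis=0))
    worst = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_best = np.sqrt(((V - best) ** 2).sum(axis=1))
    d_worst = np.sqrt(((V - worst) ** 2).sum(axis=1))
    return d_worst / (d_best + d_worst)

# Hypothetical field: rows are horses; columns are
# speed figure (higher is better), win rate (higher is better),
# days since last race (treated here as lower is better).
field = np.array([
    [90, 0.30, 20],
    [80, 0.10, 60],
    [85, 0.20, 40],
])
w = entropy_weights(field)
scores = topsis(field, w, benefit=[True, True, False])
print("weights:", np.round(w, 3))
print("closeness:", np.round(scores, 3))
```

A criterion on which the whole field looks alike gets a weight near zero, so the same factor can matter a lot in one race and hardly at all in the next.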

classhandicapper · 12-25-2023, 08:56 AM

Quote:

Originally Posted by ARAZI91

There are many weighting-type systems, but because racing has a myriad of conditions (surface, distance, ability, different jockey/trainer pools, etc.), have you ever tried entropy-type weighting or anything from the family of algorithms known as MCDM/A (Multi-Criteria Decision Making/Analysis)? TOPSIS springs to mind, as the distribution and dispersion of numerical values has a big influence. I have found these types of weighting systems better than a regression-type global model, which tends to give "fixed" weights.

Topsis - see pic and download this https://www.researchgate.net/publica...ling_solutions

Shannon Entropy Weights 1 and 2 - see pics

Lots of decent Youtube tutorials on MCDM/A type methods for weighting and decision making and analysis - used a lot in various large industries but most of them have excellent utility for horseracing and sports - it's an untapped field.

I'm unfamiliar with that math. Maybe I'll take a look if I ever have time.

What I've been doing is breaking out the data by sprint, route, turf, dirt and getting separate weights for each.
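A minimal sketch of that "separate weights per category" approach in Python, for comparison with doing it by hand in Excel. The segment labels, the two factors, and the synthetic numbers are made up for illustration; `weights_by_segment` is a hypothetical helper, not part of any database described in this thread.

```python
import numpy as np

def weights_by_segment(segments, X, y):
    """Fit an ordinary least-squares model (with intercept) per segment.
    Returns {segment: array([intercept, w1, w2, ...])}."""
    segments = np.asarray(segments)
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    out = {}
    for seg in np.unique(segments):
        mask = segments == seg
        A = np.column_stack([np.ones(mask.sum()), X[mask]])  # add intercept column
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        out[str(seg)] = coef
    return out

# Synthetic example: two factors (say, a speed figure and a win count)
# whose true weights differ by segment. All values are invented.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
segments = np.array(["dirt_sprint"] * 20 + ["turf_route"] * 20)
y = np.where(segments == "dirt_sprint",
             1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1],
             1.0 + 1.0 * X[:, 0] + 1.5 * X[:, 1])

for seg, coef in weights_by_segment(segments, X, y).items():
    print(seg, np.round(coef, 2))
```

Each segment needs enough races of its own for the weights to be stable, so finer breakouts trade bias for noise.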

MJC922 · 12-25-2023, 10:22 AM

Just my opinion but when it comes to horse racing, regression has always seemed to be a flat kind of two dimensional way of fitting together various data points. With that being said as long as you have a good solid insight into which horses are live and taking money in a race then regression can probably work fairly well anyway, with the point being that it's the 'taking money' part of it which is probably allowing for any sort of profitability in the first place.

What I'm fairly sure is a dead-end street at the moment is generating an odds line using regression, contrasting that line with the public's odds, and then betting on the largest percentage overlay. IMO that's a good way to land on a lot of dead horses which oftentimes aren't even going to be well-meant, tbh.

The way my brain works to handicap races is more along the lines of multiple factors not necessarily being weighted and then adding up but rather multiple factors acting in a synergistic way. I don't believe regression is the ideal technique to pick up on those synergies. Something more along the lines of market basket analysis would fit the description better. It will probably lead into much more of a spot-playing approach but when going up against the teams I believe that's where most of the remaining edges are going to be.
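To make the market-basket idea concrete, here is a small Python sketch that scores combinations of yes/no factors by their lift, meaning the win rate when all flags in the combination are present divided by the overall win rate. The flag names and records are invented, and `combo_lift` is a hypothetical helper; it assumes at least one winner in the sample.

```python
from itertools import combinations

def combo_lift(records, min_support=2, max_size=2):
    """records: list of (set_of_flags, won) pairs.
    Returns (combo, support, win_rate, lift) tuples sorted by lift, highest first."""
    n = len(records)
    base_rate = sum(won for _, won in records) / n   # overall win rate
    flags = sorted(set().union(*(f for f, _ in records)))
    out = []
    for size in range(1, max_size + 1):
        for combo in combinations(flags, size):
            wanted = set(combo)
            hits = [won for f, won in records if wanted <= f]
            if len(hits) >= min_support:             # skip rare combinations
                win_rate = sum(hits) / len(hits)
                out.append((combo, len(hits), win_rate, win_rate / base_rate))
    return sorted(out, key=lambda r: r[3], reverse=True)

# Invented sample: each flag alone is mildly useful, together they are strong.
records = (
    [({"lone_speed", "class_drop"}, True)] * 3
    + [({"lone_speed"}, False)] * 3 + [({"lone_speed"}, True)]
    + [({"class_drop"}, False)] * 3 + [({"class_drop"}, True)]
    + [(set(), False)] * 4
)

for combo, support, win_rate, lift in combo_lift(records):
    print(combo, support, round(win_rate, 2), round(lift, 2))
```

With real data the support threshold matters a great deal, since small samples throw up high-lift combinations by chance, which is where the out-of-sample testing mentioned earlier in the thread comes in.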

Saratoga · 12-25-2023, 10:46 AM

The only issue with watching the tote: you don't know if it's the whales or legit bet-downs...

Maybe early betting can give some insights... late betting won't help because of the whales...

There was a Delta race last week where early betting had a 3/5 opening line on a 6-1 morning line horse, then drifted up to 8-1 final...

MJC922 · 12-25-2023, 10:52 AM

Quote:

Originally Posted by Saratoga

The only issue with watching the tote: you don't know if it's the whales or legit bet-downs...

Maybe early betting can give some insights... late betting won't help because of the whales...

There was a Delta race last week where early betting had a 3/5 opening line on a 6-1 morning line horse, then drifted up to 8-1 final...

Have to use DD probables from the first leg. I agree late betting is going to be a mix of some barn money and a whole lot of whale money piling on to it. Fixed odds if we ever get them will put an end to all of that nonsense. They'll have to do it with just the model and no insight into what's getting bet (i.e. what's medicated). Just the model by itself is nothing to be feared I can promise you that.

classhandicapper · 12-25-2023, 12:30 PM

Quote:

Originally Posted by MJC922

The way my brain works to handicap races is more along the lines of multiple factors not necessarily being weighted and then adding up but rather multiple factors acting in a synergistic way.

Can you give me a theoretical example of something like this without giving away anything valuable to your own gambling?

MJC922 · 12-25-2023, 02:49 PM

Quote:

Originally Posted by classhandicapper

Can you give me a theoretical example of something like this without giving away anything valuable to your own gambling?

There are some factors that I look at when certain criteria are met which indicate to me that today is probably a throwaway race for a horse. So that's one example (intent) because those factors probably would not be 'worth' the same in a different context or even be looked at as related for that matter. Race flow is another thing I think can be difficult to get a good handle on in some cases using regression.

Take outside posts in two-turn routes when, say, there are 5 or 6 runners inside of a pressing horse with early pace ratings very close to its own. It's going to be tricky to bring that degree of context using regression, and yet as soon as they hit the first turn and the horse is pressing in the four path, it might as well be 15-1 no matter what else it had going for it. Some things, like post positions, can appear subtle on paper and test that way, and in the typical race they might be, but in certain scenarios something not normally decisive suddenly becomes that way. I think we need to uncover and understand those factors which are at times decisive but which don't show up quite so well in the usual regression analysis.

sjk · 12-25-2023, 04:27 PM

Quote:

Originally Posted by classhandicapper

Can you give me a theoretical example of something like this without giving away anything valuable to your own gambling?

Linear models are great at comparing the attributes of the runners but they don't capture the pace dynamics.

I start by making a line using the various runners' past races, but after that part is done the line needs to be adjusted based on how the runners fit into the pace dynamics.

A horse that can get clear gets a serious upgrade and the deep closers get downgraded. All done by objective tables calculated many years ago.

classhandicapper · 12-26-2023, 08:52 AM

Thanks. I agree with both of you. I just wasn't sure what kinds of things we were talking about.

My intent was to eventually do some regression tests with different projected pace scenarios to see how that impacted the results given I have a numeric running style in my data, but I haven't gotten around to it yet.

I'm mostly doing this on and off as an effort to learn and refine my thinking as opposed to making automated plays off an odds line. I haven't even really tried to turn any of this into an automated odds line yet.