
View Full Version : Strange stats on Breakers


Ray2000
06-22-2013, 09:59 AM
Like traynor, I've been looking through the past performance lines of this year's starters and came across something I can't explain. The robot looks at all pplines not older than 40 days and labels the starter a "bad actor" if any of the lines show a break. (It actually calculates BRK%: '1 of 2' or '3 of 6' lines is 50%.) My actual program will bet against bad actors, or toss the race out completely if there are more than 3 bad actors in a race.

Surprise...

All tracks:
All starters  136,228   Winners 17,897 (13.14%)   ROI -25.19%**
Bad actors     15,560   Winners  2,157 (13.86%)   ROI -17.56%


Hard to figure, breakers win more? :confused: :confused: , once again counter-intuitive


**these Winners and ROIs are not for actual races, as many of these starters are in the same race. They are based on the percentage of all lines.
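A quick way to see whether a 13.86% vs. roughly 13% gap is more than noise is a two-proportion z-test. This is only a sketch in Python using the counts posted above. The "everyone else" group is simply all starters minus the bad actors, and the test assumes independent starters, which the footnote above says is not strictly true since many of these starters are in the same race.

```python
import math

def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """Return (z, two-sided p) for the difference between two win rates."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# Counts from the post: bad actors vs. everyone else.
bad_n, bad_wins = 15_560, 2_157
rest_n, rest_wins = 136_228 - bad_n, 17_897 - bad_wins

z, p = two_proportion_z(bad_wins, bad_n, rest_wins, rest_n)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Because of the same-race dependence, treat the p-value as a rough guide rather than a firm result.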





Significance .0155
Call:
glm(formula = CHARTWON ~ +BRKPERCENT, family = binomial, data = z)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.5621  -0.5291  -0.5291  -0.5291   2.0176

Coefficients:
             Estimate Std. Error  z value Pr(>|z|)
(Intercept) -1.895320   0.008472 -223.727   <2e-16
BRKPERCENT   0.162378   0.067075    2.421   0.0155

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 105983 on 136227 degrees of freedom
Residual deviance: 105978 on 136226 degrees of freedom
AIC: 105982
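To put those coefficients in plain win-percentage terms, the fitted logit can be inverted. A minimal Python sketch, assuming BRKPERCENT was fed to the model as a fraction between 0 and 1 (the output above does not state the scale, so that is an assumption):

```python
import math

INTERCEPT = -1.895320   # (Intercept) estimate from the glm output above
SLOPE = 0.162378        # BRKPERCENT estimate from the glm output above

def win_probability(brk_fraction):
    """Inverse logit of the fitted linear predictor."""
    eta = INTERCEPT + SLOPE * brk_fraction
    return 1.0 / (1.0 + math.exp(-eta))

for brk in (0.0, 0.5, 1.0):
    print(f"BRK fraction {brk:.1f}: fitted win probability {win_probability(brk):.4f}")
```

Under that assumption the model puts a starter with no breaks at about 13.1% and a '1 of 2' breaker at about 14.0%, so the effect is statistically detectable but small.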

DeltaLover
06-22-2013, 10:03 AM
Ray,

I think what you discovered is a very typical situation where the public falls for an illusion, which creates a profitable situation. It is a special case though, since I would expect at least the winning percent to be larger, but in this case it seems the public is completely wrong.

wilderness
06-22-2013, 10:24 AM
Hard to figure, breakers win more? , once again counter-intuitive

Hey Ray,
Might you have the capability to extend the stats further by stable and/or trainer?

Given the likelihood of larger stables these days, a break results in a closer look at any issues (old or new) the horse may have, and generally speaking, corrections/adjustments are imminent.

Decades ago there were more caretakers per horse in the barns than there are today. As a result, in the old days, issues were less likely to be missed.
It's not uncommon these days for a caretaker to oversee five or more horses due to how the procedures have changed.

traynor
06-22-2013, 10:41 AM
You are separating trotters and pacers?

Ray2000
06-22-2013, 10:56 AM
Don, traynor

I do have horses' trainer and gait but haven't reduced it down any further as yet.

more work.... :)

DeanT
06-22-2013, 11:00 AM
The ROI being higher is something we might expect - so many say "I will not bet a breaking trotter off a qualifying line no matter what" type stuff. But the impact values are higher too? That is a head scratcher.

Ray2000
06-22-2013, 11:14 AM
Yes Dean

now that I've thought about it a bit more, maybe the Win% number is akin to the Silky Sullivan effect
.....when you do lose, lose big time.? not sure :confused:

traynor
06-22-2013, 02:44 PM
Specifically, are you separating the breakers in pace races from the breakers in trot races? They are fundamentally different kinds of races when you are studying breaks. Mixing them tends to generate weird numbers.

Ray2000
06-22-2013, 02:50 PM
traynor, I have the pacers and trotters separated now, not much of a difference that I can see.

Pacers
All starters  101,768   Winners 13,333 (13.10%)   ROI -24.96%
Bad actors      6,954   Winners    966 (13.89%)   ROI -18.16%

Trotters
All starters   34,460   Winners  4,568 (13.26%)   ROI -25.87%
Bad actors      8,622   Winners  1,192 (13.83%)   ROI -17.19%
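The win rates line up closely across gaits, but the share of starters that get flagged does not. A small Python sketch using only the counts above:

```python
# Starter counts copied from the pacer/trotter breakdown above.
counts = {
    "Pacers":   {"starters": 101_768, "bad_actors": 6_954},
    "Trotters": {"starters": 34_460,  "bad_actors": 8_622},
}

def bad_actor_share(gait):
    """Fraction of a gait's starters that were flagged as bad actors."""
    c = counts[gait]
    return c["bad_actors"] / c["starters"]

for gait in counts:
    print(f"{gait}: {bad_actor_share(gait):.1%} of starters flagged as bad actors")
```

That works out to roughly 7% of pacers and 25% of trotters, so breaks are far more common in the trot lines even though flagged starters win at about the same rate in both groups.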

Don
I'll table the numbers by trainer, but the sample size is too low when broken down this way.
I only have this year's races. Showing trainers with a minimum of 100 starters overall and a minimum of 20 bad actors.


Trainer                  Breakers Wins  Win%   ROI%  All Starters Wins  Win%   ROI%
Hall, Michael                  28   11   39%    74%           297   60   20%   -20%
Palone, Michael                31   10   32%    32%           179   50   28%    -2%
Mungillo, John                 25    8   32%   -22%           305   55   18%   -32%
Hundertpfund, Joseph Jr        25    8   32%   -22%           257   62   24%   -12%
Copley, Jamie G                26    8   31%   -13%           192   45   23%   -17%
Beckwith, Melissa              33   10   30%   163%           225   46   20%    24%
Gill, Daniel                   30    9   30%   -37%           317   84   26%   -20%
Mc Donald, Gary                20    6   30%    -5%           149   39   26%    14%
Burke, Ronald                 165   49   30%     7%          1265  342   27%    -9%
Callahan, Nicholas             22    6   27%   133%           232   29   13%   -11%
Wallace, Benjamin              22    6   27%    47%           216   38   18%   -24%
Allard, Rene M                102   27   26%   -13%           710  177   25%   -14%
Riegle, Bruce                  34    9   26%    11%           154   25   16%   -31%
Henriksen, Per                 40   10   25%   -25%           123   29   24%   -14%
Brainard, Tracy                21    5   24%   -32%           134   39   29%   -13%
Naedele, Christopher           22    5   23%   -37%           157   25   16%   -26%
Campbell, Gary                 22    5   23%   227%           107   11   10%   -10%
Moreau, Richard                76   17   22%    15%           574  127   22%   -13%
Graham, James                  27    6   22%   106%           278   40   14%    29%
Conn, Harla                    33    7   21%   -53%           270   65   24%   -26%
Johnson, Bob                   33    7   21%    -3%           193   24   12%   -52%
Sarama, Gerry                  48   10   21%   -39%           268   49   18%   -16%
Bickmore, Randy                24    5   21%   -43%           143   33   23%   -10%
Blumenfeld, Paul               20    4   20%    22%           258   26   10%   -21%
Davis, Dylan                   31    6   19%    -7%           217   48   22%   -15%
Asher, Kimberly                26    5   19%   213%           106    9    8%    16%
Shahan, Robert                 21    4   19%    -8%           101    9    9%   -29%
Hollar, Calvin                 27    5   19%    85%           273   37   14%   -44%
Ehrhardt, Kevin                38    7   18%   -13%           183   35   19%   -23%
Puddy, Victor                  22    4   18%   -31%           328   56   17%    10%
Affrunti, Angie                22    4   18%   -49%           204   50   25%   -22%
Stimer, Bart                   28    5   18%   197%           107    8    7%     6%
Stockwell, Sue                 28    5   18%    41%           107   12   11%   -11%
Mc Caffrey, David              34    6   18%   -40%           251   39   16%   -30%
Wright, Gregory                23    4   17%   -15%           116    7    6%   -37%
Miller, Ervin                  29    5   17%   -32%           257   54   21%   -28%
Crissman, Crissy               24    4   17%   -56%           162   19   12%   -58%
Marashian, Marcus              31    5   16%   -34%           154   24   16%    -2%
Morgan, Virgil                 25    4   16%   -39%           394   95   24%   -21%
Rankin, Jennifer L             25    4   16%   -66%           119   19   16%   -42%
Demers, Gerard                 32    5   16%    -1%           193   31   16%    -2%
Sherman, Kent                  52    8   15%    -8%           300   46   15%     2%
Harder, Mark                   26    4   15%   -48%           138   24   17%    -2%
Clabaugh, Larry                59    9   15%   -42%           269   41   15%   -45%
Wrubel, Gail                   20    3   15%   -14%           162   28   17%   -15%
Di Domenico, Scott             27    4   15%   -16%           187   32   17%   -32%
Schnittker, Ray                55    8   15%   -50%           217   29   13%   -48%
Zendt, William                 21    3   14%   -67%           136   24   18%    13%
Mc Donald, Jim                 22    3   14%   -64%           181   28   15%   -51%
Smith, Jack                    23    3   13%   -31%           139   19   14%   -11%
Zeron, Richard                 23    3   13%   -28%           117   13   11%   -39%
Armstrong, Barry               40    5   13%   -23%           121   11    9%   -52%
Wengerd, John G                32    4   13%   -59%           182   22   12%   -28%
Simpson, Brandon               24    3   13%   -11%           252   26   10%   -15%
Beaudoin, Jacques              24    3   13%   -54%           126   11    9%   -51%
Wengerd, John                  33    4   12%   -60%           223   29   13%   -28%
Miller, Julie                  33    4   12%    -5%           146   21   14%   -21%
Deters, Michael                33    4   12%   -62%           120   16   13%   -47%
Harmon, Robert W               35    4   11%   -51%           263   37   14%   -30%
Baillargeon, Benoit O          53    6   11%   -43%           188   24   13%   -53%
Johnson, Allan                 28    3   11%   -80%           163   18   11%   -33%
Plano, Richard                 39    4   10%   -56%           241   44   18%   -24%
Downey, Ian C                  39    4   10%   -58%           214   23   11%   -14%
Ford, Mark                     40    4   10%    65%           661   86   13%   -28%
Messenger, Gary                40    4   10%   -30%           305   51   17%     0%
Pfister, Don                   40    4   10%   -74%           168   23   14%   -38%
Bendis, Randy                  20    2   10%   -12%           338   57   17%   -20%
Phillips, Robert               20    2   10%   -36%           165   22   13%   -27%
Smith, Joel                    22    2    9%   -62%           124   24   19%   -13%
Ohol, Brenda                   34    3    9%    84%           256   15    6%    -6%
Perrin, John                   57    5    9%   -71%           193   14    7%   -70%
Cassavaugh, Randy              26    2    8%   -77%           164   13    8%   -41%
Fahy, William                  31    2    6%   -45%           117   11    9%   -36%
Gillock, Richard D             35    2    6%   -65%           190   20   11%   -30%
Laterza, Dennis                20    1    5%   -90%           191   18    9%   -39%
Posner, Michael                20    1    5%   -76%           173   10    6%   -65%
Croghan, Ross                  20    1    5%   -83%           132   17   13%   -15%
Lareau, Gaston                 20    1    5%   -38%           105    4    4%   -69%
Petrelli, Frank                21    1    5%   -66%           129   26   20%   -13%
Morales, Jose                  23    1    4%   -85%           146   19   13%   -26%
Cassano, Joseph                25    1    4%   -90%           112    3    3%   -70%
Snyder, Dane                   28    1    4%   -48%           165   29   18%   -11%
Snyder, Doug                   31    1    3%   -48%           140   16   11%   -12%
Robinson, Shawn R              22    0    0%  -100%           151   14    9%   -47%



I'm working on Trainer switch after a claim but once again not many starters to work with.

if there's a prominent trainer not shown it's because he's not in my license lookup list.
That needs updated too :)

lamboguy
06-22-2013, 02:53 PM
those are some great stats you guys are coming up with. you are going against conventional opinions, which are usually very good!

let's put it this way, the way i am going so far today, i need a good DOCTOR!

Ray2000
06-22-2013, 02:58 PM
lambo

I recommend Doctor Who.......with a time machine.. :D

traynor
06-22-2013, 03:24 PM
I have not kept track of breaks as a factor for several years, mainly because I only handicap races from a database of handicapped races (not from past performances). By the time I see the data, it has all been cleaned, tweaked, changed, and whatever else. It doesn't look much like "normal" past performances at all.

The numbers look strange, because when I DID track breaks, they were fairly normal in trot races, and relatively rare in pacers. Your numbers suggest they are almost identical. That is really weird, because almost everything I track shows different data for pacers and trotters.

Ray2000
06-22-2013, 04:05 PM
Yeah, I'm not ready to 'hang my hat' on this. Calling a starter a bad actor because of 1 break in 2 pplines, when that's all that's available, is shaky. But that should be a small percentage of the database. Oh well, it deserves more study, because my presently "in use" robot might be throwing away some positive bets...

pandy
06-22-2013, 10:19 PM
I've had a few takes on breakers in my columns over the years. One, obvious, don't bet on 2yo trotters that are the favorite because a lot of them break, especially fillies.

Two, in general, don't bet favorites that show recent breaks.

Three, generally speaking, I don't have a problem betting a horse that broke in its last start if the value is right (normally you get paid for taking the chance, because the break pushes the odds higher). The reason I say this: over the long haul, if you only bet overlays, or even if you just avoid obvious underlays, it really doesn't matter whether you bet on horses that show breaks or not. My reasoning is simple: if a horse wins 25% of its starts, does it matter if it breaks occasionally? For instance, say you have two trotters racing in the same class, each with 50 starts and 15 wins. One has never made a break and the other has made 5 breaks. What's the difference? The breakers you want to avoid are the ones with low win percentages.
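The overlay reasoning can be put into numbers. A minimal Python sketch, assuming (naively) that a trotter's career win rate is its true chance of winning today, and using made-up prices of 2-1 and 3-1 purely for illustration:

```python
def fair_odds(win_prob):
    """Odds-to-1 at which a win bet breaks even."""
    return (1.0 - win_prob) / win_prob

def expected_return(win_prob, odds_to_one):
    """Expected profit per $1 bet at the given odds-to-1."""
    return win_prob * odds_to_one - (1.0 - win_prob)

p = 15 / 50   # both trotters in the example: 15 wins in 50 starts
print(f"fair odds: {fair_odds(p):.2f}-1")
for odds in (2.0, 3.0):   # hypothetical prices, not from the thread
    print(f"at {odds:.0f}-1: expected return {expected_return(p, odds):+.2f} per $1")
```

Under those assumptions the break history never enters the calculation. Only the win probability and the price do, which is the point being made above.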

traynor
06-22-2013, 11:29 PM
Be nice to them when you first encounter them, and be sure you can find them again.

I met a nice gentleman at the summer harness meet at Bay Meadows many years ago. He asked (of course), "Who do you like in this race?" I responded confidently, "Roxie's Fiddler." He said, "Yeah, I know. But what happens if Roxie's Fiddler breaks?" Roxie's Fiddler showed one break four races back. Why he would think it would happen again is unknown (and still is, at least to me).

We had an interesting discussion about alternatives, but--being young and foolish and overly impressed with the "wisdom of crowds" I bet rather heavily on the favorite--Roxie's Fiddler--who seemed unbeatable.
http://www.allbreedpedigree.com/roxies+fiddler

Of course, Roxie's Fiddler broke, the exacta combination the nice gentleman had recommended produced a major score (presumably for him, certainly not for me) and I spent the rest of the season wandering the turf club and grandstands looking for him. Oh, well. So much for not listening to others.

This is copied from another thread, but the point is relevant. Sometimes it is wise to consider what might happen if the primary contender breaks. I agree with Pandy that trying to divine whether a horse will break or not is not something to agonize over. However, it might be nice to have a backup plan in the event it does.

Ray2000
06-23-2013, 05:16 AM
Good advice Pandy, traynor

here's a gimmick I posted back in August of '08....OMG Have I been posting that long??? :)

http://www.paceadvantage.com/forum/showthread.php?t=50155

wiffleball whizz
06-23-2013, 05:50 AM
Wow I'm trying to comprehend this thread a lot of numbers here.....but I do appreciate the analysis......

Question.....say a program page or horse has 8 lines and the horse shows 2 breaks a win and 5 races on a half with post 6,7 or 8 does that count for anything...

Or if a horse has 8 starts, no breaks, but always draws the 1, 2 or 3 hole.....was wondering if posts were under consideration, cuz an 8 hole may just be a trip around the track without much risk of snapping one off

Ray2000
06-23-2013, 08:46 AM
Question.....say a program page or horse has 8 lines and the horse shows 2 breaks a win and 5 races on a half with post 6,7 or 8 does that count for anything...



For every starter this year (up to Jun 10), at almost every track available for internet wagering, I assemble a spreadsheet row (record) with the usual info on date, driver, trainer, etc. The robot also looks at the pplines for the 40 days before race day and averages the numbers for speed, position change from post to half, lengths behind at finish, and a bunch of other stuff. For break% it counts the number of pplines showing an "X", including qualifiers, and records this as a percentage.

I did not consider any other factors such as the post position, tracksize, drivers, dr switches, etc. just whether or not it was a winning line, and if so what'd it pay.
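For anyone who wants to reproduce the break% step, here is a rough Python sketch of the rule as described: lines from the 40 days before race day, qualifiers included, counted as a percentage. The `PPLine` record and the function names are hypothetical and not taken from the actual robot.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class PPLine:
    race_date: date
    broke: bool              # True if the line shows an "X"
    qualifier: bool = False  # qualifiers are counted too, per the description

def break_percent(lines, race_day, window_days=40):
    """Percent of lines inside the window before race_day that show a break.

    Returns None when no lines fall inside the window.
    """
    cutoff = race_day - timedelta(days=window_days)
    recent = [ln for ln in lines if cutoff <= ln.race_date < race_day]
    if not recent:
        return None
    return 100.0 * sum(ln.broke for ln in recent) / len(recent)

def is_bad_actor(lines, race_day):
    """A starter is a bad actor if any line in the window shows a break."""
    pct = break_percent(lines, race_day)
    return pct is not None and pct > 0

race_day = date(2013, 6, 10)
lines = [
    PPLine(date(2013, 6, 3), broke=False),
    PPLine(date(2013, 5, 20), broke=True, qualifier=True),
    PPLine(date(2013, 4, 1), broke=True),   # older than 40 days, ignored
]
print(break_percent(lines, race_day), is_bad_actor(lines, race_day))
```

In this example the April line falls outside the window, so the starter is a '1 of 2' breaker.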

BTW looking at it with some further detail gives...


Breaks   Starters    Wins   Win%   ROI%
1 in 6       2383     351    15%   -23%
1 in 5       4457     617    14%   -21%
1 in 4       4286     567    13%   -23%
1 in 3       7963    1091    14%   -16%
1 in 2        773      99    13%     1%

none       120668   15743    13%   -26%


so apparently the crowd underbets the worst of them.

traynor
06-23-2013, 08:47 AM
Wow I'm trying to comprehend this thread a lot of numbers here.....but I do appreciate the analysis......

Question.....say a program page or horse has 8 lines and the horse shows 2 breaks a win and 5 races on a half with post 6,7 or 8 does that count for anything...

Or if a horse has 8 starts, no breaks, but always draws the 1, 2 or 3 hole.....was wondering if posts were under consideration, cuz an 8 hole may just be a trip around the track without much risk of snapping one off

Track size could definitely make post position a factor in how seriously one takes a race as an example of potential, in conjunction with breaks (on the assumption that inner posts equate to more exertion/effort/try, and outer posts the opposite). That is, if all the "no break" races are from outer posts, and all the breaks are from inner posts, there may be a direct correlation. The problem with that logic is the (excruciatingly small) sample size, which may be (and probably is) insufficient for judgment calls about whether a particular horse will break in a particular race. With that caveat, it could be a VERY useful factor to study in depth over many horses in many races to see if there is any correlation.

traynor
06-23-2013, 09:00 AM
One area that might need some fine tuning is the race selection criteria. For "habitual" breakers, qualifying races are intended to demonstrate the ability to make it around the track without breaking--with little regard for any other performance indicator. Including qualifying races with betting races may distort the significance of the results.

If I were doing the data mining, I would be tempted to code something to distinguish "blocks" of races, in which a race with a break, followed by a qualifier, followed by a race without a break would be treated/evaluated using a different set of criteria than "three races, one break." There are a number of other similar "layering" processes you might use.
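One possible reading of that "block" idea, sketched in Python: scan a horse's lines oldest-first and flag the sequence break in a betting race, then a qualifier, then a betting race without a break. The `(broke, qualifier)` tuple layout and the function name are assumptions made for illustration only.

```python
def has_requalify_block(lines):
    """True if a break in a race is followed by a qualifier and then a clean race.

    `lines` is ordered oldest-first; each item is a (broke, qualifier) pair.
    """
    for first, second, third in zip(lines, lines[1:], lines[2:]):
        broke_in_race = first[0] and not first[1]
        then_qualifier = second[1]
        then_clean_race = (not third[0]) and (not third[1])
        if broke_in_race and then_qualifier and then_clean_race:
            return True
    return False

# Break, qualifier, clean race: the block described above.
requalified = [(True, False), (False, True), (False, False)]
# Plain "three races, one break" with no qualifier in between.
three_races_one_break = [(True, False), (False, False), (False, False)]

print(has_requalify_block(requalified), has_requalify_block(three_races_one_break))
```

Starters matching the block could then be tabulated separately from plain breakers to see whether the two groups really perform differently.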

It looks like you are doing some really interesting research that could benefit anyone who wagers on harness races. Stated explicitly or not, I think all of those who read about your research appreciate the time and effort you put into it, as well as the spirit in which it is freely presented to those who need it most.

Ray2000
06-23-2013, 10:32 AM
thx traynor

looking for those patterns you've described is something I'd like to do.

wiffleball whizz
06-23-2013, 12:44 PM
I'll tell you a track where this can really be a factor: Ocean Downs.....I've talked to a few horsemen who say the track is unlike anything they have seen before......the track is close to being a circle, and outside horses or a horse trying to come off the back have zero shot......and they've got cheap trotters off big layoffs here.....it could very well be worth looking at when you're applying numbers here.....

This thread actually has me wanting to bet trot races more now....

Again thanks for generating these stats here....and take a look at ocean downs the bias there is the worst in racing

Longshot6977
06-23-2013, 07:19 PM
....and take a look at ocean downs the bias there is the worst in racing

Here is a link to the post position stats.

http://www.oceandowns.com/pdf/PostPositionStats.pdf

Percentage of Winning Favorites is 50.47%

wiffleball whizz
06-23-2013, 07:29 PM
Here is a link to the post position stats.

http://www.oceandowns.com/pdf/PostPositionStats.pdf

Percentage of Winning Favorites is 50.47%

Surprised the 8 actually has that many wins.....and as is typical of half-mile track racing, the 4 shows to be a great starting spot.....I guess that is the typical leave, let the 1 retake, and see if you can get him in the lane...

Sitting home watching OD on the phone, thinking about heading over to the Borgata to bet some harness.....tons of harness tonight