Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 07-23-2017, 05:11 PM   #1
Dan Montilion
Registered User
 
Join Date: Dec 2002
Posts: 1,013
Big Data... Thick Data

https://www.ted.com/talks/tricia_wan..._from_big_data

I found this to be a very informative handicapping presentation, albeit not about handicapping. I look forward to the thoughts of others, if it produces any. Hypothetically, and in the simplest of terms: big data shows best effort if run back in 28 days; thick data notes the horse was entered on day 27, the race did not fill, and it was written back on day 30 and filled.
__________________
"Your body is not a temple, it's an amusement park. Enjoy the ride."

Anthony Bourdain
Old 07-23-2017, 06:11 PM   #2
Jeff P
Registered User
 
 
Join Date: Dec 2001
Location: JCapper Platinum: Kind of like Deep Blue... but for horses.
Posts: 5,257
Thanks for posting that. (I really enjoyed watching the presentation.)

Ok. Sticking with your days since last raced example...

Suppose, hypothetically, that big data suggests optimal returns occur (could be thousands of parimutuel tickets cashed or thousands of checks for purse money earned) when race day occurs on the 28th day after the most recent start.

It should be obvious that thick data -- if put in the right context -- has the ability to completely overrule whatever observations might have been gleaned from big data.

Big data example: You have a database and can generate large sample stats for horses returning off a 180 day layoff since their most recent start.

Thick data example: You have insight into what transpired during the layoff.

What if you are able to make the thick data observation that a specific horse was turned out to a private farm for six months? And was given steroids and worked vigorously on the half mile track there?

And shows up in the paddock today carrying muscle mass and confidence he didn't have before?

When you are able to connect the dots in a thick data way you'd be crazy to just blindly go with your big data model.
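As a rough sketch of the big-data side of this (a hypothetical Python example with made-up field names, not JCapper code), a large-sample tally of win rate and $2 ROI by days since the most recent start might look like:

```python
from collections import defaultdict

def roi_by_days_away(starts):
    """Aggregate win rate and $2 ROI by days since the most recent start.

    starts: dicts with 'days_away', 'won' (bool), and 'win_payoff'
    (the $2 win mutuel, 0.0 if the horse lost).
    Returns {days_away: (n_starts, win_pct, roi)}.
    """
    buckets = defaultdict(lambda: [0, 0, 0.0])   # [starts, wins, payoffs]
    for s in starts:
        b = buckets[s["days_away"]]
        b[0] += 1
        b[1] += int(s["won"])
        b[2] += s["win_payoff"]
    return {d: (n, wins / n, paid / (2.0 * n))
            for d, (n, wins, paid) in buckets.items()}

# Toy sample -- real use would pull thousands of rows from a database.
sample = [
    {"days_away": 28,  "won": True,  "win_payoff": 6.40},
    {"days_away": 28,  "won": False, "win_payoff": 0.0},
    {"days_away": 180, "won": True,  "win_payoff": 4.00},
    {"days_away": 180, "won": False, "win_payoff": 0.0},
]
stats = roi_by_days_away(sample)
```

The thick-data observation (the private-farm layoff, the muscle mass in the paddock) never appears in a table like this, which is exactly the point.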


-jp

.
__________________
Team JCapper: 2011 PAIHL Regular Season ROI Leader after 15 weeks
www.JCapper.com

Last edited by Jeff P; 07-23-2017 at 06:17 PM.
Old 07-23-2017, 07:04 PM   #3
zerosky
Registered User
 
Join Date: Feb 2004
Location: uk
Posts: 369
Interesting lecture. I just wish they would stop using the term "big data"; it's statistics!
I found some good insights on the following pages.
http://psychclassics.yorku.ca/topic.htm
Old 07-24-2017, 06:02 AM   #4
Gamblor
Registered User
 
Join Date: Oct 2012
Posts: 75
So what's better? Big or thick? How about both big AND thick?
Old 07-24-2017, 07:39 AM   #5
acorn54
Registered User
 
Join Date: Dec 2003
Location: new york
Posts: 1,629
Quote:
Originally Posted by Jeff P View Post
Thanks for posting that. (I really enjoyed watching the presentation.)

Ok. Sticking with your days since last raced example...

Suppose, hypothetically, that big data suggests optimal returns occur (could be thousands of parimutuel tickets cashed or thousands of checks for purse money earned) when race day occurs on the 28th day after the most recent start.

It should be obvious that thick data -- if put in the right context -- has the ability to completely overrule whatever observations might have been gleaned from big data.

Big data example: You have a database and can generate large sample stats for horses returning off a 180 day layoff since their most recent start.

Thick data example: You have insight into what transpired during the layoff.

What if you are able to make the thick data observation that a specific horse was turned out to a private farm for six months? And was given steroids and worked vigorously on the half mile track there?

And shows up in the paddock today carrying muscle mass and confidence he didn't have before?

When you are able to connect the dots in a thick data way you'd be crazy to just blindly go with your big data model.


-jp

.
I think the lecturer mentioned that companies are going the way of the dodo by blindly following what the big data tells them.
Old 07-24-2017, 12:31 PM   #6
Jeff P
Registered User
 
 
Join Date: Dec 2001
Location: JCapper Platinum: Kind of like Deep Blue... but for horses.
Posts: 5,257
That's exactly the point I was trying to make.

If a once-lofty company like Nokia can fall off the map because its managers chose to ignore thick data and were utterly blind to emerging trends in their market space:

What does that say about the horseplayer who ignores thick data?

Or for that matter -- What does that say about track management and horsemen who choose to ignore thick data?

See the horse racing slowly dying in SoCal thread.


-jp

.
__________________
Team JCapper: 2011 PAIHL Regular Season ROI Leader after 15 weeks
www.JCapper.com
Old 07-24-2017, 01:39 PM   #7
ReplayRandall
Buckle Up
 
 
Join Date: Apr 2014
Posts: 10,614
Quote:
Originally Posted by Jeff P View Post
What does that say about the horseplayer who ignores thick data?
Defining what "thick data" is to the horseplayer is a subjective undertaking, to say the least. As examples of thick data: what are the public's betting tendencies as we get toward the middle of the card at a specific track? At the beginning, at the end? Which pools are affected the most to extract value? The least? How do you gather this info from the players' perspective? Is it based on whether a lot of chalk has been winning, medium prices, or bombs as we go through the card, or on viewing yesterday's charts/replays? Or is it based on perceived biases on the dirt, turf, routes, or sprints as the card progresses?.....The list is quite long and very subjective for establishing "what is good thick data", versus mediocre data.....Lastly, what percentage of "blend" do you give big data when combined with thick data for optimal results/profits?
Old 07-26-2017, 11:34 AM   #8
Jeff P
Registered User
 
 
Join Date: Dec 2001
Location: JCapper Platinum: Kind of like Deep Blue... but for horses.
Posts: 5,257
Imo, valid questions -- every one of them.

But the last one is of particular interest to me:
Quote:
Originally Posted by ReplayRandall View Post
.....Lastly, what percentage of "blend" do you give big data when combined with thick data for optimal results/profits?
You mentioned something that I think is a valid point:
Quote:
Originally Posted by ReplayRandall View Post
.....The list is quite long and very subjective for establishing "what is good thick data", versus mediocre data.
One approach that seems to be working (for me) has been to get both big data and thick data into a data set.

And from there run a statistical analysis (MLR, TDA, what have you) on the intersection of big data and thick data.

If you've made a valid thick data observation, your stat analysis should suggest that incremental improvement can be had by adding the thick data observation to an existing big data model.
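That intersection check can be sketched minimally (hypothetical flags and toy data; a plain before/after ROI comparison stands in for the MLR/TDA analysis):

```python
def incremental_lift(rows):
    """Compare ROI of the big-data angle alone vs. the subset of those
    plays that also carries a thick-data flag.

    rows: dicts with 'big' (qualifies on the big-data angle), 'thick'
    (a thick-data observation was made), 'ret' (return per $1 bet).
    """
    big = [r["ret"] for r in rows if r["big"]]
    both = [r["ret"] for r in rows if r["big"] and r["thick"]]
    roi_big = sum(big) / len(big)
    roi_both = sum(both) / len(both) if both else float("nan")
    return roi_big, roi_both

# Toy data: the thick-data flag happens to mark the plays that paid off.
rows = [
    {"big": True, "thick": True,  "ret": 2.0},
    {"big": True, "thick": True,  "ret": 0.0},
    {"big": True, "thick": False, "ret": 0.0},
    {"big": True, "thick": False, "ret": 0.0},
]
roi_big, roi_both = incremental_lift(rows)
```

If `roi_both` clearly beats `roi_big` over a meaningful sample, the thick-data observation is earning its place in the model.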


-jp

.
__________________
Team JCapper: 2011 PAIHL Regular Season ROI Leader after 15 weeks
www.JCapper.com

Last edited by Jeff P; 07-26-2017 at 11:49 AM.
Old 07-26-2017, 12:07 PM   #9
ReplayRandall
Buckle Up
 
 
Join Date: Apr 2014
Posts: 10,614
Quote:
Originally Posted by Jeff P View Post
Imo, valid questions -- every one of them.

But the last one is of particular interest to me:

One approach that seems to be working (for me) has been to get both big data and thick data in a data set.

And from there run a statistical analysis on the intersection of big data and thick data.



-jp

.
I use converging/intersection points which recur, as there is more than just one "intersection" to my analysis....BTW, each and every track has its own unique data stats and betting mentality (thick data), thus there is NO universal format that works across all venues/circuits.....Except for one, which works in tourneys only; that's what 3.5 years of hit and miss will finally get you, but the end result was worth the time invested.
Old 07-26-2017, 12:26 PM   #10
DeltaLover
Registered user
 
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
I cannot see how horse racing can be approached using big data. On the contrary, I think the related data do not qualify either by size or by type. Using a single modern computer we can easily load millions of races in memory (covering many years' worth of complete data), represented in a structured format that can be processed as such.
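A back-of-envelope sizing, with assumed per-record figures, supports the point that the data set is modest by big-data standards:

```python
# Rough sizing check: do millions of races fit in one machine's RAM?
BYTES_PER_START = 200        # assumed compact record per runner
RUNNERS_PER_RACE = 9         # assumed average field size
RACES = 2_000_000            # roughly many years of North American racing

total_gb = RACES * RUNNERS_PER_RACE * BYTES_PER_START / 1e9
print(f"~{total_gb:.1f} GB for {RACES:,} races")  # a few GB, not "big data"
```

Even with generous padding for workouts and chart comments, the whole history fits in commodity RAM.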
__________________
whereof one cannot speak thereof one must be silent
Ludwig Wittgenstein
Old 07-26-2017, 12:44 PM   #11
ReplayRandall
Buckle Up
 
 
Join Date: Apr 2014
Posts: 10,614
Quote:
Originally Posted by DeltaLover View Post
I cannot see how horse racing can be approached using big data. On the contrary, I think the related data do not qualify either by size or by type. Using a single modern computer we can easily load millions of races in memory (covering many years' worth of complete data), represented in a structured format that can be processed as such.
IMO, simply stated: there is NO EDGE left in structured, formatted data processing/analysis; it's been picked clean....You must go outside the box, using creative contrarian concepts to find an edge. There are exceptions, but the actual number of plays is so limited and subject to variance droughts that it's just not worth the time invested.
Old 07-26-2017, 12:53 PM   #12
DeltaLover
Registered user
 
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
Quote:
Originally Posted by ReplayRandall View Post
IMO, simply stated: there is NO EDGE left in structured, formatted data processing/analysis; it's been picked clean....You must go outside the box, using creative contrarian concepts to find an edge. There are exceptions, but the actual number of plays is so limited and subject to variance droughts that it's just not worth the time invested.
What you are saying here is correct, although I have the following questions:

- Why is it not possible to create "contrarian concepts" (I like the term!) based on the existing data? After all, these are the data that dictate the formation of the pools, and they must be responsible for the existence of betting inefficiencies.

- What is the source of the (potentially unstructured) data to use? Are they the product of web searches (including social data like Twitter or FB, for example), or do they require custom collection, meaning dedicated on-site observers?
__________________
whereof one cannot speak thereof one must be silent
Ludwig Wittgenstein
Old 07-26-2017, 01:16 PM   #13
ReplayRandall
Buckle Up
 
 
Join Date: Apr 2014
Posts: 10,614
Quote:
Originally Posted by DeltaLover View Post
What you are saying here is correct, although I have the following questions:

- Why is it not possible to create "contrarian concepts" (I like the term!) based on the existing data? After all, these are the data that dictate the formation of the pools, and they must be responsible for the existence of betting inefficiencies.

- What is the source of the (potentially unstructured) data to use? Are they the product of web searches (including social data like Twitter or FB, for example), or do they require custom collection, meaning dedicated on-site observers?
Here's an example of using big data sets at a slightly losing ROI of 93-95%. If this specific data set has consistently shown these numbers for the last 3 years, I look to see how they are doing after 100 plays. If they are severely under-performing, say at a 60% rate of return, I will have the confidence based on the data to bet these specific sets HARD, until they return close to their mean performance, like an under-valued stock that has a great balance sheet, good fundamentals, product line, but for some unknown reason has fallen out of favor with the market crowd......A contrarian concept using data which most operators throw away for lack of a +ROI, but is consistent at 93-95% as they come...$$$
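A minimal sketch of that contrarian trigger (thresholds and numbers are illustrative, not the actual ones):

```python
def reversion_signal(returns, window=100, trigger=0.60, long_run=0.94):
    """Flag an angle whose trailing ROI has fallen far below its
    historically consistent long-run ROI -- the contrarian entry point.

    returns: per-$1 returns of the angle's plays, oldest first.
    Returns (signal, trailing_roi).
    """
    if len(returns) < window:
        return False, None
    trailing = sum(returns[-window:]) / window
    # Only bet hard when the long-run number is healthy but the
    # recent sample has cratered down to the trigger level.
    return (trailing <= trigger < long_run), trailing

# A drought: 80 losers, then 20 winners paying $3 per $1 bet.
sig, trailing = reversion_signal([0.0] * 80 + [3.0] * 20)
```

The analogy to an undervalued stock holds: the signal fires only when the fundamentals (long-run ROI) are intact and the recent price (trailing ROI) has fallen out of favor.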
Old 07-26-2017, 01:22 PM   #14
DeltaLover
Registered user
 
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
Quote:
Originally Posted by ReplayRandall View Post
Here's an example of using big data sets at a slightly losing ROI of 93-95%. If this specific data set has consistently shown these numbers for the last 3 years, I look to see how they are doing after 100 plays. If they are severely under-performing, say at a 60% rate of return, I will have the confidence based on the data to bet these specific sets HARD, until they return close to their mean performance, like an under-valued stock that has a great balance sheet, good fundamentals, product line, but for some unknown reason has fallen out of favor with the market crowd......A contrarian concept using data which most operators throw away for lack of a +ROI, but is consistent at 93-95% as they come...$$$
Great! This is the way to go. Still, this approach has nothing to do with big data, which is the theme of this thread. "Big data" is not (necessarily) about the absolute size of the data but about the processing methodology.
__________________
whereof one cannot speak thereof one must be silent
Ludwig Wittgenstein
Old 07-26-2017, 01:26 PM   #15
ReplayRandall
Buckle Up
 
 
Join Date: Apr 2014
Posts: 10,614
Quote:
Originally Posted by DeltaLover View Post
Great! This is the way to go. Still, this approach has nothing to do with big data, which is the theme of this thread. "Big data" is not (necessarily) about the absolute size of the data but about the processing methodology.
I know, but this subject is basically dead to me, while thick data is still alive and well; I thought I'd just give you something interesting to chew on.