Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board (http://www.paceadvantage.com/forum/index.php)
-   General Handicapping Discussion (http://www.paceadvantage.com/forum/forumdisplay.php?f=2)
-   -   Big Data... Thick Data (http://www.paceadvantage.com/forum/showthread.php?t=139762)

Dan Montilion 07-23-2017 05:11 PM

Big Data... Thick Data
 
https://www.ted.com/talks/tricia_wan..._from_big_data

I found this to be a very informative handicapping presentation, albeit not about handicapping. I look forward to the thoughts of others, if it produces any. Hypothetically, and in the simplest of terms: big data shows best effort if run back in 28 days. Thick data notes the horse was entered on day 27 and the race did not fill, but it is written back on day 30 and fills.

Jeff P 07-23-2017 06:11 PM

Thanks for posting that. (I really enjoyed watching the presentation.)

Ok. Sticking with your days since last raced example...

Suppose, hypothetically, that big data suggests optimal returns occur (could be thousands of parimutuel tickets cashed or thousands of checks for purse money earned) when race day occurs on the 28th day after the most recent start.

It should be obvious that thick data -- if put in the right context -- has the ability to completely overrule whatever observations might have been gleaned from big data.

Big data example: You have a database and can generate large sample stats for horses returning off a 180 day layoff since their most recent start.

Thick data example: You have insight into what transpired during the layoff.

What if you are able to make the thick data observation that a specific horse was turned out to a private farm for six months? And was given steroids and worked vigorously on the half mile track there?

And shows up in the paddock today carrying muscle mass and confidence he didn't have before?

When you are able to connect the dots in a thick data way you'd be crazy to just blindly go with your big data model.


-jp

.

zerosky 07-23-2017 07:04 PM

Interesting lecture. I just wish they would stop using the term 'Big Data'. It's statistics!
I found some good insights on the following pages.
http://psychclassics.yorku.ca/topic.htm

Gamblor 07-24-2017 06:02 AM

So what's better? Big or thick? How about both big AND thick?

acorn54 07-24-2017 07:39 AM

Quote:

Originally Posted by Jeff P (Post 2198856)
Thanks for posting that. (I really enjoyed watching the presentation.)

Ok. Sticking with your days since last raced example...

Suppose, hypothetically, that big data suggests optimal returns occur (could be thousands of parimutuel tickets cashed or thousands of checks for purse money earned) when race day occurs on the 28th day after the most recent start.

It should be obvious that thick data -- if put in the right context -- has the ability to completely overrule whatever observations might have been gleaned from big data.

Big data example: You have a database and can generate large sample stats for horses returning off a 180 day layoff since their most recent start.

Thick data example: You have insight into what transpired during the layoff.

What if you are able to make the thick data observation that a specific horse was turned out to a private farm for six months? And was given steroids and worked vigorously on the half mile track there?

And shows up in the paddock today carrying muscle mass and confidence he didn't have before?

When you are able to connect the dots in a thick data way you'd be crazy to just blindly go with your big data model.


-jp

.

I think the lecturer mentioned the fact that companies are going the way of the dodo bird by blindly following what the big data tells them.

Jeff P 07-24-2017 12:31 PM

That's exactly the point I was trying to make.

If a once lofty company like Nokia can fall off the face of the map because their managers chose to ignore thick data and were utterly blind to emerging trends in their market space:

What does that say about the horseplayer who ignores thick data?

Or for that matter -- What does that say about track management and horsemen who choose to ignore thick data?

See the horse racing slowly dying in SoCal thread.


-jp

.

ReplayRandall 07-24-2017 01:39 PM

Quote:

Originally Posted by Jeff P (Post 2199064)
What does that say about the horseplayer who ignores thick data?

Defining what is "thick data" to the horseplayer is a subjective undertaking, to say the least. As an example of thick data, what are the public's betting tendencies as we get towards the middle of the card at a specific track? At the beginning, at the end? What pools are affected the most to extract value? The least? How do you gather this info from the player's perspective? Is it based on whether a lot of chalk has been winning, medium prices or bombs as we go through the card or viewing yesterday's charts/replays? Or is it based on perceived biases on the dirt, turf, routes or sprints as the card progresses?.....The list is quite long and very subjective for establishing "what is good thick data", versus mediocre data.....Lastly, what percentage of "blend" do you give big data when combined with thick data for optimal results/profits?

Jeff P 07-26-2017 11:34 AM

Imo, valid questions -- every one of them.

But the last one is of particular interest to me:
Quote:

Originally Posted by ReplayRandall (Post 2199088)
.....Lastly, what percentage of "blend" do you give big data when combined with thick data for optimal results/profits?

You mentioned something that I think is a valid point:
Quote:

Originally Posted by ReplayRandall (Post 2199088)
.....The list is quite long and very subjective for establishing "what is good thick data", versus mediocre data.

One approach that seems to be working (for me) has been to get both big data and thick data into a data set.

And from there run a statistical analysis (mlr, tda, what have you) on the intersection of big data and thick data.

If you've made a valid thick data observation: Your stat analysis should suggest that incremental improvement can be had by adding a thick data observation to an existing big data model.
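A minimal sketch of that check, under loudly stated assumptions: the records below are synthetic and invented purely for illustration, `days` stands in for a big data field (days since last start), `flag` stands in for a thick data observation coded as 0/1, and `y` is a hypothetical performance measure. It fits a plain least-squares regression with and without the thick data column and compares hold-out error. This is one simple way to run the comparison, not necessarily the analysis described in the post.

```python
import random

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                f = a[i][col] / a[col][col]
                a[i] = [x - f * p for x, p in zip(a[i], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def sse(rows, y, coef):
    """Sum of squared prediction errors."""
    return sum((t - sum(c * x for c, x in zip(coef, r))) ** 2
               for r, t in zip(rows, y))

# Synthetic records, invented for illustration only (not real racing data).
rng = random.Random(0)
days = [rng.randint(7, 200) for _ in range(300)]
flag = [1 if rng.random() < 0.3 else 0 for _ in range(300)]
# Hypothetical target: change in a performance figure since the last start.
y = [-0.01 * d + 2.0 * f + rng.gauss(0, 0.5) for d, f in zip(days, flag)]

base = [[1.0, d] for d in days]                   # big data only
full = [[1.0, d, f] for d, f in zip(days, flag)]  # big data + thick data flag

# Fit on the first 200 records, score on the remaining 100.
train, test = slice(0, 200), slice(200, 300)
sse_base = sse(base[test], y[test], fit_ols(base[train], y[train]))
sse_full = sse(full[test], y[test], fit_ols(full[train], y[train]))
print(f"hold-out SSE, big data only: {sse_base:.1f}")
print(f"hold-out SSE, big + thick:   {sse_full:.1f}")
```

Scoring on held-out records matters here, because adding any column never makes the in-sample fit worse. A thick data observation that is actually valid should lower the hold-out error as well.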


-jp

.

ReplayRandall 07-26-2017 12:07 PM

Quote:

Originally Posted by Jeff P (Post 2199718)
Imo, valid questions -- every one of them.

But the last one is of particular interest to me:

One approach that seems to be working (for me) has been to get both big data and thick data in a data set.

And from there run a statistical analysis on the intersection of big data and thick data.



-jp

.

I use converging/intersection points which reoccur, as there is more than just one "intersection" to my analysis....BTW, each and every track has its own unique data stats and betting mentality (thick data), thus there is NO universal format that works across all venues/circuits.....Except for one, that works in tourneys only, which is what 3.5 years of hit and miss will finally get you, but the end result was worth the time invested.

DeltaLover 07-26-2017 12:26 PM

I cannot see how horse racing can be approached using Big data. On the contrary, I think that the related data qualify neither by size nor by type. Using a single modern computer we can easily load millions of races in memory (covering many years' worth of complete data) represented in a structured format that can be processed as such.
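A rough back-of-envelope version of that size claim, as a sketch only: the starters per race, fields per starter, and bytes per field below are assumptions picked for illustration, not measurements of any real racing database.

```python
def dataset_bytes(races, starters_per_race, fields_per_starter, bytes_per_field=8):
    """Estimated in-memory size of a fully numeric, structured data set."""
    return races * starters_per_race * fields_per_starter * bytes_per_field

# Assumed shape: 1,000,000 races, 8 starters each, 50 numeric fields per starter,
# each field stored as an 8-byte float.
size = dataset_bytes(1_000_000, 8, 50)
print(f"{size / 1e9:.1f} GB")  # 3.2 GB
```

Under those assumptions the whole set is a few gigabytes, which fits in the RAM of an ordinary desktop machine.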

ReplayRandall 07-26-2017 12:44 PM

Quote:

Originally Posted by DeltaLover (Post 2199728)
I cannot see how horse racing can be approached using Big data. On the contrary, I think that the related data qualify neither by size nor by type. Using a single modern computer we can easily load millions of races in memory (covering many years' worth of complete data) represented in a structured format that can be processed as such.

IMO simply stated, there is NO EDGE left in a structured formatted data process/analysis, it's been picked clean....You must go outside the box, using creative contrarian concepts to find an edge. There are exceptions, but the actual number of plays is so limited and subject to variance droughts, it's just not worth the time invested.

DeltaLover 07-26-2017 12:53 PM

Quote:

Originally Posted by ReplayRandall (Post 2199734)
IMO simply stated, there is NO EDGE left in a structured formatted data process/analysis, it's been picked clean....You must go outside the box, using creative contrarian concepts to find an edge. There are exceptions, but the actual number of plays is so limited and subject to variance droughts, it's just not worth the time invested.

What you are saying here is correct, although I have the following questions:

- Why is it not possible to create "contrarian concepts" (I like the term!) based on the existing data? After all, these are the data that dictate the formation of the pools and they must be responsible for the existence of betting inefficiencies.

- What is the source of the (potentially unstructured) data to use? Are they the product of web search (including social data like Twitter or FB, for example) or do they require custom collection, meaning dedicated on-site observers?

ReplayRandall 07-26-2017 01:16 PM

Quote:

Originally Posted by DeltaLover (Post 2199738)
What you are saying here is correct, although I have the following questions:

- Why is it not possible to create "contrarian concepts" (I like the term!) based on the existing data? After all, these are the data that dictate the formation of the pools and they must be responsible for the existence of betting inefficiencies.

- What is the source of the (potentially unstructured) data to use? Are they the product of web search (including social data like Twitter or FB, for example) or do they require custom collection, meaning dedicated on-site observers?

Here's an example of using big data sets at a slightly losing ROI of 93-95%. If this specific data set has consistently shown these numbers for the last 3 years, I look to see how they are doing after 100 plays. If they are severely under-performing, say at a 60% rate of return, I will have the confidence based on the data to bet these specific sets HARD, until they return close to their mean performance, like an under-valued stock that has a great balance sheet, good fundamentals, product line, but for some unknown reason has fallen out of favor with the market crowd......A contrarian concept using data which most operators throw away for lack of a +ROI, but is consistent at 93-95% as they come...$$$
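The bookkeeping behind that check can be sketched as follows. The function names, the (wagered, returned) record layout, and the sample plays are hypothetical; the 100-play window and the 60% trigger are the figures from the post. The code only measures the recent slump, it does not decide how hard to bet.

```python
def roi(plays):
    """Return on investment as returned / wagered for (wagered, returned) pairs."""
    wagered = sum(w for w, _ in plays)
    returned = sum(r for _, r in plays)
    return returned / wagered

def recent_slump(history, window=100, trigger=0.60):
    """Return (recent ROI, True if at or below trigger) for the last `window` plays."""
    recent = history[-window:]
    if len(recent) < window:
        return None, False  # not enough plays yet to judge
    recent_roi = roi(recent)
    return recent_roi, recent_roi <= trigger

# Made-up sample: 100 plays at $2 each, 14 winners returning $8 apiece.
history = [(2, 8)] * 14 + [(2, 0)] * 86
recent_roi, flagged = recent_slump(history)
print(f"last 100 plays: ROI {recent_roi:.2f}, flagged: {flagged}")
# prints "last 100 plays: ROI 0.56, flagged: True"
```

The flagged result would then be compared against the data set's own 93-95% long-run figure over the previous 3 years.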

DeltaLover 07-26-2017 01:22 PM

Quote:

Originally Posted by ReplayRandall (Post 2199743)
Here's an example of using big data sets at a slightly losing ROI of 93-95%. If this specific data set has consistently shown these numbers for the last 3 years, I look to see how they are doing after 100 plays. If they are severely under-performing, say at a 60% rate of return, I will have the confidence based on the data to bet these specific sets HARD, until they return close to their mean performance, like an under-valued stock that has a great balance sheet, good fundamentals, product line, but for some unknown reason has fallen out of favor with the market crowd......A contrarian concept using data which most operators throw away for lack of a +ROI, but is consistent at 93-95% as they come...$$$

Great! This is the way to go. Still, this approach has nothing to do with big data, which is the theme of this thread. "Big data" is not (necessarily) about the absolute size of the data but about the processing methodology.

ReplayRandall 07-26-2017 01:26 PM

Quote:

Originally Posted by DeltaLover (Post 2199744)
Great! This is the way to go. Still, this approach has nothing to do with big data, which is the theme of this thread. "Big data" is not (necessarily) about the absolute size of the data but about the processing methodology.

I know, but this subject is basically dead to me, while thick data is still alive and well, and I thought I'd just give you something interesting to chew on..;)


All times are GMT -4. The time now is 10:48 AM.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Copyright 1999 - 2023 -- PaceAdvantage.Com -- All Rights Reserved
