What odds board to scrape?


CBedo
07-14-2009, 07:36 PM
I'm tired of listening to whining about late odds drops (usually by me), so I want to do some real research into the phenomenon. The first part of the project is just picking an odds board to scrape. Any recommendations would be helpful.

The main things I'm looking for are:
1. Free
2. Has pool size information (at least for the win pool)

Secondarily, these qualifications would be nice:
3. Doesn't require a login (less programming, no need to handle authentication)
4. Isn't going to freak out that someone is scraping the site (I know it's against the TOS of almost all sites. I'm not going to be killing their servers 100 times per minute, just doing checks in line with the tote cycle).

Any site recommendations or any other tips or tricks would be greatly appreciated. :ThmbUp:
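
For the "checks in line with the tote cycle" part, here is a minimal Python sketch of a polite poller. The 60-second interval is only an assumed tote cycle, and `fetch` is a hypothetical callable that would download and parse whichever board gets picked; neither comes from any particular site.

```python
import time


def poll_board(fetch, interval_s=60.0, max_polls=3,
               sleep=time.sleep, clock=time.time):
    """Call fetch() once per interval and keep timestamped snapshots.

    fetch      -- hypothetical callable returning the parsed pool data
    interval_s -- assumed tote cycle length in seconds (not a known value)
    sleep/clock are injectable so the loop can be tried without waiting.
    """
    snapshots = []
    for i in range(max_polls):
        snapshots.append((clock(), fetch()))
        # Wait between polls, but not after the last one.
        if i < max_polls - 1:
            sleep(interval_s)
    return snapshots
```

One request per cycle per race keeps the load on the site about the same as a person hitting refresh.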

PaceAdvantage
07-15-2009, 12:03 AM
HTML boards are the easiest of course....but I'm not sure of any that don't require a login...

CBedo
07-15-2009, 01:00 AM
Yep, parsing an HTML page is pretty easy (I'm learning), but sometimes the authentication and cookie passing are a little tougher (for my simplistic programming ability anyway). I just thought I'd ask if anyone had any recommendations before I locked into one.
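
On the authentication and cookie passing: a rough sketch of how this can look with only the Python standard library. The form field names `username` and `password` and the login URL are assumptions for illustration; a real board will have its own field names, which can be read from its login form.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that stores and resends cookies automatically."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, username, password):
    """Build a POST request for a login form.

    The field names below are hypothetical; check the site's own form.
    """
    form = urllib.parse.urlencode(
        {"username": username, "password": password}).encode("ascii")
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(login_url, data=form)
```

Usage would be something like `opener, jar = make_session()`, then `opener.open(build_login_request(url, user, pw))` once, and after that `opener.open(board_url)` for each poll. The same opener carries the session cookie along, so no manual cookie handling is needed.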

bcgreg
07-15-2009, 08:44 AM
Have you looked at this:

http://www.homebased2.com/atr/at_the_races.htm

bcgreg

Ted Craven
07-15-2009, 08:59 AM
Plainridge Racecourse Tote (http://12.40.60.194/plainridge/DisplayNavTable.asp?SortField=ResolvedName&TrackType=T) is easier to log in to via software automation and has all the data you need. Smaller op than BRIS SuperTote/TwinSpires, so don't overload them (please...;) )

Ted

Tom Barrister
07-15-2009, 09:13 AM
Have you considered trying the ESPN (YouBet) feed?

Handiman
07-15-2009, 11:05 AM
Any idea what the cost of a direct data feed would be? And/or the feasibility of getting one?


Handi :)

chickenhead
07-15-2009, 11:20 AM
I was thinking about this sort of thing as a larger part of a HANA project a while back.

Why don't we (horseplayers) make a site somewhere that offers a structured interface for this kind of information, so anyone that's interested doesn't have to write their own scraper?

Why don't we have the database housed on that site with a simple interface so anyone that is interested in doing some basic research can? And have the raw database downloadable so anyone interested in more extensive research can?

Tote data is one of the few things that isn't copyrighted.
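
As one possible shape for the shared database idea, here is a small SQLite sketch in Python. The table name, the columns, and the helper functions are all made up for illustration; nothing here reflects an existing HANA project or site.

```python
import sqlite3

# Hypothetical schema: one row per horse per tote snapshot.
SCHEMA = """
CREATE TABLE IF NOT EXISTS win_pool_snapshot (
    track       TEXT    NOT NULL,
    race_date   TEXT    NOT NULL,
    race_no     INTEGER NOT NULL,
    program_no  TEXT    NOT NULL,
    taken_at    TEXT    NOT NULL,
    win_dollars INTEGER NOT NULL,
    PRIMARY KEY (track, race_date, race_no, program_no, taken_at)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_snapshot(conn, track, race_date, race_no, taken_at, win_by_horse):
    """Store one snapshot; win_by_horse maps program number to dollars."""
    rows = [(track, race_date, race_no, prog, taken_at, dollars)
            for prog, dollars in win_by_horse.items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO win_pool_snapshot VALUES (?, ?, ?, ?, ?, ?)", rows)


def pool_total(conn, track, race_date, race_no, taken_at):
    """Total win pool for one race at one snapshot time."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(win_dollars), 0) FROM win_pool_snapshot "
        "WHERE track = ? AND race_date = ? AND race_no = ? AND taken_at = ?",
        (track, race_date, race_no, taken_at))
    return cur.fetchone()[0]
```

A single file like this is easy to offer as a raw download, and a simple web interface could sit on top of the same table.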

rokitman
07-15-2009, 09:51 PM
:ThmbUp: :ThmbUp: :ThmbUp:

CBedo
07-16-2009, 01:26 AM
It's funny some of the things BRIS has done (probably to keep people like me from scraping). I looked at ESPN and TVG, but they don't provide pool totals, just odds (at least from what I saw). I want the actual dollar amounts bet on each horse. I think the Plainridge site would be really easy, but it doesn't seem to have the widest track selection.

I'm working through some of the javascript on one of the ADWs sites right now. I think I've about got it figured out. It's not optimal, but it should work for my purposes.

I noticed that ATR Pro seems to like the Philly Phonebet board, so maybe I'll open an account with them to check it out.
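
On why the dollar amounts matter more than the displayed odds: with the per-horse win amounts you can recompute approximate odds yourself and measure how much they move between two snapshots. A minimal sketch, assuming a 16% takeout (it varies by track) and ignoring breakage:

```python
def win_odds(win_by_horse, takeout=0.16):
    """Approximate odds-to-1 for each horse from win pool dollars.

    takeout is an assumed rate; breakage (rounding of payouts) is ignored.
    """
    total = sum(win_by_horse.values())
    net = total * (1.0 - takeout)  # pool left after the track's cut
    return {horse: (net - dollars) / dollars
            for horse, dollars in win_by_horse.items() if dollars > 0}


def odds_change(before, after, takeout=0.16):
    """Odds in the later snapshot minus odds in the earlier one.

    A negative value means the horse's odds dropped between snapshots.
    """
    odds_before = win_odds(before, takeout)
    odds_after = win_odds(after, takeout)
    return {horse: odds_after[horse] - odds_before[horse]
            for horse in odds_after if horse in odds_before}
```

For example, comparing the last snapshot before the gate opens with the final pool would give the size of the late drop for each horse, in dollars bet as well as in odds.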

GameTheory
07-16-2009, 01:33 AM
Phonebet/Racing Channel is by far the easiest and most reliable to parse of the HTML toteboards, but you do need an account and you need to be logged in. Plus they never seem to change anything. I've been parsing that one for many years -- only had to change it once when they moved to PHP pages (page format stayed the same).
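
For anyone new to parsing an HTML toteboard, here is a rough Python sketch using only the standard library. The sample table layout (program number, odds, win dollars) is invented for illustration and is not the actual Phonebet page format.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_win_pool(html):
    """Map program number to win dollars, skipping the header row."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    pool = {}
    for row in parser.rows[1:]:
        if len(row) >= 3:
            pool[row[0]] = int(row[2].replace("$", "").replace(",", ""))
    return pool


# Hypothetical page fragment, not taken from any real board.
SAMPLE = """<table>
<tr><th>#</th><th>Odds</th><th>Win</th></tr>
<tr><td>1</td><td>5/2</td><td>$2,000</td></tr>
<tr><td>2</td><td>8</td><td>$900</td></tr>
</table>"""
```

Calling `parse_win_pool(SAMPLE)` returns the win dollars keyed by program number. A page whose format never changes, as described above, is what makes a simple parser like this hold up for years.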