Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

08-09-2020, 01:07 PM   #13
Jeff P
Registered User
 
Join Date: Dec 2001
Location: JCapper Platinum: Kind of like Deep Blue... but for horses.
Posts: 5,291
Quote:
Originally Posted by Tom
How do you guys open the XML file so it is readable?
I tried several things Google suggested; none worked.
Opening with a browser shows all the non-text elements.

Maybe my browsers are all too old?

I started parsing the xml in 2009 when it first came out. Back then I was using the (then) latest version of Microsoft's xml parser.

I quickly discovered Microsoft's xml parser was using way too much RAM - approx 1.2 gigabytes of the 4.0 gigabytes on the 2007 machine I had at the time.

To me, this seemed ridiculous. At the end of the day, after all changes had been added, the xml files themselves generally averaged about 200 KB in size... and early in the day, when only east coast tracks were running, an xml file might be only 70 KB in size.

So I wrote my own xml parser. (Of course it helps that I have a background as a developer.)

That said, there are many different xml parsers out there and (Imo) just about all of them can get the job done.
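
For anyone who just wants to look at the file in a readable form, here is a minimal sketch, assuming Python 3 and only its standard library (xml.dom.minidom). It re-indents the xml so each tag sits on its own line. The function name pretty_xml and the tiny sample string are made up for illustration; the tag names in the sample are the ones described further down in this post.

```python
import xml.dom.minidom

# Tiny made-up sample; a real changes file is much larger.
SAMPLE = (
    '<late_changes><race_date>08/09/2009</race_date>'
    '<track country="USA" id="CNL" track_name="COLONIAL DOWNS"></track>'
    '</late_changes>'
)


def pretty_xml(xml_text: str) -> str:
    """Return the xml re-indented so it is easier to read."""
    document = xml.dom.minidom.parseString(xml_text)
    return document.toprettyxml(indent="  ")


pretty = pretty_xml(SAMPLE)
print(pretty)
```

To use it on a real file, read the file into a string first (for example with open(path, encoding="utf-8").read(), where path is wherever you saved the download) and pass that string to pretty_xml.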

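In its simplest form xml is a standardized text format designed to deliver data wrapped inside of tags. (An opening tag, its matching closing tag, and everything in between are together called a node, or more formally an element.)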

The tags or nodes in the file follow a certain order.

At the very top of the document you'll find a tag that looks like this: "<late_changes>" (without the quotes.)

At the very bottom of the document you'll find a closing tag that looks like this: "</late_changes>" (without the quotes.)

Everything between the two tags (the late_changes node) contains data for late changes.

The next row in today's xml file looks like this: "<race_date>08/09/2009</race_date>" (without the quotes.)

As you might guess, the string text between the "<race_date>" and "</race_date>" tags (that is, the contents of the race_date node) is the date for all of the changes data found in the current file.

The xml parser that I wrote simply scans the file and reads string text contained between predefined tags (or data contained in predefined nodes.)
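
The idea of reading the text between a predefined pair of tags can be sketched in a few lines. This is only an illustration in Python, not the actual JCapper code; the function name text_between is made up, and it assumes the opening tag carries no attributes (true for race_date as quoted above).

```python
def text_between(xml_text, tag, start=0):
    """Return the text between <tag> and </tag>, searching from `start`.

    Returns None when either tag is missing. Only handles opening tags
    that have no attributes.
    """
    open_tag = f"<{tag}>"
    close_tag = f"</{tag}>"
    begin = xml_text.find(open_tag, start)
    if begin == -1:
        return None
    begin += len(open_tag)  # skip past the opening tag itself
    end = xml_text.find(close_tag, begin)
    if end == -1:
        return None
    return xml_text[begin:end]


sample = "<late_changes>\n<race_date>08/09/2009</race_date>\n</late_changes>"
race_date = text_between(sample, "race_date")
print(race_date)
```

Because it only walks the string with find(), a scan like this needs little memory beyond the file contents themselves, which for a file of a couple hundred KB is tiny.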

Later in the day a track employee or someone working for Equibase will add similar rows for both Albuquerque and Arlington Park... but right now as I type this, the next row in the file looks like this: "<track country="USA" id="CNL" track_name="COLONIAL DOWNS">"

This is the opening track tag for Colonial Downs.

Several rows further down in the file you'll find a closing track tag for Colonial Downs that looks like this: "</track>" (without the quotes.)

Everything between each opening track tag and closing track tag (or within each track node) contains changes data for that track:

Course changes, distance changes, track condition changes, temp rail changes, scratches, rider overweights, horse weights, rider changes, reported first-time geldings, etc.
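
A rough sketch of pulling out each track node along with the attributes on its opening tag. Again this is just an illustration in Python, not the JCapper parser. The function name iter_tracks is made up, and so is the second track in the sample ("EXAMPLE PARK", id "XXX") and the placeholder text inside each node; only the Colonial Downs opening tag is copied from the file.

```python
import re

# Matches an opening <track ...> tag and captures its attribute text.
# The \b keeps it from matching a longer tag name that starts with "track".
TRACK_OPEN = re.compile(r"<track\b([^>]*)>")
# Matches name="value" pairs inside the opening tag.
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def iter_tracks(xml_text):
    """Yield (attributes, inner_text) for each <track ...>...</track> node."""
    pos = 0
    while True:
        match = TRACK_OPEN.search(xml_text, pos)
        if match is None:
            return
        end = xml_text.find("</track>", match.end())
        if end == -1:
            return  # opening tag without a closing tag: stop scanning
        attrs = dict(ATTRIBUTE.findall(match.group(1)))
        yield attrs, xml_text[match.end():end]
        pos = end + len("</track>")


SAMPLE = (
    '<late_changes>\n'
    '<race_date>08/09/2009</race_date>\n'
    '<track country="USA" id="CNL" track_name="COLONIAL DOWNS">\n'
    'changes for Colonial Downs go here\n'
    '</track>\n'
    '<track country="USA" id="XXX" track_name="EXAMPLE PARK">\n'
    'changes for the made-up track go here\n'
    '</track>\n'
    '</late_changes>'
)

tracks = list(iter_tracks(SAMPLE))
for attrs, inner in tracks:
    print(attrs["id"], attrs["track_name"], "->", inner.strip())
```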

The xml parser that I wrote simply scans the predefined nodes in the file one track at a time and one race at a time, reading the data contained between each pair of predefined tags (or within each node). It then writes the data read from each node to a database, where it can be used to generate a changes report and/or for live play.

I hope I managed to type most of that out in a way that makes sense.
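
The track-by-track, race-by-race loop that ends in a database can be sketched as below. Big caveats: this uses Python's standard-library parser (xml.etree.ElementTree) and SQLite rather than a hand-written scanner, and the tags inside each track node are not spelled out in this post, so the "race" and "change" tags, their attributes, the table layout, and the function name load_changes are all hypothetical stand-ins. Treat it as the shape of the idea, not the real file layout.

```python
import sqlite3
import xml.etree.ElementTree as ET

# HYPOTHETICAL layout: <race> and <change> are invented for this sketch.
# Only <late_changes>, <race_date>, and the <track> tag come from the post.
SAMPLE = """<late_changes>
<race_date>08/09/2009</race_date>
<track country="USA" id="CNL" track_name="COLONIAL DOWNS">
<race number="1"><change type="scratch">HORSE A</change></race>
<race number="2"><change type="track_condition">GOOD</change></race>
</track>
</late_changes>"""


def load_changes(xml_text, conn):
    """Walk the file one track and one race at a time; store each change."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes ("
        "race_date TEXT, track_id TEXT, race_number INTEGER, "
        "change_type TEXT, detail TEXT)"
    )
    root = ET.fromstring(xml_text)
    race_date = root.findtext("race_date")
    count = 0
    for track in root.iter("track"):
        for race in track.iter("race"):
            for change in race.iter("change"):
                conn.execute(
                    "INSERT INTO changes VALUES (?, ?, ?, ?, ?)",
                    (
                        race_date,
                        track.get("id"),
                        int(race.get("number")),
                        change.get("type"),
                        (change.text or "").strip(),
                    ),
                )
                count += 1
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")  # in-memory database, just for the demo
inserted = load_changes(SAMPLE, conn)
rows = conn.execute(
    "SELECT track_id, race_number, change_type, detail "
    "FROM changes ORDER BY race_number"
).fetchall()
print(inserted, rows)
```

Once the rows are in a table, a changes report is just a query against it.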


-jp

__________________
Team JCapper: 2011 PAIHL Regular Season ROI Leader after 15 weeks
www.JCapper.com

Last edited by Jeff P; 08-09-2020 at 01:22 PM.