Quote:
Originally Posted by mbugg1976
Thanks Traynor
Not really understanding the below
I can use weka but there is that many classifiers it left me slightly confused
You look for X (whatever it may be). You look in all entries, all races to see how many times you find a match for X. Example: jockey with a green cap. Those are matches. Then find how many of those matches were wins--however you define it--usually "won the race" or close, or something similar. Those are the hits.
In 100 races, 300 entries match X. Of those 300, 30 won. The value of X as a factor is 30/300 = 0.10.
If you use that basic qualifier first, it will quickly locate the most significant of the variables.
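To make the "basic qualifier" idea concrete, here is a rough Python sketch that computes hits/matches for several candidate factors and sorts them. Everything in it is made up for illustration: the field names (cap, post, finish), the five sample entries, and the helper name hit_rates are assumptions, not anything from real data.

```python
def hit_rates(entries, factors, is_win):
    """Return {factor_name: (matches, wins, rate)} for each candidate factor."""
    results = {}
    for name, matches_x in factors.items():
        matches = wins = 0
        for entry in entries:
            if matches_x(entry):
                matches += 1
                # A win only counts as a hit if the entry matched X.
                if is_win(entry):
                    wins += 1
        rate = wins / matches if matches else 0.0
        results[name] = (matches, wins, rate)
    return results

# Hypothetical entries; real data would have far more rows and fields.
entries = [
    {"cap": "green", "post": 1, "finish": 1},
    {"cap": "green", "post": 4, "finish": 3},
    {"cap": "red", "post": 1, "finish": 2},
    {"cap": "red", "post": 2, "finish": 1},
    {"cap": "green", "post": 2, "finish": 5},
]

# Each candidate factor is just a test that says "does this entry match X?"
factors = {
    "green cap": lambda e: e["cap"] == "green",
    "post 1": lambda e: e["post"] == 1,
}

rates = hit_rates(entries, factors, is_win=lambda e: e["finish"] == 1)

# Sort factors by hit rate, highest first.
ranked = sorted(rates.items(), key=lambda kv: kv[1][2], reverse=True)
for name, (matches, wins, rate) in ranked:
    print(f"{name}: {wins}/{matches} = {rate:.2f}")
```

With a sample this small the rates mean nothing; the point is only the shape of the calculation. On real data you would also want to look at the match count next to each rate before trusting it.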
Initially, I use a coded "pattern" to pre-qualify. Faster, easier, and simpler than most other approaches. The data is in a long string (for example) separated by commas. Split it at the commas. Identify the element of interest (46). Identify the element you want as "win" (79 or whatever). Loop through the whole data clump, counting how many times element 46 matches X and, of those matches, how many times element 79 is "won the race" or whatever.
If data(46) = something Then
    'increment matches
    matches += 1
    If data(79) = "won" Then
        'increment wins
        wins += 1
    End If
End If
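For anyone more comfortable in Python, the same split-and-count loop might look roughly like the sketch below. The function name count_matches, the two-column sample lines, and the "won" marker are assumptions for illustration; in real data the indexes would be whatever positions hold the factor and the result (46 and 79 in the example above).

```python
def count_matches(lines, factor_idx, target, result_idx, win_value="won"):
    """Count entries where field factor_idx equals target, and wins among them."""
    matches = wins = 0
    for line in lines:
        data = line.strip().split(",")  # split the record at the commas
        if len(data) <= max(factor_idx, result_idx):
            continue  # skip short or broken records instead of crashing
        if data[factor_idx] == target:
            matches += 1
            # Only check the result for entries that matched the factor.
            if data[result_idx] == win_value:
                wins += 1
    return matches, wins

# Hypothetical records: field 0 is the factor, field 1 is the result.
lines = [
    "green,won",
    "green,lost",
    "red,won",
    "green,lost",
    "broken",
]

matches, wins = count_matches(lines, factor_idx=0, target="green", result_idx=1)
value = wins / matches if matches else 0.0
print(matches, wins, round(value, 2))
```

Note that a plain split at commas assumes no field contains a comma inside quotes; if the data has quoted fields, a proper CSV reader would be the safer choice.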
I did a research project a while back that analyzed some insane number of horses and races (every race run in Australia in five years). It was MUCH easier to use code than to try to fix the (horribly broken) database it came in.