PDA

View Full Version : Database help please


Mark
04-16-2002, 10:31 AM
Access help.

Hope some access expert can help me out on this . I know this is pretty basic, but I thought I would start out by finding out how to do avg earning per start.

I want to end up with 5 columns of data
1. HORSE NAME
2. TOTAL NUMBER OF RUNS
3. TOTAL NUMBER OF WINS
4. TOTAL PRIZEMONEY FROM WINS
5. EARNINGS PER START (#2/#4)

I then want to export this to excel, which I am more familiar with.. I will then use vlookup to retrieve these 5 columns and match it up with the entries list which I have pasted into the spreadsheet.

I have a simple database in Access 2000 of 100,000 records (10,000 races) across three tables
Table 1 has the horse names, the finishing position and post-race odds (and unique racenumber)
[ this table lists all the races a horse has had, so horse A may have 20 entries]
Table 2 has the race details – race track, race value to the winner, (and unique racenumber)
Table 3 has horse details, trainer, sire, etc.

I started out by trying to work out how may times a horse a in table 3 appears in table 1, then adding up the number of 1’s in the finishing postions where there is a match, then adding up the money from these wins and finally dividing one by the other.

I have read the help files until my eyes started bleeding, so before I go blind I hope one of you guys can point me in the right direction. Do I use an SQL command or???????

Thanks.

hurrikane
04-16-2002, 11:22 AM
Mark, not exactly sure what your question is? Can you be a little more descriptive. Are you having problems joining tables, doing the calcs. ??

Mark
04-16-2002, 11:58 AM
Sure thing and thanks for your interest. You're right, its the calcs I'm struggling with. I have joined the tables by unique race number and horse name. So, when I go into table 3 there is a + sign on the far left next to the horse name. If I click on it, it shows all its runs.

I'm not sure about the various join types when linking the tables though. Perhaps this needs altering.

I don't know whether I can have the calculated fields (#1-5 above) in a new table 4(ideal) or whether I can only do this via a seperate query.

I am having no luck with the calcualtions

FortuneHunter
04-16-2002, 01:41 PM
First of all, make a backup copy of your Access database. So when you try different things you can always recover. Never go more that 4 - 8 hours of "development" without a back up.

Of the three tables sounds like one Table 2 has a record per race, Table 1 has a record per race entry, and Table 3 has a record per horse.

Your Primary Element of interest is by horse "past race" line.

By using Primary Keys you can "relate" or link all three tables as follows:

Table 1 Primary Keys: Horse, Date, Track, Race Number
Table 2 Primary Keys: Date, Track, Race Number
Table 3 Primary Key: Horse

Relationships
Table 1 (Date, Track, Race Number) to Table 2 is Many to One
Table 1 (Horse) to Table 2 is Many to One

There is a window under Tools to develop Primary Keys and relationships.

Now you have made the tables in the database relational.

Now you can do a series of Queries using the Access Query window.

Query A: Horses Name (Table 1), Date, Track, Race (Table 1), Finish (Table 1), Race Value (Table 2). This query gives a record for each "past race" line in the DB where the Horse Name is key and the data across all 3 tables are related by Horse, Date, Track, Race Number, your primary keys.

The result of Query A is a dynaset (i.e. a table that is created dynamically and not stored) that contains 1 record for each entry in each race, which you said would be 100,000 records.

If you pass Query A a horses name as the criteria the result will be a record for each race the horse ran, that includes, Date, Track, Race, Purse Structure.

If you pass Query A a horses name and you set the finish critera to 1, the result will be a record for each race the horse ran, that includes, Date, Track, Race, Purse Structure for races this horse won only

If you pass Query A a horses name and you set the finish critera to 1, and set the track to Saratoga, the result will be a record for each race the horse ran, that includes, Date, Track, Race, Purse Structure for races this horse won at Saratoga only


I will stop here. IMO, the key is database relational architecture which I tried to explain. If you understand what I am saying above and you can create Query A above, Making Query Z which is a combination of Query A, B, C, D is easy and will give you what you want. That will be lesson 2.

Feel free to ask any questions will be happy to answer.

FH

Mark
04-17-2002, 01:49 PM
Thanks FortuneHunter, and sorry for the delay in replying.

I have now linked the three tables.

However, TABLE1 doesn't have date or track fields.

Only the unique RACENUMBER field can link TABLE1 AND TABLE2.
an example is 20020195, rather than the racenumber on the card.

TABLE 2 AND TABLE 3 can only be linked via HORSENAME.


If it is relevant, the join type is '1' and the relationship type is ONE-TO-MANY.

Sorry to be slow, but can you tell me if I have done the query correctly.

1) clicked on query down the left hand side and chose create query in design view.

2) Across the top in fields, I selected HORSE NAME, RACENUMBER, FINISHING POSITION AND RACEVALUE for each of the columns in the query.

3) I ran the query, and now shows a new table of four columns, NAME, RACENUMBER, FINISHING POSTION AND RACEVALUE.


From what I understand if I were to edit the query, entering "1" as the criteria in FINISHING POSITION and run the query again it will only show the winners.

I would now like to progress to your lesson 2, where I would like you to show me if I can make calculated fields to counts the runs, wins, earnings from wins, and finally EPS.

Thanks again for your help so far.

FortuneHunter
04-18-2002, 07:17 AM
Mark, sounds good. When you open each table, are the Primary Keys defined?

How do you "parse" the Race Number code into Date, Track, Race Number so you know what race it is?

You are correct in that you can manually enter "criteria" in any of the columns in the query design view and "rerun" the query.

However, in the end, the ideal thing would be to "import" the the Racecard Entry file for today/tomorrow and have the program generate a report automatically. Manual entry of critera is only good for testing/development. You mentioned Excel as a means to do this. That is OK, but out of my field of expertise.

Lesson 2:

Now you have made Query A. Next is Query B.
In the Query Window, select new, find duplicates query wizard. In the wizard window view box select Query and select the query, Query A, that we made in Lesson 1.

If you select Horse Name as the target, Query B will show you how many duplicates of the Hosre Name from Table 1. This is value is the "total number of runs".

You see, you can make Queries from Queries.

Make Query C which which is identical to Query A, except put a 1 in finish position column critera.

Make Query D which is a "find duplicate" of Query C. This gives you the "total number of Wins".

If you put a horse name in Q A and Q C (try Blue Burner) then you will have stats(# runs/#wins) for a particular horse.

Out of time. A few more Q's to go and a Report. The final one Q ties it together. It will prompt you for the horse name and the rest is magic.

Finally, if we import a RaceCard and write some macros, Access will generate the report automatically by race for each entry where each entry is listed by earings per start from highest to lowest.

Is that where you want to go?

To see samples of MS Access reports you can go to my Website and look around:

www.handicappers-datamine.com

FH

Mark
04-18-2002, 10:31 AM
Thanks very much. This is going exactly where I want!

One point on the primary keys. I have defined 'racenumber' and 'horsename' as primary keys in table2 and table3 respectively by opening the table in design view, highlighting the row with the field name and clicking on the primary key icon.

I did try to do the same on table1 for racenumber and name but it said

"The changes you requested to the table were not successful because they would create duplicate values in the index, primary key or relationship. Change the data in the field or fields that contain duplicate data, remove the index, or redefine the index to permit duplicate entries and try again."

Presumably I was trying to define a primary key twice. So, since the tables are already linked, I assume everything is OK and I can carry on.

.>>How do you "parse" the Race Number code into Date, Track, Race Number so you know what race it is?

I don't really. The racenumber is simply the year [2002] and the cumulative racenumber for the year when all tracks are sorted alphabetically.

eg Jan 1st
Aqueduct 10 race card 200200001 to 200200010
GGate 13race card 200200011 to 200200023
etc

Only table2 has the racedate,track name, distance,etc.

Hope that answers your question.

All the queries seem to be Ok, after double checking some manually. Just to confirm what we have done so far:

QUERY A = NAME_RACENO_FINP_RVALUE
QUERY B = NAME_NO OF RUNS
QUERY C = NAME_RACENO_FINP (=1)_RVALUE
QUERY D = NAME_NO OF WINS

When drawing these together and recording the macros, it may help you for me to mention that I will continue to obtain the entries via an excel spreadsheet.

I do not know whether I will need to first import the file manually or whether Access can read the data automatically from the spreadsheet. But I do anticipate that is would be best to always save the days entries spreadsheet with a standard name eg ENTRIESTODAY.XLS

I would also like to export the final report in excel format, so I can mess around with it. Again, I don't know if this export process can be included in the macro.

I look forward to your next lesson.

ranchwest
04-18-2002, 12:37 PM
I would have used the Track, Date and Race to comprise the key, not create an aritificial key. With your artificial number for the cards, your key doesn't allow easy views based on track, date or race. Just a thought.

Mark
04-18-2002, 01:37 PM
You're probably right ranchwest, but my computer is slow so I was trying to minimize the size of the tables.

rrbauer
04-18-2002, 04:12 PM
It's OK to have a slow computer so long as you bet on fast horses!

:)

FortuneHunter
04-19-2002, 01:45 PM
Mark, I didn't forget you. Fridays are very busy.
I agree with RW. Going with Date, Track, Race# is better for Primary Keys. Much more “debug” friendly. If you continue to fool with databases in horseracing you will get a new PC, no doubt. 800Mhz PIII, 128 Mbytes Ram, 20G Hdrive, CD-RW (a must!), 19” Monitor, Wheel Mouse for under $1,000. So, speed should not be a problem.

You wrote:
“I did try to do the same on table1 for racenumber and name but it said

"The changes you requested to the table were not successful because they would create duplicate values in the index, primary key or relationship. Change the data in the field or fields that contain duplicate data, remove the index, or redefine the index to permit duplicate entries and try again."

Presumably I was trying to define a primary key twice. So, since the tables are already linked, I assume everything is OK and I can carry on.”

In Table 1 design view, put the cursor on racenumber. Down below, change the Index parameter to “Yes(Duplicates OK)” from the pulldown. You should now be able to define it as a Primary Key.

Lesson 3, it is a short one.

Create Query E from Query C. In Query E design view on the tool bar select “Totals” it is the Greek Character for Sum (looks like the letter E on acid).

When you do this, a new row appears in the Query grid called “Total”.

If Query C has a horse name specified then, you have a row for each race the horse won with the purse value in Query C. In Query E, make a column with Purse Value and select “sum” from the pulldown menu in the Totals row.

When you run the Q you should get the Sum of money earned for the horse specified in Query C.

If you create another column in Query E with Purse Totals and use “avg” in Totals Row you will get Average earnings per win.

There will be 1 more Q that ties it all together.

Let me know how you make out, FH

Mark
04-20-2002, 12:24 AM
change the Index parameter to “Yes(Duplicates OK)” from the pulldown. You should now be able to define it as a Primary Key.

When I checked, it had that option already. But when I just tried to define both rows as primary keys at the same time no error message was displayed (if I try to do them one at a time I do)

Lesson 3

Create Query E from Query C. In Query E design view on the tool bar select “Totals” it is the Greek Character for Sum (looks like the letter E on acid).

When you do this, a new row appears in the Query grid called “Total”.

If Query C has a horse name specified then, you have a row for each race the horse won with the purse value in Query C.

After this is when I start running into trouble. I've double checked all the instructions you have given me in all your posts so far to see if I've slipped up somewhere. For the sake of clarity I'll replicate my entries in queryE below:

FIELD= NAME=RACENO=FPOS= RVAL= RVAL= RVAL
TABLE=TBLE1=TBLE1= TBLE1=TBLE2=TBLE2=TBLE2
TOTAL=grpby=grpby= grpby=grpby= Sum= Avg
SORT= BLANK=BLANK= BLANK=BLANK=BLANK=BLANK=
SHOW=TICK=TICK=TICK=TICK=TICK=TICK

CRITERIA "SLOWHORSE #56"

The differences in what you expect the result to be and what I'm getting are:

1) If I specify a horse in Q C, and run Q C, only that horse will be displayed as I expect. If I then exit that query (saving changes) and run query E, you suggest that it should take the horse specified in Q C and isolate that same record in Q E, even if no horse is specified in Q E. This is not the case. In actual fact, Q E continues to display all horses. Since Q E and Q C aren't linked, I can't see any logical reason why it should be the case. Perhaps I've misunderstood your post?

2) The Sum and Avg columns seem to have had no effect. If I do specify a horse in Q E , it continues to list all a horse's wins on separate rows. Also the race value,sum and avg are ALWAYS the same value.

3) "If you create another column in Query E with Purse Totals and use “avg” in Totals Row you will get Average earnings per win."

Again, if I'm jumping the gun here, please shoot me! But I can't figure out how finding averaging earnings per win can help me calculate average earning per start.

4) When you say "Create Query E from Query C" do you mean use the same fields or use the Q C output as Q E input. I have tried both and it doesn't appear to make any difference to the sum and avg calculations.


Sorry the above is wordy, and thanks again. Oh, and a new computer is a while away - the wife is too busy spending my regular income. Any money will have to come from my horse wins :)

Mark
04-20-2002, 12:53 AM
Hang on a minute, just been messing around with it, and if I use the Query C inputs and delete the fields racenumber, race value (group by) and finishing position, then it will display the total money from wins and avg win. But I still don't see the significance of avg win though!

FortuneHunter
04-22-2002, 08:47 PM
Hi Mark,
Sounds like you got Query E to work after all.

Back to Primary Keys. You should be able to define multiple primary keys in a table. For example, Date, Track, Race are the primary keys in my tables. I am at a loss here.

If you have Queries A thru E working, then we should be able to make Query F.

Query F is made up of:
Query B, which should be Number of Starts
Query E, which should be Sum of Earning

In Query F design view; “connect” the Name Field by pointing to the name in Query B and click and drag to the name in Query E. When you are finished, there should be a line connecting the name fields in Query B, and Query E.

Now create a columns Name(Q B), NumberOfStarts(Q B), SumOfEarnings(Q E).

The final column is an expression (i.e. calculation). Type this in the field row.

EarningsPerStart: SumOfEarnings/NumberofStarts

This should divide the earnings by starts by making an expression from the other columns.

Let me know how it goes.

FH

Mark
04-24-2002, 11:02 AM
The query worked. All the values are spot on. How long it took is unknown. The progress bar moved all the way across within 2mins, but hung there for at least another hour. I left it and went to bed. It was done when I woke up. Strange because the other queries didn't take long.

Maybe that computer upgrade is more urgent than I thought! I'm running a 400mhz p2 with 128ram. Do you think I would notice if I added more ram or is it a waste of money? Also I have a p3 450 on my other machine and could swap over if you think its worthwhile.

However, I am hopeful that when you show me how to match it up with only the days entries, things will speed up.

Look forward to your next lesson!

FortuneHunter
04-24-2002, 02:38 PM
That is way too long. Your machine is slow compared to today's standard, but not that slow. It is something else.

What Operating System and what version of Access are you using?

A couple of things to try:

Compact and Repair Database:
Open an explore window navigate to your database and wrie down the current size.
Open database, On Toolbar, Click Tools, Database Utilites, Compact and Repair Database. This may take some time.
Check database size and let me know if there was a change.
You should Compact and Repair often

Indexing
In each table (design view) for each field there is a property called" index". For fields such as Horse Name and Race ID, which are used for as primary search fields, index should be Yes. Allow duplicates where appropriate.

For a better explaination of index, put cursor in the index field and press "F1" key. A help window should pop up with description of "index".
Access help is pretty good, just put the cursor on anything in question and press F1.

We have to get that query down to 15 seconds or less. Time Query B and E. Q F should not take much longer than the sum of B and E

Let me know, FH

FortuneHunter
04-24-2002, 02:42 PM
Don't change hardware yet, try those things I mentioned.

More Ram is always good, more important then processor speed IMO. Lets wait and see.

FH

Mark
04-25-2002, 07:18 PM
I've tried everything I can think of, but no noticeable increase in performance. I will now tell you what I've done in exhaustive detail, just in case its relevant.

I made a separate database and stripped everything out except those fields needed for the query. I then compacted and repaired. This got it down to 30mb in size.

I've re-written all the queries from the top, and threw in an extra 128mb of RAM from my other computer and then defragged.

I would like to specify the primary keys situation. Table 1 - I am able to define both Racenumber and horse name as primary keys. In the index they both say "yes, duplicates ok". Table 2 and 3 also have the fields defined as keys, but I can't have the setting as anything other than "Yes, (no duplicates)" regardless of what order I define the keys across the 3 tables.

All the total columns are defined as 'group by' except those using sum and avg for working out the race value calcs. and eps is defined as an expression

Maybe the stickin point is constructing the EPS calc incorrectly? In Query F, I have entered :

EPS: [SumOfRACEVALUE]/[#runs]

SumOfRACEVALUE is defined in Q E as:
field: RACEVALUE
table : QueryC
total : Sum

#runs is defined in Q B as:
field: #runs: NAME
table : queryA
total : count

The only queries joined are B and E in Query F

As before, this is very strange because the queries are all done within 30secs.


Since the resulting figures are accurate, it might be an idea if we ignore this problem for the moment and move on to the remaining queries and write the macro as I don't seem to be getting anywhere.

FortuneHunter
04-27-2002, 07:33 AM
Mark,

Mark, What do think think about sending me the "stripped" down database?
If you have "WinZip" you can zip it up and email it to me.
I have an 200Mhz, 800Mhz and 1 Ghz Machines. I can see if I can figure out what the problem is and how fast it runs on my machine.
If you are interested email me:

handicapdatamine@aol.com

FH