PDA

View Full Version : Access question


raybo
09-18-2008, 01:18 AM
When you import an Excel file into Access, does it just import the cell values, or does it import the underlying formulas? Do I need to do a copy/paste special-values before importing?

I have Excel 2007 now and can put a complete Bris data file (.mcp) on one row. I'm thinking about loading all my .mcp files and the corresponding .xrd results files into Excel 2007 in this manner, then draw each category of data into : Race, Horse, Trainer, Jockey, Workouts, (a post handicapped sheet with my own numbers and selections), Results sheets, etc. via formulas referencing the data in the main sheet that has all the cards and results files in it. After that I want to import all these sheets into like tables in Access.

I'll use Excel macros to cycle through each card and each race and probably an Access macro to grab each race from Excel.

Is this feasible?

Donnie
09-18-2008, 08:03 AM
to your first question = just the computed data, not formulas.
to your second question = depending on the number of fields, sounds like what you are doing can be done in Access completely. If there are too many fields in the excel table, you can always do the import into Access using two different import specifications files sending the data to 2 tables. Then you can bring all the data you wish to see into one query, including your calulated fields. And this can ALL be automated thru macros in Access as well.

raybo
09-18-2008, 08:16 AM
to your first question = just the computed data, not formulas.
to your second question = depending on the number of fields, sounds like what you are doing can be done in Access completely. If there are too many fields in the excel table, you can always do the import into Access using two different import specifications files sending the data to 2 tables. Then you can bring all the data you wish to see into one query, including your calulated fields. And this can ALL be automated thru macros in Access as well.

Thanks for your reply, Donnie. I'm not a database guy, yet, and would find it very difficult to do "import specifications" (don't even know what that means at the moment). I thought I'd start with Excel since I've been using it for horse racing for many years and then ease into queries in Access as I become more skilled.

Tom
09-18-2008, 09:43 AM
raybo, I'm far from an Access expert, and the world's worst user can still see me ahead of him:rolleyes: but, when you set an import, you can save a set of criteria as a spec that will exclude and include fields of your choosing. I do this on several comma files I import, based on what I want the table used for.
Set it up once, then every time you import, just select that template.

Jeff P
09-18-2008, 10:42 AM
There are 1435 fields in most Bris data files...

254 in an XRD Results file (but only 122 of those are currently in use. The rest are listed as reserved for future use.)

While Excel 2007 allows you to handle that many fields in a single record, Access does not. A standard Access table alllows 255 fields.

Two strategies come to mind:

1. Create multiple tables. This will allow you to store all of the fields. Then, join multiple tables whenever you write a query.

2. Decide ahead of time which fields are important to you and grab just those.

Number crunching in Access is done in a slighly different manner than Excel. Access supports VBA code... VBA being an acronym for Visual BASIC for Applications (Microsoft.)

Using VBA, once you understand what you are doing... if you can envision it - you can create it. When it comes to number crunching the kind of info read from data and results files, there isn't anything you can't do using VBA in Access. It's almost identical to Microsoft Visual Basic.


-jp

.

raybo
09-19-2008, 12:10 AM
Ok, if I want to import the comma-delimited file (.mcp file) do I need to rename it and use .csv instead? I see no way of importing the file as is.

podonne
09-20-2008, 03:18 PM
Without playing the "don't use access" card, My entries table only has 116 columns, and my pp table only has 104, so technically you should be fine once the data is loaded into access. It's the initial load of the CSV program file that is the difficulty. My staging table that i read the csv file into has 1461 columns.

Depending on how you reference the cell in Excel's VBA you'll get the formula or the value (.FormulaR1C1, etc...)

BTW, VBA might be like the older visual basics, 6 or before, but the later versions are far superior.

plainolebill
09-20-2008, 05:28 PM
.txt

TampaJim
09-21-2008, 12:55 AM
Instead of a 'staging table' I just use an array to load the enitre card into memory then suck the data out of the array into my various tables in the database.

a la...

Public Const Max_Races_on_Card = 40
Public Const Max_Horses_in_Race = 25
Public Const Max_Input_Size = 10000

with a later redimensioning of the CSV array...

ReDim RB(Max_Races_on_Card, Max_Horses_in_Race, FieldCount)

Then its simple to load it up with nested For..Next loops.

Once the memory array is loaded for the card, you can create new DB table entries for RaceData, HorseDataForRace, and BackRaceHorseData. If you calculate things you need from BackRaceHorseData (i.e. Avg-Speed-Last-3) and include it as a field in HorseDataForRace, you don't need to keep your BackRaceHorseData in the table, reducing the DB size substantially.

When you're done with a card, you redim the array to (1,1,1) then back to max and import the next file.

Adjust your 'FieldCount' based on the type of file you're importing then do something like..


StartFile:

fnum = FreeFile
Open fnam For Input As #fnum

RaceNum = 1: HorseNum = 1: FieldNum = 1

Do While Not EOF(fnum)

' Assume that we are starting with a valid file
' Process until one of [Date,Track,Race] is different


FirstRead:
Input #fnum, currtrack
Input #fnum, currdate
Input #fnum, currrace
FieldNum = 4

lasttrack = currtrack: lastdate = currdate: lastrace = currrace

StartRace:
Select Case Right$(fnam, 4)

Case ".PCS", ".MCP"
FieldCount = 1435
Case ".XRD"
FieldCount = 127
Case ".ETD"
FieldCount = 69
Case Else
FieldCount = 4
End Select
ReDim RB(Max_Races_on_Card, Max_Horses_in_Race, FieldCount)


For HorseNum = 1 To Max_Horses_in_Race
RB(RaceNum, HorseNum, 1) = currtrack
RB(RaceNum, HorseNum, 2) = currdate
RB(RaceNum, HorseNum, 3) = currrace
For FieldNum = 4 To FieldCount
Input #fnum, RB(RaceNum, HorseNum, FieldNum)
Next FieldNum

Input #fnum, currtrack
Input #fnum, currdate
Input #fnum, currrace
If currtrack <> lasttrack Then GoTo NextRace:
If currdate <> lastdate Then GoTo NextRace:
If currrace <> lastrace Then GoTo NextRace:
Next HorseNum

NextRace:
' We should have gotten here because we finished loading a race
' Populate Tables as appropriate
Select Case Right$(fnam, 4)

Case ".PCS", ".MCP"

Call UpdateRaceInfo(frm, RI, RB, RaceNum)
' Process all 10 available Back Races for Each Horse
For XX = 1 To HorseNum
If Not IsNull(RB(RaceNum, HorseNum, 1)) And _
RB(RaceNum, HorseNum, 1) <> "" Then
Call UpdateBackRaceInfo(frm, BRI, RB, RaceNum, XX)
Call UpdateHorseInfo(frm, HI, RB, RaceNum, XX)
Else
XX = HorseNum + 1 ' escape out
Exit For
End If
Next XX


Case ".XRD"

..
..

raybo
09-21-2008, 09:10 AM
I appreciate all of your suggestions/points of view, but, I'm afraid I'm lost. This is the reason I've not created a racing database before now.

This database will not be used for handicapping, only for research purposes. I do need to include my own numbers, however. All my handicapping is, and will continue to be, done in Excel.

I've read, several times, "Designing Relational Database Systems" from Microsoft Press (1999), and "SQL A Beginner's Guide" by Forrest Houlette, Ph.D. (2001). The SQL book was written for use in SQL Server, not Access.

Both of these books were and are way over my head and they were written primarily for programmers , administrators, etc, with business/institutions in mind. Trying to adapt the teachings for what I am trying to do just gets me going in circles and accomplishing nothing.

I've begun and quit designing databases many times, as I have become frustrated/pissed that I can't grasp the theory well enough to move to the next step and actually create a database.

This is the reason I decided to create the tables in Excel and then import them into Access as separate tables. I assume I can create my own numbers in Excel, in the proper table, and then import them along with the other data from the Bris file.

I have already created an Access table named "Card" which contains all of today's racecard data. This table contains 250 fields. The first problem I ran into with the table is the creation of a race ID so each race can be linked to the other tables. Should I create the "RaceID" in Excel before importing to Access or should it be done in Access after importing? I'm assuming "RaceID" would include track,date,race# ie: HOU010420041, HOU010420042, etc..

raybo
09-21-2008, 09:34 AM
I'll try to attach a screenshot of the "Card" table I have in Access.

sjk
09-21-2008, 10:44 AM
I don't understand the necessity of having a race identifier since you can link on the three fields track, date and race number with what you have.

Jeff P
09-21-2008, 10:58 AM
Ray,

You wrote:This database will not be used for handicapping, only for research purposes. I do need to include my own numbers, however. All my handicapping is, and will continue to be, done in Excel.and:I've begun and quit designing databases many times, as I have become frustrated/pissed that I can't grasp the theory well enough to move to the next step and actually create a database.

I can tell you from experience that creating a racing database is a journey... one that can provide rewards... but a journey nonetheless. And that journey was (in my case) far far longer and more involved than I ever would have imagined when I first started out.

By the time you are done - in truth you are never actually done - you will always discover new things that you want to add and test... You don't have to be a programming mastermind to do something like this. But you might become one simply from the journey itself. So instead of "By the time you are done"... I'll say By the time you are further along you will have a thorough understanding of Access (or whatever tool you decide to use.)

While not the fastest/most efficient database product out there, Access is something widely accepted.... and it's easy to find lots of available instructional reading material. So for the non programmer I'd say Access is probably a good choice. But that's going off on a tangent. Without question Access CAN handle the task I think you are describing.

Here's my suggestion:

Instead of trying to invent the wheel all at once, bite off small chunks at first. Think baby steps.

Why not design your database (at first) to be a really simple research tool?

Maybe one table with just a handful of factors. You can always add more later.

Maybe a table structure that looks something like this:

Field Name Data Type Description
-------------- ----------- -----------------------
Track Text 3 character Bris Track Code (BEL=Belmont,AQU=Aqueduct, etc)
Date Date/Time Date of race
Race Number/Integer Race Number
Dist Number/Decimal Distance of race
Surface Text Surface: D,T,A
TrackCond Text Track Condition: FT,FM,SY,MY,GD,etc
RailPosition Number/Integer position in starting gate from the rail out
ClassDescriptor Text C=claiming, M=Mdn Claiming, A=Alw, etc
RaceCond Text N2L, N3L, etc
ClaimPrice Text Claiming Price
Odds Number/Decimal Final Odds
FinPos Number/Integer Official finish position
Win Number/Decimal Win Mutuel
Place Number/Decimal Place Mutuel
Show Number/Decimal Show Mutuel


Ray1 Number/Decimal Ray's first factor, numeric value
RankForRay1 Number/Integer Ray's first factor, rank... top horse in field rank=1, 2nd best horse rank=2, etc.
GapForRay2 Number/Decimal Ray's first factor, gap... diff in value from top horse in field for this factor

Ray2 Number/Decimal Ray's 2nd factor, numeric value
RankForRay2 Number/Integer Ray's 2nd factor, rank... top horse in field rank=1, 2nd best horse rank=2, etc.
GapForRay2 Number/Decimal Ray's 2nd factor, gap... diff in value from top horse in field for this factor

From a workload standpoint, you'd import Bris card files into Excel where all number crunching is done... just like you do now.

After number crunching then an export to an Access table gives you a record of what happened for each horse in that Bris card file.

Accumulate a significant number of records in the Access table and get to querying.

For example, if the only thing you ever created was the above table... but you were meticulous about using it to capture data from every race card that you've ever played...

You would have a complete set of records that you could query - which could be used to tell you how best to exploit factors Ray1 and Ray2 - provided of course they are exploitable in the first place.

Hope I've helped...


-jp

.

Tom
09-21-2008, 01:17 PM
Here is a suggestion....do your work in Excel where you are familiar, then import the spreadsheets into Access, including your calculated numbers. This will allow you to avoid duplicate records, and new tables as you go along and link them, query the data......in time, as you use Access, you will get comfortable with it and learn new tools. I have a db with tables from CJ figs, HTR, and BRIS, plus Beyer numbers - once linked,there is a lot of information available.

raybo
09-29-2008, 07:17 AM
There are 1435 fields in most Bris data files...

254 in an XRD Results file (but only 122 of those are currently in use. The rest are listed as reserved for future use.)

While Excel 2007 allows you to handle that many fields in a single record, Access does not. A standard Access table alllows 255 fields.

Two strategies come to mind:

1. Create multiple tables. This will allow you to store all of the fields. Then, join multiple tables whenever you write a query.

2. Decide ahead of time which fields are important to you and grab just those.

Number crunching in Access is done in a slighly different manner than Excel. Access supports VBA code... VBA being an acronym for Visual BASIC for Applications (Microsoft.)

Using VBA, once you understand what you are doing... if you can envision it - you can create it. When it comes to number crunching the kind of info read from data and results files, there isn't anything you can't do using VBA in Access. It's almost identical to Microsoft Visual Basic.


-jp

.

Ok, I've decided to create several tables in Excel, separating the data as needed and including my calculated numbers, rather than just creating one or 2 tables by including only some of the data from the Bris file. I decided to do it this way because I won't know which data I may want available in the future, so I want all data to already be in the DB from the get-go.

raybo
09-29-2008, 07:20 AM
raybo, I'm far from an Access expert, and the world's worst user can still see me ahead of him:rolleyes: but, when you set an import, you can save a set of criteria as a spec that will exclude and include fields of your choosing. I do this on several comma files I import, based on what I want the table used for.
Set it up once, then every time you import, just select that template.

After playing around with Access, I now know what "import specifications" are and how to accomplish and save them. Thanks for the explanation.

raybo
09-29-2008, 07:23 AM
I don't understand the necessity of having a race identifier since you can link on the three fields track, date and race number with what you have.

After doing some rereading my DB book, I agree with you that there is no need to create a race identifier.

raybo
09-29-2008, 08:32 AM
I'm still a little fuzzy on a couple of things concerning the design phase.

Is it necessary to assign a primary key in every table? Or, should I just concentrate on indexing each table properly? I tried using several fields as primary keys in a table but Access won't let me do that. It is only allowing one field as a primary key. I assume the other fields would be candidate keys but am not sure if anything needs to be done concerning these candidate keys. I indexed all of these fields and hope that is all I need to do for each table.

I assume that, for example, in the Horses table I will need to create a "horse_id" to uniquely identify each horse. The only thing I can come up with that would do this properly would be to concatenate the horse's name and it's country of birth, I'm assuming each registration organization in each country do not allow previously used names to be duplicated.

Concerning the tables to be created in Excel for importation into Access I assume I should create the following tables or sheets:

RaceCards (In Excel 2007 I can load the complete Bris .mcp or .drf comma-delimited file as well as the complete .xrd results file on the same row. This will enable me to only run "Infotran" once for each .mcp/.drf file and the corresponding .xrd results file. Then, via Excel macros, I can grab each data file and result file and send the data to each table, accomplish my calculations, and export them to Access to populate the tables there. Adding more racecards in the future would be simple enough if I do it this way.)

Card (General race data, with column names, so that each individual racecard can be exported to Access. Fields to include:track,date,racenumber,racetype(G1,G2,G3,al w,clm,msw, etc.),distance,surface,sex/age conditions,purse/claming price,horsename,postposition,saddlecloth number,wgt,meds,equip,clm price of horse,jockey,pace pars,class par,speed par, etc.)

Horse (Horse specific data to include track,date,racenumber,horsename,postposition,saddl ecloth number,wgt,age,sex,color,meds,equip,clm price of horse,jockey,trainer,owner,birth, breeding, etc.)

Jockey (all jockey stats)

Trainer (all trainer stats)

PPs (All previous running lines for each horse, to include my calculations for each previous running line and which running line was selected for today's race(? not sure about this one but I'd like to be able to query the DB for this info for future running line selection method improvements, any suggestions concerning how to accomplish this annotation would be appreciated)(Not sure how to prevent duplications of PPs for each horse, any suggestions?)(Would like to database every running line for each horse in the database)

Workouts (all workout data, not sure how to go about preventing duplications of workouts as I add workouts for each horse in the future, any suggestions? I would like to database every workout ever run for each horse in the database

Results (All results data in the .xrd file)

raybo
09-29-2008, 08:50 AM
Concerning relationships (one to one, one to many, etc.) is this done in the table design phase, or is it even necessary?

raybo
09-29-2008, 09:05 AM
After finalizing and creating the DB tables I will post the entire DB schema/physical layout of the tables and the indexed fields in each table (for use by any interested members who are thinking of creating their own DB).

Then I can start tackling queries/macros so I can actually use the DB. I will post these also.

headhawg
09-29-2008, 09:47 AM
I am by no means an expert but I do know a few things about DBs. DB design is more important than you might think. Bad design will cause inefficiencies in your data analysis. What I mean is that poor design could lead to a lack of data integrity which could cause inaccuracy in your information analysis. If that happens then what's the point of the db in the first place?

One of the basic design rules is that every table should have a unique identifier. So, yes, it is likely that every table needs a primary key. It will enforce uniqueness for the record thus (virtually) eliminating duplicate data which of course wastes resources and can screw up data analysis. Access does allow more than one field in the primary key (a composite key). You could simply drag through the rows in design view starting with the first field if they are contiguous or hold down control to select each field if they are not.

Relationships (foreign keys) are also very likely. They enforce referential integrity so that a change in one table does not cause orphan records in another. There is going to be a "natural" relationship between your data tables; horses and trainers, for example. Another rule of db design is to have your tables "focus" on one entity with regard to the data it holds. Sometimes this is known as "narrow" tables for purposes of normalization. Because of this focus, you would not mix the data for trainers and horses in the same table. But you would have a trainer id in the horse table which corresponds to the primary key trainer id in the trainers table. There is your relationship.

So I guess what I am saying is that , yes, design before implementation will help you in the long run. I suspect from your posts that you already knew that, however.

podonne
09-29-2008, 02:10 PM
If you are going to scope tables, I would look at:

Races - Contains all the information about the race itself, uses race_id as a primary key (autonumber)
Horses - Contains all the information about the horse itself, like breeder, birth year, sire, auction, etc... Uses horse_id as a primary key (autonumber)
Entries - Contains all the information about the horse when it is entered in a race, uses entry_id as a primary key (autonumber), but also a unique key on race_id and horse_id.
PPs - The pps for the entry (indexed with entry_id)
Workouts - the workouts for the entry (indexed with entry_id)
Results - the results for the entry (indexed with entry_id and maybe race_id if you care)

raybo
09-29-2008, 06:46 PM
If you are going to scope tables, I would look at:

Races - Contains all the information about the race itself, uses race_id as a primary key (autonumber)
Horses - Contains all the information about the horse itself, like breeder, birth year, sire, auction, etc... Uses horse_id as a primary key (autonumber)
Entries - Contains all the information about the horse when it is entered in a race, uses entry_id as a primary key (autonumber), but also a unique key on race_id and horse_id.
PPs - The pps for the entry (indexed with entry_id)
Workouts - the workouts for the entry (indexed with entry_id)
Results - the results for the entry (indexed with entry_id and maybe race_id if you care)

What do you mean by "unique key on race_id and horse_id"? I indexed race_id with duplicates ok. I indexed horse_id with no duplicates.

For the "Races" table I indexed RaceDate, Track, and Race# with duplicates ok.

For "Horses" table, I indexed HorseName and CountryBirth with duplicates ok

Will get to the other tables soon.

richrosa
09-29-2008, 09:30 PM
After doing some rereading my DB book, I agree with you that there is no need to create a race identifier.

A primary key race identifier makes a perfect foreign key for other tables, simplifies indexing, and makes joins easier and more efficient. In my experience, in a racing database, you will be climbing a steep uphill battle while trying to avoid it.

raybo
09-30-2008, 07:07 AM
A primary key race identifier makes a perfect foreign key for other tables, simplifies indexing, and makes joins easier and more efficient. In my experience, in a racing database, you will be climbing a steep uphill battle while trying to avoid it.

I have reversed my previous decision to not have a primary key.

RaceID is now my primary key for the "Races" table.

As a matter of fact I have decided to have a primary key for every table. I happened upon "Crystal's Access Basics" and came to my senses. :)

raybo
09-30-2008, 07:15 AM
What do you mean by "unique key on race_id and horse_id"? I indexed race_id with duplicates ok. I indexed horse_id with no duplicates.

For the "Races" table I indexed RaceDate, Track, and Race# with duplicates ok.

For "Horses" table, I indexed HorseName and CountryBirth with duplicates ok

Will get to the other tables soon.

I have removed all indexes except the unique key for race_id and horse_id in tblEntries. I will review indexes later.

podonne
09-30-2008, 07:28 PM
I have removed all indexes except the unique key for race_id and horse_id in tblEntries. I will review indexes later.


Indexes aren't that tough. Two reasons you might use them are to prevent duplicates and speed up searching\table joins. I would give each table a primary key based on an autonumber field, then add a unique index where you know duplicates shouldn't be, then add indexes for your commonly searched fields (entry_id in the pps table, race_id in the entry table, etc...)

raybo
10-01-2008, 06:29 AM
Indexes aren't that tough. Two reasons you might use them are to prevent duplicates and speed up searching\table joins. I would give each table a primary key based on an autonumber field, then add a unique index where you know duplicates shouldn't be, then add indexes for your commonly searched fields (entry_id in the pps table, race_id in the entry table, etc...)

Thanks podonne, this looks like sound advise.

In "Crystal's Access Basics", she suggests creating all the initial tables with a primary key and then, after all the tables have been created, look at relationships and create the indexes. This made sense to me.

It appears that all DB authors have different views on much of the design phase. In the end you just have to make up your own mind on what you think is the best way, for your particular application.

ranchwest
10-01-2008, 12:28 PM
My suggestion is to import directly into Access, not through Excel. You can continue to work in Excel. Eventually, you'll understand how to recreate your Excel work in Access. At some point you'll find it easier that you're importing directly into Access.

Don't worry too much about getting everything exactly right. Keep it very simple at first. Trust me, if you're like most people you'll start over again, and again, and again, and again. Eventually you'll be close to what you think is right at that point. Then, you'll start over again.

raybo
10-01-2008, 05:40 PM
My suggestion is to import directly into Access, not through Excel. You can continue to work in Excel. Eventually, you'll understand how to recreate your Excel work in Access. At some point you'll find it easier that you're importing directly into Access.

Don't worry too much about getting everything exactly right. Keep it very simple at first. Trust me, if you're like most people you'll start over again, and again, and again, and again. Eventually you'll be close to what you think is right at that point. Then, you'll start over again.

I tried importing the Bris file into Access but couldn't figure out how to create the primary keys. I know that the primary key for tblRaces, for example, will contain the fields named "RaceDate,Track, and RaceNum" but how would I create a field named "RaceID" for this primary key if I am importing from the Bris file directly into Access?

sjk
10-01-2008, 06:44 PM
As someone mentioned earlier a primary key can contain multiple fields. You define it by holding the ctrl key as you select the fields.

Tom
10-02-2008, 07:48 AM
I think what you need do is set the primary key in the table, then when you import, it will prevent duplicates from going in.

raybo
10-02-2008, 11:04 PM
As someone mentioned earlier a primary key can contain multiple fields. You define it by holding the ctrl key as you select the fields.

I know how to create a multifield primary key but if I import directly into Access instead of going to Excel first, there is no way, that I see, to name a field "Race_ID", for example.

raybo
10-02-2008, 11:13 PM
I think what you need do is set the primary key in the table, then when you import, it will prevent duplicates from going in.

As I replied to sjk's post, I know how to create and define a multifield primary key (that would represent "race_id", for example), if I can't name a field "race_id" then my queries will have to include the 3 fields included in the "race_id" instead of simply using a field named "race_id" for the relationship between 2 or more tables.

Am I missing a way to import from Bris data directly into Access and being able to define and name a multifield key?

It was suggested that I import directly into Access instead of going to Excel first, where I can create a field to hold the primary key. Also, at least initially, it would be much easier to go to Excel first in order to capture my own numbers and do number crunching. I realize that this can all be accomplished in Access but considering my very limited knowledge of Access, SQL, and VBA, that would be a very tall order that would prevent me from using the database anytime in the near future.

If I go to Excel first and do all the number crunching and create sheets that mirror the tables I create in Access then I can start querying the database as soon as I get the tables created and the data imported from Excel to Access.

Doesn't this make sense?

sjk
10-03-2008, 07:24 AM
I still don't see the value in having a single field that contains the exact same information that is in 3 fields already present. It is generally considered good design to avoid having the same information in two places to avoid any potential inconsistency issues.

Tom's suggestion is an excellent one. One day you will inadvertently import information that has already been imported. With the primary key Access ignores. Without the key you will have a mess to unravel.

raybo
10-03-2008, 06:11 PM
I still don't see the value in having a single field that contains the exact same information that is in 3 fields already present. It is generally considered good design to avoid having the same information in two places to avoid any potential inconsistency issues.

Tom's suggestion is an excellent one. One day you will inadvertently import information that has already been imported. With the primary key Access ignores. Without the key you will have a mess to unravel.

I have the primary key already in the table. It's a multifield key containing "racedate, track,and racenum". So, you and Tom are suggesting that I just leave the "Races" table this way?

podonne suggested the following:

If you are going to scope tables, I would look at:

Races - Contains all the information about the race itself, uses race_id as a primary key (autonumber)
Horses - Contains all the information about the horse itself, like breeder, birth year, sire, auction, etc... Uses horse_id as a primary key (autonumber)
Entries - Contains all the information about the horse when it is entered in a race, uses entry_id as a primary key (autonumber), but also a unique key on race_id and horse_id.
PPs - The pps for the entry (indexed with entry_id)
Workouts - the workouts for the entry (indexed with entry_id)
Results - the results for the entry (indexed with entry_id and maybe race_id if you care)

Is he suggesting that I create a primary key field named "race_id" that is a multifield key, or is this not what he means? If I create a field named "RaceID" that is based on 3 fields that will also be in the table, I realize that this is somewhat redundant but, the primary key will prevent me from importing the same data twice. And, having my tables all have a named primary key field will make it much easier in future queries to join tables using these primary keys. Example: To join the "Races" table with the "Entries" table I could join on "RaceID" as it would be a primary key in the "Races" table and a foreign key in the "Entries" table. Without the field "RaceID" in both tables, I would have to join on "RaceDate, Track, and RaceNum".

Which is the more precise, more efficient, and easiest method? All of the db information I have read says to have a descriptive "ID" field as the primary key in every table. With it I can join or query on the "ID" field or any of the 3 fields it includes, individually or in combination. If I want to query on the "RaceDate" I can do that and if I want to query on the "Track", I can do that, and if I want to query on a particular race I can use the "ID" field for that without having to specify "RaceDate, Track, and RaceNum".

Maybe I'm overthinking this thing, but it just makes sense, to me, to include all 4 fields in the table.

sjk
10-03-2008, 06:19 PM
You have correctly understood my post. That is the way I do it. I have hundreds of queries and it has never crossed my mind that it is a burden to draw 3 lines rather than one to join tables on date track and race.

There is usually more than one way to get things done and I am sure the other approach is workable too.

raybo
10-03-2008, 06:36 PM
You have correctly understood my post. That is the way I do it. I have hundreds of queries and it has never crossed my mind that it is a burden to draw 3 lines rather than one to join tables on date track and race.

There is usually more than one way to get things done and I am sure the other approach is workable too.

Ok, I will accept your suggestion because I know your db works well, as proven in many previous posts. Thanks for the explanation.

podonne
10-05-2008, 04:19 PM
Is he suggesting that I create a primary key field named "race_id" that is a multifield key, or is this not what he means? If I create a field named "RaceID" that is based on 3 fields that will also be in the table, I realize that this is somewhat redundant but, the primary key will prevent me from importing the same data twice.

yes and no. I would suggest a primary key that is numeric and uses the "autonumber" functionality, basically just an incrementing number every time you add a record. Integer-based numbers are the easiest (fastest) data types to use to join tables.

The primary key is not the only index on a table. You can add another unique index on racedate, track, and racenum to prevent duplicates, without making the primary key derive from them in some complicated way.

podonne
10-05-2008, 07:36 PM
This is the table definition for my races table. It illustrates the index concept. Granted this is a MySQL table, but I think it shows the point.

/*DDL Information For - unit_test.races*/
-----------------------------------------

Table Create Table
------ --------------------------------------------------------------
races CREATE TABLE `races` (
`raceid` bigint(20) unsigned NOT NULL auto_increment,
`track` char(3) default NULL,
`racedate` date default NULL,
`racenum` tinyint(4) default NULL,
`distance` mediumint(9) default NULL,
`surface` char(1) default NULL,
`racetype` char(2) default NULL,
`agesexrestrictions` char(3) default NULL,
`raceclassification` char(14) default NULL,
`purse` int(10) unsigned default NULL,
`claimingprice` mediumint(8) unsigned default NULL,
`trackrecord` float default NULL,
`raceconditions` text,
`todayslasixlist` text,
`todaysbutelist` text,
`todayscoupledlist` text,
`todaysmutuellist` text,
`simulcasthosttrackcode` char(3) default NULL,
`simulcasthosttrackraceum` tinyint(4) default NULL,
`breedtype` char(2) default NULL,
`paceparforlevel2f` tinyint(3) unsigned default NULL,
`paceparforlevel4f` tinyint(3) unsigned default NULL,
`paceparforlevel6f` tinyint(3) unsigned default NULL,
`speedparforclasslevel` tinyint(3) unsigned default NULL,
`latepaceparforlevel` tinyint(3) unsigned default NULL,
`drfraceconditionline1` text,
`drfraceconditionline2` text,
`drfraceconditionline3` text,
`drfraceconditionline4` text,
`drfraceconditionline5` text,
`drfraceconditionline6` text,
`lowclaimingprice` mediumint(8) unsigned default NULL,
`statebredflag` char(1) default NULL,
`wagertype1` text,
`wagertype2` text,
`wagertype3` text,
`wagertype4` text,
`wagertype5` text,
`wagertype6` text,
`wagertype7` text,
`wagertype8` text,
`wagertype9` text,
`equibaseracename` text,
`reserved1` char(12) default NULL,
`trackid` bigint(20) unsigned default NULL,
`earlyfieldsize` tinyint(4) default NULL,
PRIMARY KEY (`raceid`),
UNIQUE KEY `part_races` (`track`,`racedate`,`racenum`)
) ENGINE=MyISAM AUTO_INCREMENT=155155 DEFAULT CHARSET=latin1

raybo
10-05-2008, 08:09 PM
yes and no. I would suggest a primary key that is numeric and uses the "autonumber" functionality, basically just an incrementing number every time you add a record. Integer-based numbers are the easiest (fastest) data types to use to join tables.

The primary key is not the only index on a table. You can add another unique index on racedate, track, and racenum to prevent duplicates, without making the primary key derive from them in some complicated way.

Gotcha! Thanks podonne.

richrosa
10-07-2008, 09:03 AM
This is the table definition for my races table. It illustrates the index concept. Granted this is a MySQL table, but I think it shows the point.

/*DDL Information For - unit_test.races*/
-----------------------------------------

Table Create Table
------ --------------------------------------------------------------
races CREATE TABLE `races` (
`raceid` bigint(20) unsigned NOT NULL auto_increment,


I'm glad you are using MySQL which will get you much better throughput and flexibility than Access. I think in the end, you'll find it easier too.

A performance note. Your use of a bigint as a PK here is unnecessary, especially in an indexed field. You can use an unsigned int which will give you over 4 billion values to work with. The larger int value will make your index bigger, thus slower. Its not a big deal, but since you're just starting, I thought you might benefit from this performance tip.

podonne
10-07-2008, 11:35 PM
A performance note. Your use of a bigint as a PK here is unnecessary, especially in an indexed field. You can use an unsigned int which will give you over 4 billion values to work with. The larger int value will make your index bigger, thus slower. Its not a big deal, but since you're just starting, I thought you might benefit from this performance tip.

Hi Rich,

Thanks for this. I think we all benefit from performance tips, regardless of our experience. I always used bigint unsigned as a gut reaction, but you make an excellent point, especially if there is a performance impact.

Thanks again,
Phil