Thursday 24 July 2008

ArtenSUITE: Part 4: Primary Keys

ArtenSUITE
A Development Journal
Part 4: Primary Keys


I had an email this morning from someone who asks me why I appear to be using surrogate keys rather than natural keys in my database design. A surrogate key is a key that is totally unrelated to the data in the table whereas a natural key would relate directly to the data in a table.

An example of a surrogate key:

Table: Customer
Surrogate Primary Key Value: 7878900
Customer Name Value: Bilbo Enterprises

An example of a natural key:

Table: Customer
Surrogate Primary Key Value: BILBO01
Customer Name Value: Bilbo Enterprises

One of reasons I use a surrogate key is that if Bilbo retires to Rivendell and leaves his business to Frodo, Frodo may well change the name of the company. So with a natural key the data now looks like this:

Table: Customer
Surrogate Primary Key Value: BILBO01
Customer Name Value: Frodo Enterprises

As BILBO01 is a primary key and may be related to other tables using this primary key it may not be straightforward to change. You therefore now have information that looks inconsistent, it offends my sense of order :-)

There are many advantages to natural keys (especially to do with reducing the complexity of some queries), but for me I prefer using surrogates. It’s a hotly debated subject. Research both and use what you feel comfortable with. Do everything else right and it makes no difference in the long run.

"Far and away the best prize that life offers is the chance to work hard at work worth doing." - Theodore Roosevelt

www.artenscience.co.uk
Honest Expert Independent Technology Advice for Business

2 comments:

Phillip Flores said...

Steve,

Just like you I prefer using surrogate keys. I just find it more intuitive and it makes the setting up of relationships much easier. What I do is to assign a two-letter prefix to each table e.g. Client Table gets CL. The primary key is called CL_PK and the other fields CL_NAME etc. If I need to relate a client to an order for example, I use the table prefix of the client as part of the foreign key name e.g. OR_CL. Just my two cents.

Steve Cholerton said...

Hi Philip. I do the same for my tables except I use a three letter prefix, ie: CUS_PK, CUS_NAME. I find three letters allows me more flexibility if I have lots of tables, more room for variation :-)

When it comes to column naming the only thing you should be aiming for I believe is what you find easy to work with. So along as it's consistent you can use it easily and anyone who has to maintain your code, even though you may be using a different style to what they are used to, can easily grasp what's going on.
Cheers - Steve