I was reading an article by Rsnake this morning on the problems of using a username as a primary key, and it reminded me of something I’ve been meaning to write about for a while.

As a former database designer and current security geek I’m often stunned by how often designers/developers choose really bad primary keys for their databases. Even back in my developer days, when security for me meant taking down drunks at concerts, I knew better than to use something like a username, credit card number, or Social Security Number as a primary key. To be honest, it had nothing to do with security and everything to do with good database design.

Back when I was starting my IT career I was fortunate to work closely with the professor at the University of Colorado who taught database design. One of his cardinal rules was to, wherever possible, use system generated keys as the primary key. Randomly generated keys, as opposed to sequential keys that could “leak” information. Our designs should also strive to conform to at least Fourth Normal Form, although full normalization wasn’t always possible for practical reasons.

We never used Social Security Numbers as primary keys because they are neither always unique (there are mistakes in the system) nor available for all potential users (foreign students were assigned fakes). Credit card numbers are bad because they are not a unique identifier for an individual- I have multiple credit card numbers, any of which can identify me, all of which are temporal (change over time). I have no idea why retailers so often use credit card numbers as all or part of a primary key, since I may use multiple cards even on the same day.

Usernames more closely conform to a viable primary key from a pure, non-security design perspective, but I never liked them for some of the reasons RSnake cites, and I always feel like usernames are temporal, even when they aren’t, and while I haven’t tested it I think the performance of numeric keys is probably higher.

Thus my first rule in picking a primary key, from both a pure design and security perspective, is to use system generated random keys (preferably not sequential auto increment). For other fields you want unique, like username, SSN, or credit card number, just set a unique index on the field. Worst case, you can even use this to correlate across unrelated systems assuming the rest of the attributes line up (we used to do this using the SSN, before we all learned using them at all is bad).

One criticism of Rsnake’s examples of account hijacking is that even with a pure primary key, if you hard delete a username someone can still impersonate the account, depending on the design of the system.


p style=”text-align:right;font-size:10px;”>Technorati Tags: , ,