Sunday

Seven Deadly Sins of Database Design

Several factors can lead to a poor database design -- lack of experience, a shortage of the necessary skills, tight timelines and insufficient resources can all contribute. In turn, poor database design leads to many problems down the line, such as sub-par performance, the inability to make changes to accommodate new features, and low-quality data that can cost both time and money as the application evolves.

Sin #1: Poor or missing documentation for database(s) in production
Documentation for databases usually falls into three categories: incomplete, inaccurate, or none at all. To make matters worse, it is almost never centralized in one place. Without proper, centralized documentation, understanding the impact of a change is difficult at best, impossible at worst. This causes developers, DBAs, architects, and business analysts to scramble to get on the same page. They are left up to their own imagination to interpret the meaning and usage of the data.

Sin #2: Little or no normalization

There are times to denormalize a database structure to achieve optimized performance, but sacrificing flexibility is terrible. Despite the long-held belief by developers, one table to store everything is not always optimal. The “one table” approach may make data access easier, but invariably there will be many NULLs for columns that do not apply to a record, and special application code will be needed to handle it. Another common mistake is repeating values stored in a table. This can greatly decrease flexibility and increase difficulty when updating the data.

Sin #3: Not treating the data model like a living, breathing organism

The sin arises when changes creep into the database due to critical production issues. Inevitably, the model is then left languishing on the side if there is not a process to update the model along with the database. As more changes occur in the database, the model becomes useless.

Undocumented data can also lead to security and compliance risks, poor understanding of future changes and the inability to adapt to future needs of the business. Although the design-then-build practice may be a utopia that is never reached, the changes need to find their way back to the model.

Sin #4: Improper storage of reference data

Two main problems with reference data. It is either stored in many places or, even worse, embedded in the application code. It is almost never captured in the data model with the rest of the documentation. This causes huge problems when a reference value needs to change. Reference values provide valuable documentation which should be communicated. Your best chance is often via the model. It may not be practical to store reference values in the data model if you have large volumes, but there is no excuse not to point to them from the model. The key is to have it defined in one place and used in other places.

Sin #5: Not using foreign keys or check constraints

Customers always complain all the time about the lack of referential integrity (RI) or validation checks defined in the database when reverse engineering databases. For older database systems, it was thought that foreign keys and check constraints slowed performance, thus, the RI and checks should be done via the application. This might have been the case in the past, but DBMSs have come a long way.

Sin #6: Not using domains and naming standards

Domains and naming standards are probably two of the most important things you can incorporate into your modeling practices. Domains allow you to create reusable attributes so that the same attributes are not created in different places with different properties. It is extremely important to have a common set that everyone can use across all models. Naming standards allow us to clearly identify those attributes consistently.

Having a set of standards also ensures consistency across systems and promotes readability of models and code. You don’t want short, cryptic names that users need to interpret. Given the advanced nature of the latest vendor releases, the days of limited column length name are over when building new databases. Always have a common set of classwords to identify key types of data and use modifiers as needed.

Sin #7: Not choosing primary keys properly

The simplest principle to remember when picking a primary key is SUM: Static, Unique, Minimal. It is not necessary to delve into the whole natural vs. surrogate key debate; however, it is important to know that although surrogate keys may uniquely indentify therecord, they do not always unique identify the data. There is a time and a place for both, and you can always create an alternate key for natural keys if a surrogate is used as the primary key.

Author, Jason Tiret

Information retrieved from

0 comments:

Post a Comment


I made this widget at MyFlashFetish.com.