Database Database design principles On this page
Database design principles 1. Avoid redundancy
can lead to inconsistencies: data is updated in one place, but not updated in the other place, sync
impact the performance: increase the storage space
exception: dimensional schemas used for analytical processing and data warehousing
2. Primary keys and unique identifiers
every table must have a primary key
it guarantees the uniqueness of each row within the table
it is used to establish relationships with other tables
3. Null values
only allow the column to support null values if you're certain that at some point the value may be unknown, missing, or not applicable
it's important to differentiate null values from empty
values, such as zero or empty string
primary keys cannot store null values
a null value can be applied to columns of any data type (as long as the column supports null values)
null values are ignored in unique and foreign key constraints
4. Referential integrity
establishing relationships between tables using foreign keys and primary keys
5. Atomicity
don't use a single column for complex or compound data, instead use separate columns to represent the atomic data, for example: divide name
into firstName
and lastName
, divide address
into state
, city
, postcode
.
6. Normalization
normalizing a model is a matter of bringing it to the 3rd normal form (3NF)
it helps maintain data integrity, reduces redundancy and optimizes the storage space
7. Data typing
use appropriate data types
8. Indexing
create index for primary keys and foreign keys (automatically by RDBMS)
create index based on use cases
create index for fields that are frequently used for querying or filtering data, but avoid creating unnecessary or duplicate indexes
consider columns' cardinality before you index. If you create an index for a column with low cardinality, it will rarely speed up queries
prioritize narrow
columns for index since indexes take up storage space
index creation if often part of database maintenance rather than design
Design databases for maximum usability, reliability and maintainability
1. Understand the requirements
understand the requirements of the users and business context
identify data sources, types, and relationships
functional and non functional requirements: the expected queries, transactions, operations, constraints and quality attributes
common technique: data dictionary
2.Choose a data model
choose a data model that suites the requirements and the the nature of the data
a relational model: based on tables, columns, and rows, and is widely used for structural and consistent data
3. Design a schema
ERD (entity-relationship diagrams): graphical representations of the entities, attributes, and relationships in the database
Normalization: the process of organizing data into tables that minimize redundancy and anomalies
Denormalization: the process of combining or duplicating data to improve performance and readability
4. Ensure security and privacy
Encryption: the process of transforming the data into a unreadable form, using a key or a algorithm
Authentication: the process of verifying the identity of the users using username/password or other credentials
Authorization: the process of granting or denying access the data based on the roles and privileges of the users
Auditing: the process of recording and monitoring the activities and events in the database, such as queries, transactions and errors
Masking: the process of hiding or replacing sensitive data such as personal or financial information with dummy or random values
5. Plan for backup and recovery
backup and recovery are essential for ensuring the reliability and maintainability of the database as they prevent or minimize the loss or corruption of data due to hardware failures, software errors, human errors or natural disasters
methods: full, incremental, differential backups, restore points, snapshots, replication, mirroring