Database Design Patterns for Modern Applications
Database design is the foundation of every successful application. Poor database design leads to performance issues, data inconsistencies, and maintenance nightmares. Let's explore proven patterns and best practices for modern database architecture.
Why Database Design Matters
The impact of good database design:
- Performance - Optimized queries and efficient data access
- Scalability - Handle millions of records without degradation
- Data Integrity - Prevent inconsistencies and anomalies
- Maintainability - Easy to evolve with business requirements
- Cost Efficiency - Reduce storage and compute costs
Relational vs Document Databases
Choose the right database for your use case.
PostgreSQL (Relational)
Best for:
- Complex relationships and joins
- ACID transactions
- Data integrity constraints
- Structured data with fixed schema
- Financial applications
sql
-- Strong relationships and constraints
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
total DECIMAL(10,2) NOT NULL,
status VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);
MongoDB (Document)
Best for:
- Flexible, evolving schemas
- Nested/hierarchical data
- High write throughput
- Rapid prototyping
- Content management systems
javascript
// Flexible document structure
{
_id: ObjectId("..."),
email: "user@example.com",
profile: {
name: "John Doe",
bio: "Full-stack developer",
avatar: "https://..."
},
orders: [
{
orderId: "ORD-001",
total: 99.99,
items: [
{ productId: "P123", quantity: 2, price: 49.99 }
],
status: "completed",
createdAt: ISODate("2025-12-01T10:00:00Z")
}
],
createdAt: ISODate("2025-01-01T00:00:00Z")
}
Normalization Strategies
First Normal Form (1NF)
Eliminate repeating groups.
sql
-- ❌ Bad: Repeating columns
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
phone1 VARCHAR(20),
phone2 VARCHAR(20),
phone3 VARCHAR(20)
);
-- ✅ Good: Separate table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(255)
);
CREATE TABLE phone_numbers (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
phone VARCHAR(20),
type VARCHAR(20) -- 'mobile', 'work', 'home'
);
Second Normal Form (2NF)
Remove partial dependencies.
sql
-- ❌ Bad: Course name depends only on course_id
CREATE TABLE enrollments (
student_id INTEGER,
course_id INTEGER,
course_name VARCHAR(255),
enrollment_date DATE,
PRIMARY KEY (student_id, course_id)
);
-- ✅ Good: Separate course table
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL
);
CREATE TABLE enrollments (
student_id INTEGER,
course_id INTEGER REFERENCES courses(id),
enrollment_date DATE,
PRIMARY KEY (student_id, course_id)
);
Third Normal Form (3NF)
Eliminate transitive dependencies.
sql
-- ❌ Bad: City and country depend on postal_code
CREATE TABLE addresses (
id SERIAL PRIMARY KEY,
street VARCHAR(255),
postal_code VARCHAR(10),
city VARCHAR(100),
country VARCHAR(100)
);
-- ✅ Good: Separate location reference
CREATE TABLE postal_codes (
code VARCHAR(10) PRIMARY KEY,
city VARCHAR(100),
country VARCHAR(100)
);
CREATE TABLE addresses (
id SERIAL PRIMARY KEY,
street VARCHAR(255),
postal_code VARCHAR(10) REFERENCES postal_codes(code)
);
Common Design Patterns
One-to-Many Relationship
sql
-- Blog posts with comments
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title VARCHAR(255) NOT NULL,
content TEXT,
author_id INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE comments (
id SERIAL PRIMARY KEY,
post_id INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
content TEXT NOT NULL,
author_id INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_comments_post_id ON comments(post_id);
Many-to-Many Relationship
sql
-- Students and courses
CREATE TABLE students (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL
);
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
code VARCHAR(20) UNIQUE NOT NULL
);
-- Junction table
CREATE TABLE student_courses (
student_id INTEGER REFERENCES students(id) ON DELETE CASCADE,
course_id INTEGER REFERENCES courses(id) ON DELETE CASCADE,
enrolled_at TIMESTAMP DEFAULT NOW(),
grade VARCHAR(2),
PRIMARY KEY (student_id, course_id)
);
CREATE INDEX idx_student_courses_student ON student_courses(student_id);
CREATE INDEX idx_student_courses_course ON student_courses(course_id);
Polymorphic Associations
sql
-- Comments can belong to posts, photos, or videos
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title VARCHAR(255),
content TEXT
);
CREATE TABLE photos (
id SERIAL PRIMARY KEY,
url VARCHAR(500),
caption TEXT
);
CREATE TABLE comments (
id SERIAL PRIMARY KEY,
content TEXT NOT NULL,
commentable_type VARCHAR(50) NOT NULL, -- 'post', 'photo', 'video'
commentable_id INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_comments_polymorphic ON comments(commentable_type, commentable_id);
Self-Referencing (Hierarchical Data)
sql
-- Organizational hierarchy
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
manager_id INTEGER REFERENCES employees(id),
department VARCHAR(100)
);
CREATE INDEX idx_employees_manager ON employees(manager_id);
-- Query hierarchy
WITH RECURSIVE employee_hierarchy AS (
-- Anchor: Top-level managers
SELECT id, name, manager_id, 0 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: Direct reports
SELECT e.id, e.name, e.manager_id, eh.level + 1
FROM employees e
JOIN employee_hierarchy eh ON e.manager_id = eh.id
)
SELECT * FROM employee_hierarchy ORDER BY level, name;
Soft Deletes Pattern
Preserve data instead of hard deletes.
sql
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price DECIMAL(10,2) NOT NULL,
deleted_at TIMESTAMP NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Only active products
CREATE VIEW active_products AS
SELECT * FROM products
WHERE deleted_at IS NULL;
-- Soft delete
UPDATE products SET deleted_at = NOW() WHERE id = 1;
-- Restore
UPDATE products SET deleted_at = NULL WHERE id = 1;
Audit Trail Pattern
Track all changes for compliance.
sql
CREATE TABLE audit_logs (
id SERIAL PRIMARY KEY,
table_name VARCHAR(100) NOT NULL,
record_id INTEGER NOT NULL,
action VARCHAR(20) NOT NULL, -- 'INSERT', 'UPDATE', 'DELETE'
old_values JSONB,
new_values JSONB,
user_id INTEGER,
created_at TIMESTAMP DEFAULT NOW()
);
-- Trigger function
CREATE OR REPLACE FUNCTION audit_trigger_func()
RETURNS TRIGGER AS $$
BEGIN
IF (TG_OP = 'DELETE') THEN
INSERT INTO audit_logs (table_name, record_id, action, old_values)
VALUES (TG_TABLE_NAME, OLD.id, 'DELETE', row_to_json(OLD));
RETURN OLD;
ELSIF (TG_OP = 'UPDATE') THEN
INSERT INTO audit_logs (table_name, record_id, action, old_values, new_values)
VALUES (TG_TABLE_NAME, NEW.id, 'UPDATE', row_to_json(OLD), row_to_json(NEW));
RETURN NEW;
ELSIF (TG_OP = 'INSERT') THEN
INSERT INTO audit_logs (table_name, record_id, action, new_values)
VALUES (TG_TABLE_NAME, NEW.id, 'INSERT', row_to_json(NEW));
RETURN NEW;
END IF;
END;
$$ LANGUAGE plpgsql;
-- Attach trigger
CREATE TRIGGER audit_users
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION audit_trigger_func();
Partitioning for Scale
Handle large tables efficiently.
Range Partitioning
sql
-- Partition by date
CREATE TABLE events (
id BIGSERIAL,
event_type VARCHAR(50),
data JSONB,
created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE events_2025_01 PARTITION OF events
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE events_2025_02 PARTITION OF events
FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
-- Index on each partition
CREATE INDEX idx_events_2025_01_created ON events_2025_01(created_at);
CREATE INDEX idx_events_2025_02_created ON events_2025_02(created_at);
Hash Partitioning
sql
-- Partition by user_id hash
CREATE TABLE user_actions (
id BIGSERIAL,
user_id INTEGER NOT NULL,
action VARCHAR(50),
created_at TIMESTAMP DEFAULT NOW()
) PARTITION BY HASH (user_id);
-- Create 4 partitions
CREATE TABLE user_actions_0 PARTITION OF user_actions
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE user_actions_1 PARTITION OF user_actions
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE user_actions_2 PARTITION OF user_actions
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE user_actions_3 PARTITION OF user_actions
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
Indexing Strategies
Choose the right indexes for performance.
B-Tree Indexes (Default)
sql
-- Single column
CREATE INDEX idx_users_email ON users(email);
-- Composite index (order matters!)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Query benefits from this index
SELECT * FROM orders WHERE user_id = 1 AND status = 'completed';
Partial Indexes
sql
-- Index only active users
CREATE INDEX idx_active_users ON users(email)
WHERE deleted_at IS NULL;
-- Index only pending orders
CREATE INDEX idx_pending_orders ON orders(created_at)
WHERE status = 'pending';
Expression Indexes
sql
-- Index for case-insensitive searches
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Query uses the index
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';
Full-Text Search
sql
-- Add tsvector column
ALTER TABLE posts ADD COLUMN search_vector tsvector;
-- Update with trigger
CREATE FUNCTION posts_search_trigger() RETURNS trigger AS $$
BEGIN
NEW.search_vector := to_tsvector('english', COALESCE(NEW.title, '') || ' ' || COALESCE(NEW.content, ''));
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER tsvector_update BEFORE INSERT OR UPDATE
ON posts FOR EACH ROW EXECUTE FUNCTION posts_search_trigger();
-- Create GIN index
CREATE INDEX idx_posts_search ON posts USING GIN(search_vector);
-- Search
SELECT * FROM posts
WHERE search_vector @@ to_tsquery('english', 'postgresql & database');
Caching Strategies
Materialized Views
sql
-- Pre-computed aggregations
CREATE MATERIALIZED VIEW user_statistics AS
SELECT
u.id,
u.name,
COUNT(o.id) AS total_orders,
SUM(o.total) AS total_spent,
MAX(o.created_at) AS last_order_date
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name;
CREATE INDEX idx_user_stats_id ON user_statistics(id);
-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics;
Redis Caching Pattern
javascript
import { Redis } from "ioredis";
import { prisma } from "./prisma";
const redis = new Redis(process.env.REDIS_URL);
async function getUserById(id: string) {
const cacheKey = user:${id};
// Check cache
const cached = await redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
// Fetch from database
const user = await prisma.user.findUnique({
where: { id },
});
if (user) {
// Cache for 1 hour
await redis.setex(cacheKey, 3600, JSON.stringify(user));
}
return user;
}
// Invalidate on update
async function updateUser(id: string, data: any) {
const user = await prisma.user.update({
where: { id },
data,
});
// Invalidate cache
await redis.del(user:${id});
return user;
}
Data Integrity Constraints
Enforce business rules at the database level.
sql
-- Check constraints
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
stock INTEGER NOT NULL CHECK (stock >= 0),
discount_percent INTEGER CHECK (discount_percent BETWEEN 0 AND 100)
);
-- Unique constraints
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
username VARCHAR(50) UNIQUE NOT NULL
);
-- Composite unique
CREATE TABLE product_reviews (
id SERIAL PRIMARY KEY,
product_id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
rating INTEGER CHECK (rating BETWEEN 1 AND 5),
UNIQUE (product_id, user_id) -- One review per user per product
);
-- Foreign key with cascade
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INTEGER NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE RESTRICT,
quantity INTEGER NOT NULL CHECK (quantity > 0)
);
Performance Best Practices
1. Use EXPLAIN ANALYZE
sql
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 1 AND status = 'pending';
2. Avoid N+1 Queries
javascript
// ❌ Bad: N+1 queries
const users = await prisma.user.findMany();
for (const user of users) {
const posts = await prisma.post.findMany({
where: { authorId: user.id },
});
}
// ✅ Good: Single query with include
const users = await prisma.user.findMany({
include: { posts: true },
});
3. Use Connection Pooling
javascript
import { Pool } from "pg";
const pool = new Pool({
max: 20,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
4. Batch Operations
javascript
// ✅ Batch insert
await prisma.user.createMany({
data: [
{ email: "user1@example.com", name: "User 1" },
{ email: "user2@example.com", name: "User 2" },
{ email: "user3@example.com", name: "User 3" },
],
});
Conclusion
Effective database design requires understanding normalization, choosing appropriate patterns, implementing proper indexing, and maintaining data integrity. Start with a solid foundation, optimize iteratively, and always consider your application's specific requirements.
Remember: premature optimization is the root of all evil, but proper database design from the start saves countless hours of refactoring later.
Happy designing! 🗄️