Database Design and Modeling: Relational vs. NoSQL, Normalization, and Indexing

Author

Kritim Yantra

Apr 11, 2025

Database Design and Modeling: Relational vs. NoSQL, Normalization, and Indexing

Choosing the right database design is crucial for performance, scalability, and maintainability. Should you use SQL or NoSQL? How do you structure tables efficiently? When should you normalize or denormalize?

In this blog, we’ll break down:
Relational (SQL) vs. NoSQL databases – When to use each?
Normalization & Denormalization – Finding the right balance.
Indexing Strategies – Speeding up queries without wasting space.
Schema Design Best Practices – Real-world examples (e-commerce, social apps).

Let’s dive in!


1. Relational (SQL) vs. NoSQL Databases

A. Relational Databases (SQL)

  • Structure: Tables with rows and columns (strict schema).
  • Relationships: Foreign keys link tables (e.g., usersorders).
  • Examples: PostgreSQL, MySQL, SQL Server.

When to Use SQL?
✔ Complex queries (JOINs, transactions).
✔ Strong consistency (e.g., banking systems).
✔ Structured data (e.g., user profiles, orders).

Example Schema (E-Commerce):

CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100) UNIQUE
);

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT FOREIGN KEY REFERENCES users(id),
    total DECIMAL(10,2)
);

B. NoSQL Databases

  • Structure: Flexible schemas (JSON-like documents, key-value, graphs).
  • Relationships: Often denormalized (data nested inside documents).
  • Examples: MongoDB (document), Redis (key-value), Neo4j (graph).

When to Use NoSQL?
✔ Rapidly changing data (e.g., social media posts).
✔ High write scalability (e.g., IoT sensor data).
✔ Unstructured/semi-structured data (e.g., logs, product catalogs).

Example Schema (MongoDB Blog Post):

{
  "_id": "post123",
  "title": "Database Design 101",
  "author": { "name": "Alice", "id": "user789" },
  "tags": ["db", "design"],
  "comments": [
    { "user": "Bob", "text": "Great post!" }
  ]
}

2. Normalization vs. Denormalization

A. Normalization (Minimizing Redundancy)

  • Goal: Split data into small, related tables to avoid duplication.
  • Levels:
    1. 1NF – Each column has atomic values (no lists).
    2. 2NF – No partial dependencies (all fields depend on the full primary key).
    3. 3NF – No transitive dependencies (non-key fields don’t depend on other non-key fields).

Pros:
✅ Less storage usage.
✅ Easier updates (change data in one place).

Cons:
❌ More JOINs = Slower reads.

B. Denormalization (Optimizing for Reads)

  • Goal: Duplicate data for faster queries (common in NoSQL).

Pros:
✅ Faster reads (no JOINs needed).
✅ Simpler queries.

Cons:
❌ Harder to keep consistent (e.g., updating user names across multiple posts).

Example:

  • Normalized (SQL):
    SELECT u.name, o.total FROM users u JOIN orders o ON u.id = o.user_id;
    
  • Denormalized (NoSQL):
    { "order_id": "123", "user_name": "Alice", "total": 99.99 }
    

3. Indexing: Speeding Up Queries

Indexes help databases find data faster (like a book’s index).

Common Index Types

Type Works Best For
B-Tree Default in SQL (range queries: WHERE age > 25).
Hash Exact matches (WHERE id = 123).
Composite Multiple columns (WHERE user_id=1 AND status='paid').

Best Practices:
✔ Index frequently filtered columns (e.g., user_id, created_at).
❌ Avoid over-indexing (slows down writes).

Example (SQL):

CREATE INDEX idx_user_orders ON orders(user_id);  -- Speeds up `WHERE user_id=123`

4. Schema Design Examples

A. Social Media App (Twitter-like)

SQL (Normalized) NoSQL (Denormalized)
- users table
- tweets table
- likes table (JOIN-heavy)
- Embedded likes in tweets:
json<br> {<br> "tweet_id": "123",<br> "text": "Hello world!",<br> "likes": ["user1", "user2"]<br> }<br>

B. E-Commerce (Amazon-like)

SQL (Normalized) NoSQL (Denormalized)
- products
- categories
- product_categories (many-to-many)
- Nested categories:
json<br> {<br> "product_id": "p123",<br> "name": "Laptop",<br> "categories": ["Electronics", "Computers"]<br> }<br>

5. When to Choose SQL vs. NoSQL?

Factor SQL (Relational) NoSQL
Data Structure Fixed schema Flexible schema
Scalability Vertical + limited horizontal Horizontal (easy scaling)
Consistency Strong (ACID) Eventual (BASE)
Use Cases Banking, ERP, reporting Social media, IoT, real-time apps

Hybrid Approach?
Some systems use both:

  • SQL for transactions (e.g., payments).
  • NoSQL for feeds/activity logs (e.g., user notifications).

6. Best Practices for Database Design

  1. Start with SQL if unsure (easier to migrate to NoSQL later).
  2. Denormalize for read-heavy apps (e.g., analytics, timelines).
  3. Use indexes wisely (benchmark before/after adding).
  4. Plan for growth (sharding, partitioning).
  5. Document your schema (helps future devs understand relationships).

7. What’s Next?

  • Deep Dive: PostgreSQL vs. MongoDB – A Detailed Comparison.
  • Advanced Topic: Database Sharding for Massive Scale.
  • Case Study: How Uber Manages Trip Data with Multiple Databases.

Which topic should I cover next? Let me know in the comments! 🚀


Final Thoughts

  • SQL = Structured, relational, strong consistency.
  • NoSQL = Flexible, scalable, high performance for specific workloads.
  • Normalize for integrity, denormalize for speed.
  • Indexes are powerful but require careful use.

Are you team SQL or NoSQL? Share your experiences below! 👇

Tags

System Design

Comments

No comments yet. Be the first to comment!

Please log in to post a comment:

Continue with Google

Related Posts