Question 11 · Section 17

What is the difference between sharding and partitioning

Both split data, but at different levels:

Language versions: English Russian Ukrainian

Junior Level

Both split data, but at different levels:

  • Partitioning — splitting within one database on a single server
  • Sharding — splitting across different servers
Partitioning (one DB):
Table -> Partition 1 (2023)
      -> Partition 2 (2024)
      -> Partition 3 (2025)

Sharding (many servers):
DB -> Shard 1 (Server 1)
   -> Shard 2 (Server 2)
   -> Shard 3 (Server 3)

Middle Level

Partitioning

-- PostgreSQL range partitioning
-- PARTITION BY RANGE — PostgreSQL physically splits the table into parts.
-- Each partition stores its own data, but logically it is one table.
CREATE TABLE orders (
    id BIGINT,
    created_at DATE
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

// A query WHERE created_at = ‘2023-06-15’ doesn’t match any partition! // Need a default partition or error handling.

Sharding

MongoDB:
sh.shardCollection("db.orders", { user_id: "hashed" })

Cassandra:
Automatic sharding by partition key

Comparison

Partitioning Sharding
One server Many servers
One database Many databases
Simpler More complex
For query optimization For scaling

Senior Level

When to use what

If the table fits in memory and queries are fast — neither partitioning nor sharding is needed.

Partitioning:

✅ Large table, one server
✅ Optimize queries by date
✅ Archive old data

Sharding:

✅ One server is not enough
✅ Horizontal scaling is needed
✅ Data doesn't fit on one server

Best Practices

✅ Partitioning for optimization
✅ Sharding for scaling
✅ Start with partitioning
✅ Shard when you hit limits

❌ Sharding without necessity
❌ Complex cross-shard queries

Interview Cheat Sheet

Must know:

  • Partitioning — splitting within ONE database on a single server
  • Sharding — splitting across DIFFERENT servers
  • Partitioning for query optimization, sharding for scaling
  • Start with partitioning — simpler, less overhead
  • PostgreSQL supports PARTITION BY RANGE/LIST/HASH
  • MongoDB/Cassandra support automatic sharding
  • If the table fits in RAM and queries are fast — neither is needed

Common follow-up questions:

  • When partitioning? Large table, one server, date-based optimization, archiving.
  • When sharding? One server is not enough, data doesn’t fit, horizontal scaling is needed.
  • Can they be combined? Yes — partitioning within a shard.
  • How does PostgreSQL partitioning work? PARTITION BY RANGE/LIST — physical split, logically one table.

Red flags (DO NOT say):

  • “Partitioning and sharding are the same thing” — no, different levels
  • “Sharding is always better” — no, more complex and expensive
  • “Partitioning solves scaling problems” — no, only one server’s limits
  • “Need to shard from the start” — no, start when you hit limits

Related topics:

  • [[10. What is sharding]]
  • [[12. How to implement horizontal scaling of microservices]]
  • [[13. What is Database per Service pattern]]
  • [[14. What problems arise when using a shared database]]
  • [[26. What tools are used for microservice orchestration]]