What is the difference between sharding and partitioning
Both split data, but at different levels:
Junior Level
Both split data, but at different levels:
- Partitioning — splitting within one database on a single server
- Sharding — splitting across different servers
Partitioning (one DB):
Table -> Partition 1 (2023)
-> Partition 2 (2024)
-> Partition 3 (2025)
Sharding (many servers):
DB -> Shard 1 (Server 1)
-> Shard 2 (Server 2)
-> Shard 3 (Server 3)
Middle Level
Partitioning
-- PostgreSQL range partitioning
-- PARTITION BY RANGE — PostgreSQL physically splits the table into parts.
-- Each partition stores its own data, but logically it is one table.
CREATE TABLE orders (
id BIGINT,
created_at DATE
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2024 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE orders_2025 PARTITION OF orders
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
// A query WHERE created_at = ‘2023-06-15’ doesn’t match any partition! // Need a default partition or error handling.
Sharding
MongoDB:
sh.shardCollection("db.orders", { user_id: "hashed" })
Cassandra:
Automatic sharding by partition key
Comparison
| Partitioning | Sharding |
|---|---|
| One server | Many servers |
| One database | Many databases |
| Simpler | More complex |
| For query optimization | For scaling |
Senior Level
When to use what
If the table fits in memory and queries are fast — neither partitioning nor sharding is needed.
Partitioning:
✅ Large table, one server
✅ Optimize queries by date
✅ Archive old data
Sharding:
✅ One server is not enough
✅ Horizontal scaling is needed
✅ Data doesn't fit on one server
Best Practices
✅ Partitioning for optimization
✅ Sharding for scaling
✅ Start with partitioning
✅ Shard when you hit limits
❌ Sharding without necessity
❌ Complex cross-shard queries
Interview Cheat Sheet
Must know:
- Partitioning — splitting within ONE database on a single server
- Sharding — splitting across DIFFERENT servers
- Partitioning for query optimization, sharding for scaling
- Start with partitioning — simpler, less overhead
- PostgreSQL supports PARTITION BY RANGE/LIST/HASH
- MongoDB/Cassandra support automatic sharding
- If the table fits in RAM and queries are fast — neither is needed
Common follow-up questions:
- When partitioning? Large table, one server, date-based optimization, archiving.
- When sharding? One server is not enough, data doesn’t fit, horizontal scaling is needed.
- Can they be combined? Yes — partitioning within a shard.
- How does PostgreSQL partitioning work? PARTITION BY RANGE/LIST — physical split, logically one table.
Red flags (DO NOT say):
- “Partitioning and sharding are the same thing” — no, different levels
- “Sharding is always better” — no, more complex and expensive
- “Partitioning solves scaling problems” — no, only one server’s limits
- “Need to shard from the start” — no, start when you hit limits
Related topics:
- [[10. What is sharding]]
- [[12. How to implement horizontal scaling of microservices]]
- [[13. What is Database per Service pattern]]
- [[14. What problems arise when using a shared database]]
- [[26. What tools are used for microservice orchestration]]