Question 1 · Section 16

What is the N+1 Problem and How to Solve It

The N+1 problem is one of the most common performance issues in Hibernate. It occurs when related entities are loaded incorrectly, causing a linear growth in SQL queries: 1 quer...

Language versions: English Russian Ukrainian

Overview

The N+1 problem is one of the most common performance issues in Hibernate. It occurs when related entities are loaded incorrectly, causing a linear growth in SQL queries: 1 query for the parent + N queries for child entities.


Junior Level

What is the N+1 Problem

This is a situation where Hibernate executes one query to get a list of objects, and then N additional queries (one per object) to load related data.

@Entity
public class Order {
    @Id
    private Long id;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<OrderItem> items;
}

// N+1 problem:
List<Order> orders = entityManager.createQuery("SELECT o FROM Order o", Order.class)
    .getResultList();  // 1 query: SELECT * FROM orders

for (Order order : orders) {
    order.getItems().size();  // N queries: SELECT * FROM order_items WHERE order_id = ?
}

// Total: 1 + N = N+1 queries!
// 100 orders -> 101 queries to the DB

How to Recognize It

  • Logs show many identical SELECT queries
  • Application is slow when loading collections
  • Number of queries is proportional to the number of records

Basic Solutions

  1. JOIN FETCH - merge into one query
  2. @BatchSize - load in batches
  3. EntityGraph - dynamically specify what to load

When NOT to use each approach

  • JOIN FETCH - do not use with multiple collections (Cartesian product)
  • @BatchSize - do not use when data is ALWAYS needed (better to use JOIN FETCH)
  • EntityGraph - do not use in simple CRUD operations (overhead of creating graph)
  • Subselect - do not use when parents were loaded one at a time (Hibernate falls back to SELECT)

Middle Level

Detailed Solutions

1. JOIN FETCH in JPQL

@Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.items")
List<Order> findAllWithItems();

// Generates one query:
// SELECT DISTINCT o.* FROM orders o
// JOIN order_items oi ON o.id = oi.order_id

Always use DISTINCT - without it, Hibernate returns duplicate Orders (one per OrderItem).

2. @BatchSize

@Entity
public class Order {
    @OneToMany(mappedBy = "order")
    @BatchSize(size = 10)
    private List<OrderItem> items;
}

// Instead of N queries: N/10 queries
// Hibernate: SELECT * FROM order_items WHERE order_id IN (?,?,?,?,?,?,?,?,?,?)

3. EntityGraph

@EntityGraph(attributePaths = {"items"})
@Query("SELECT o FROM Order o")
List<Order> findAllWithItems();

// Or programmatically:
EntityGraph<Order> graph = entityManager.createEntityGraph(Order.class);
graph.addSubgraph("items");

List<Order> orders = entityManager.createQuery("SELECT o FROM Order o", Order.class)
    .setHint("jakarta.persistence.fetchgraph", graph)
    .getResultList();

Comparison of Approaches

Approach When to use Queries
JOIN FETCH Related data always needed 1
@BatchSize Related data needed sometimes N/batchSize
EntityGraph Dynamic loading 1
EAGER Never (antipattern) N

Common Mistakes

// EAGER loading for collections
@OneToMany(fetch = FetchType.EAGER)
private List<OrderItem> items;
// Always loads all elements -> performance problems

// Multiple JOIN FETCH -> Cartesian Product
@Query("SELECT o FROM Order o JOIN FETCH o.items JOIN FETCH o.payments")
// 10 orders x 5 items x 3 payments = 150 rows!

Senior Level

Internal Implementation

The N+1 problem occurs because of how Hibernate works with lazy loading proxies:

1. Order.getItems() returns a PersistentCollection proxy
   A proxy is a special subclass that Hibernate creates instead of the real object. When you access its fields, Hibernate intercepts the call and loads data from the DB.
2. On first access to the proxy, Hibernate checks if the collection is initialized
3. If not - executes SELECT for the specific order_id
4. For each Order - a separate SELECT, since each proxy is initialized individually

Advanced Strategies

1. Subselect fetching

@OneToMany(mappedBy = "order")
@Fetch(FetchMode.SUBSELECT)
private List<OrderItem> items;

// One subquery for all:
// SELECT * FROM order_items WHERE order_id IN (SELECT id FROM orders)

SUBSELECT vs @BatchSize: SUBSELECT executes one query for ALL loaded parents, while @BatchSize splits into groups. SUBSELECT is more efficient when there are few parents, BatchSize when there are many.

2. Hibernate 6 improvements

// Hibernate 6.2+ supports separate SELECT batching by default
// @BatchSize is no longer needed for collections
// But still useful for @ManyToOne

3. Detection in production

// p6spy - logging all SQL
// datasource-proxy - programmatic interception
// Spring Boot Actuator + Hibernate Statistics

@Bean
public DataSource dataSource() {
    return ProxyDataSourceBuilder.create(actualDataSource)
        .logQueryBySlf4j()
        .countQuery()
        .build();
}

Architectural Recommendations

LAZY by default for all associations
JOIN FETCH when related data is always needed
@BatchSize (10-50) for collections
DTO projection for read-only scenarios
Monitor SQL queries in production
Regular slow query log review

EAGER without good reason
Ignoring N+1 in code review
No query monitoring
Using entities instead of DTOs for API

When N+1 is Acceptable

In rare cases, N+1 may be acceptable:

  • Small data (< 10 records)
  • Related data is cached in L2 cache
  • Used with @BatchSize with a large size

Interview Cheat Sheet

Must know:

  • N+1 problem: 1 query for parents + N queries for child entities
  • Main solution - JOIN FETCH (one query with JOIN)
  • @BatchSize loads in batches: N queries -> N/batchSize
  • EntityGraph for dynamic loading of associations
  • EAGER for collections is an antipattern, causes N+1
  • DISTINCT is required with JOIN FETCH to eliminate duplicates
  • Multiple JOIN FETCH -> Cartesian product (avoid it)
  • Monitor SQL queries in production (p6spy, datasource-proxy)

Frequent follow-up questions:

  • When is N+1 acceptable? Small data (<10 records), L2 cache, or with @BatchSize
  • Why is DISTINCT needed? JOIN returns one row per child, DISTINCT eliminates duplicate root objects in memory
  • What’s better - JOIN FETCH or @BatchSize? JOIN FETCH when data is always needed, @BatchSize when sometimes
  • How to detect N+1? Enable SQL logging, count queries, use p6spy

Red flags (DO NOT say):

  • “I use EAGER to avoid N+1” - EAGER itself causes N+1
  • “N+1 is not a problem for modern ORMs” - it’s a problem in any ORM
  • “I always use JOIN FETCH for everything” - Cartesian product with multiple collections
  • “I don’t monitor SQL queries” - without monitoring, N+1 is invisible

Related topics:

  • [[2. What is the Difference Between Lazy and Eager Loading]]
  • [[5. What Fetch Strategies Exist in Hibernate]]
  • [[6. What Does the @BatchSize Annotation Do]]
  • [[28. How to Use JOIN FETCH to Solve the N+1 Problem]]
  • [[29. What is Projection in JPA]]