What Does the @BatchSize Annotation Do
@BatchSize is one of the most useful Hibernate annotations for combating the N+1 problem. It allows loading related entities in batches, significantly reducing the number of SQL...
Overview
@BatchSize is one of the most useful Hibernate annotations for combating the N+1 problem. It allows loading related entities in batches, significantly reducing the number of SQL queries.
Junior Level
What is @BatchSize
The @BatchSize annotation makes Hibernate load related entities in batches instead of executing a separate query for each entity.
// Without @BatchSize - N queries
for (Order order : orders) {
order.getItems().size();
// SELECT * FROM order_items WHERE order_id = ? (for each Order)
}
// 100 orders -> 100 queries
// With @BatchSize(size = 10) - N/10 queries
@Entity
public class Order {
@OneToMany(mappedBy = "order")
@BatchSize(size = 10)
private List<OrderItem> items;
}
// Hibernate: SELECT * FROM order_items WHERE order_id IN (?,?,?,?,?,?,?,?,?,?)
// 100 orders -> 10 queries (10 in each)
IN clause - part of an SQL WHERE id IN (…) query. Databases limit the number of parameters. Prepared statement - a pre-compiled SQL query.
How It Works
1. Hibernate collects IDs of entities for which the association needs to be loaded
2. Instead of N separate queries - forms one query with IN (id1, id2, ...)
3. Batch size = @BatchSize value
4. Number of queries = N / batchSize
Where It Can Be Used
// On a collection
@OneToMany(mappedBy = "order")
@BatchSize(size = 25)
private List<OrderItem> items;
// On a class (for loading multiple entities by ID)
@Entity
@BatchSize(size = 50)
public class User { }
// When loading
List<User> users = session.createQuery("FROM User", User.class)
.setHint(AvailableHints.HINT_SPEC_FETCH_SIZE, 50)
.getResultList();
// This is a different mechanism: fetch size controls how many rows the JDBC driver
// fetches at a time from the cursor. @BatchSize controls how many IDs
// are substituted into the IN-clause. Don't confuse them!
When @BatchSize is Useless
- A single entity is loaded - batch not needed
- JOIN FETCH is used - data is already loaded
- Collection is small (1-3 elements) - batch overhead > benefit
Middle Level
Choosing Optimal Size
// Too small size
@BatchSize(size = 2) // hardly helps, 50 queries instead of 100
// Optimal range
@BatchSize(size = 10) // minimum
@BatchSize(size = 25) // usually good
@BatchSize(size = 50) // maximum (more may cause IN clause issues)
Global Configuration
# application.yml - global batch size for all
spring:
jpa:
properties:
hibernate:
default_batch_fetch_size: 50
// Or via Java Config
@Configuration
public class HibernateConfig {
@Bean
public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
// ...
properties.put("hibernate.default_batch_fetch_size", 50);
return factory;
}
}
Common Mistakes
// Too small size
@BatchSize(size = 1) // equivalent to no @BatchSize
// Too large size (DB issues)
@BatchSize(size = 1000) // may exceed IN clause parameter limit
// Oracle: max 1000 parameters
// PostgreSQL: depends on configuration
// Using for @ManyToOne without understanding
@ManyToOne
@BatchSize(size = 25)
private User user;
// Works, but only makes sense when many entities reference different Users
Comparison with @Fetch
// @BatchSize - loads in batches on access
@BatchSize(size = 25)
private List<OrderItem> items;
// @Fetch(SUBSELECT) - one subquery for all
@Fetch(FetchMode.SUBSELECT)
private List<OrderItem> items;
// Combination - best approach
@BatchSize(size = 25)
@Fetch(FetchMode.SUBSELECT)
private List<OrderItem> items;
Senior Level
Internal Implementation
In Hibernate 6+, batch fetching works more efficiently thanks to an improved IN-clause formation algorithm.
How @BatchSize works:
1. On first access to a collection proxy:
- Hibernate checks BatchSize
- Collects available parent entity IDs
2. Forms the query:
- WHERE parent_id IN (id1, id2, ..., idN)
- N = min(@BatchSize, available IDs, default_batch_fetch_size)
3. Loads all related entities in one query
4. Caches them in the Persistence Context
5. On access to other collections - takes from cache
Important:
- Real batch size = min(@BatchSize, hibernate.default_batch_fetch_size)
- Hibernate dynamically adjusts IN clause size
Database Limitations
Oracle: max 1000 parameters in IN clause
PostgreSQL: depends on max_stack_depth and prepared statement
MySQL: depends on max_allowed_packet
SQL Server: up to 2100 parameters
Optimization in Production
// Configuration via Hibernate 6
spring.jpa.properties.hibernate.default_batch_fetch_size=50
// For specific entities - override
@Entity
@BatchSize(size = 100) // larger for frequently used
public class User { }
// Smaller for large collections
@OneToMany
@BatchSize(size = 10)
private List<LogEntry> logs;
Combining Strategies
@Entity
public class Order {
// JOIN FETCH for mandatory data
@Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.user WHERE o.id = :id")
Order findByIdWithUser(@Param("id") Long id);
// @BatchSize for additional data
@OneToMany(mappedBy = "order")
@BatchSize(size = 25)
private List<OrderItem> items;
// EntityGraph for dynamic loading
@EntityGraph(attributePaths = {"items", "payments"})
List<Order> findByStatus(String status);
}
Monitoring and Debugging
# Enable SQL logging for verification
spring.jpa.show-sql: true
spring.jpa.properties.hibernate.format_sql: true
# For query counting
logging.level.org.hibernate.SQL: DEBUG
logging.level.org.hibernate.type.descriptor.sql.BasicBinder: TRACE
Best Practices
@BatchSize(10-50) for collections
Global default_batch_fetch_size = 25-50
Combine with JOIN FETCH
Account for DB limitations (IN clause limits)
Monitor actual query counts
Too small size (< 5)
Too large size (> 100 requires verification: make sure your DB supports such an IN clause and the query doesn't exceed max_allowed_packet. For reference tables with 5000 records, batch size 100 may be optimal.)
Without query monitoring
Using instead of JOIN FETCH when data is always needed
Interview Cheat Sheet
Must know:
- @BatchSize loads related entities in batches: N queries -> N/batchSize
- Real size = min(@BatchSize, hibernate.default_batch_fetch_size)
- Optimal range: 10-50, too small - doesn’t help, too large - IN clause problems
- Can be used on collections, classes, and globally via default_batch_fetch_size
- DBs have limits: Oracle - 1000 parameters, SQL Server - 2100
- Combines with JOIN FETCH and SUBSELECT for optimal results
Frequent follow-up questions:
- Where can @BatchSize be used? On collections (@OneToMany), on classes, globally via configuration
- What if parents were loaded one by one? @BatchSize still works - collects IDs on access
- @BatchSize vs @Fetch(SUBSELECT)? SUBSELECT - one query for all parents, BatchSize - splits into groups
- When is @BatchSize useless? Single entity, JOIN FETCH used, or very small collection
Red flags (DO NOT say):
- “I set @BatchSize(size = 1000)” - will exceed DB parameter limits
- “@BatchSize replaces JOIN FETCH” - they complement, not replace each other
- “Global size 1000 for everything” - IN clause and performance problems
- “I don’t monitor SQL after @BatchSize” - can’t see the effect without monitoring
Related topics:
- [[1. What is the N+1 Problem and How to Solve It]]
- [[5. What Fetch Strategies Exist in Hibernate]]
- [[28. How to Use JOIN FETCH to Solve the N+1 Problem]]
- [[10. What is the Second-Level Cache and When to Use It]]