Spring data jpa批量执行所有插入操作

huangapple go评论96阅读模式
英文:

Spring data jpa batch execute all inserts operations

问题

我需要在MySQL中插入大量数据(大约10万条左右),然后我尝试使用Spring Data Jpa进行批量插入,为此我使用了一个包含30条记录的简单示例。

首先,我移除了我的实体上的 @GeneratedValue 并实现了 Persistable,以避免在插入之前进行选择:

  1. @Entity
  2. public class User implements Persistable {
  3. @Id
  4. private Integer id;
  5. // 在这里添加属性...

然后,在我的 application.yml 文件中:

  1. spring:
  2. jpa:
  3. properties:
  4. hibernate.jdbc.batch_size: 30
  5. hibernate.generate_statistics: true
  6. show-sql: true
  7. hibernate:
  8. ddl-auto: validate
  9. datasource:
  10. driverClassName: com.mysql.cj.jdbc.Driver
  11. url: jdbc:mysql://localhost:3306/db?cachePrepStmts=true&reWriteBatchedInserts=true
  12. // 用户名和密码

我有一个简单的存储库:

  1. public interface UserRepository extends JpaRepository<User, Integer> { }

以及插入方法:

  1. public void process() {
  2. List<User> users = new ArrayList<>();
  3. for (int i = 1; i <= 30; i++) {
  4. User user = new User();
  5. user.setId(i);
  6. // 设置属性...
  7. users.add(user);
  8. if (i % 30 == 0) {
  9. userRepository.saveAll(users);
  10. users.clear();
  11. }
  12. }
  13. }

然后我认为正确的是进行1次批处理操作,但是我得到了29个语句:

  1. 1745893 纳秒用于获取1JDBC连接;
  2. 0 纳秒用于释放0JDBC连接;
  3. 3524622 纳秒用于准备30JDBC语句;
  4. 68290171 纳秒用于执行29JDBC语句;
  5. 215125391 纳秒用于执行1JDBC批处理;
  6. 0 纳秒用于执行0L2C放置操作;
  7. 0 纳秒用于执行0L2C命中操作;
  8. 0 纳秒用于执行0L2C未命中操作;
  9. 240389888 纳秒用于执行1次刷新操作(刷新了29个实体和29个集合);
  10. 0 纳秒用于执行0次部分刷新操作(刷新了0个实体和0个集合)。

有什么建议吗?谢谢!

英文:

I need to insert a lot of data in a mysql(Something around of 100k) then I'm trying to use Spring Data Jpa to insert in batch, for that I'm using a simple example with 30 records.

The first thing was remove @GeneratedValue os my entity end implements Persistable to no need a select before a insert:

  1. @Entity
  2. public class User implements Persistable {
  3. @Id
  4. private Integer id;
  5. // properties here...

Then, on my application.yml:

  1. spring:
  2. jpa:
  3. properties:
  4. hibernate.jdbc.batch_size: 30
  5. hibernate.generate_statistics: true
  6. show-sql: true
  7. hibernate:
  8. ddl-auto: validate
  9. datasource:
  10. driverClassName: com.mysql.cj.jdbc.Driver
  11. url: jdbc:mysql://localhost:3306/db?cachePrepStmts=true&amp;reWriteBatchedInserts=true
  12. // user and password

I have a simple repository:

  1. public interface UserRepository extends JpaRepository&lt;User, Integer&gt; { }

and the insert method:

  1. public void process() {
  2. List&lt;User&gt; users = new ArrayList&lt;&gt;();
  3. for (int i = 1 ; i &lt;= 30; i++) {
  4. User user = new User();
  5. user.setId(i);
  6. // set properties
  7. users.add(user);
  8. if(i % 30 == 0) {
  9. userRepository.saveAll(users);
  10. users.clear();
  11. }
  12. }
  13. }

Then I think the correct is 1 batch operation only, but I had 29 statements:

  1. 1745893 nanoseconds spent acquiring 1 JDBC connections;
  2. 0 nanoseconds spent releasing 0 JDBC connections;
  3. 3524622 nanoseconds spent preparing 30 JDBC statements;
  4. 68290171 nanoseconds spent executing 29 JDBC statements;
  5. 215125391 nanoseconds spent executing 1 JDBC batches;
  6. 0 nanoseconds spent performing 0 L2C puts;
  7. 0 nanoseconds spent performing 0 L2C hits;
  8. 0 nanoseconds spent performing 0 L2C misses;
  9. 240389888 nanoseconds spent executing 1 flushes (flushing a total of 29 entities and 29 collections);
  10. 0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)

Any ideias?

thanks!

答案1

得分: 1

spring.jpa.properties.hibernate.order_inserts=true

英文:

Test next property:

  1. spring.jpa.properties.hibernate.order_inserts=true

huangapple
  • 本文由 发表于 2020年8月11日 02:48:59
  • 转载请务必保留本文链接:https://go.coder-hub.com/63346266.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定