PreparedStatement Batch Insert/Update - DeadLock Issue



I'm working on a system that supports multiple databases. All inserts and updates happen in bulk, using PreparedStatement batches. With PostgreSQL, however, batch updates quite often end in a deadlock. I would like to know if there is a way to avoid this.

> ERROR: deadlock detected
> Detail: Process 30655 waits for ExclusiveLock on relation 295507848 of database 17148; blocked by process 30662.

I have retry logic in place but it gives:

> ERROR: current transaction is aborted, commands ignored until end of transaction block

I'm a bit confused about how to handle this situation.

    public void InsertUpdateBatch(String auditPrefix, String _tableName, TableStructure<?> ts,
            StringBuilder sb, String operation) throws Exception {
        boolean retry = true;
        boolean isInsert = "insert".equalsIgnoreCase(operation) ? true : false;
        int minTry = 0;
        int maxTries = 2;
        ThreadLocal<PreparedStatement> statement = isInsert ? pstmt : updateStmt;
        ThreadLocal<List<Object[]>> dataToProcess = isInsert ? insertBatchData : updateBatchData;
        while (retry) {
            try {
                long t1 = System.currentTimeMillis();
                int[] retCount = {};
                retCount = statement.get().executeBatch();
                // Clearing the batch and batch data
                statement.get().clearBatch();
                dataToProcess.get().clear();
                if (isInsert) {
                    syncReport.addInsert(ts.getTableName(), retCount.length);
                } else {
                    syncReport.addUpdate(ts.getTableName(), retCount.length);
                }
                this.syncReport.addDatabaseTime(t1, System.currentTimeMillis());
                retry = false;
            } catch (Exception e) {
                // Clearing the batch explicitly
                statement.get().clearBatch();
                log.log(Level.INFO, "Thread " + Thread.currentThread().getName() + ": tried the operation " + operation + " for " + (minTry + 1) + " time(s)");
                if (++minTry == maxTries) {
                    retry = false;
                    minTry = 0;
                    e.printStackTrace();
                    commitSynchException(auditPrefix, _tableName, ts, sb, operation, isInsert, e);
                } else {
                    trackRecordCount(e, ts, !isInsert);
                    // Rebuild Batch
                    rebuildBatch(ts, dataToProcess.get(), e);
                    // Clearing old batch data after rebuilding the batch
                    dataToProcess.get().clear();
                }
            }
        }
    }


Retry is the solution. But you haven't properly implemented it.

-- EDIT as suggested by @Mark Rotteveel --

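You need to explicitly call .rollback() on your connection before you retry. Until you do, PostgreSQL rejects every further command in that transaction, which is exactly the "current transaction is aborted" error you are seeing. You can probably get away with keeping your PreparedStatement / Statement objects, but if you still run into trouble, consider closing and recreating them.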

-- END EDIT ---
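
To make the roll-back-then-retry order concrete, here is a minimal sketch. The names RollbackRetry, SqlWork and runWithRollback are made up for illustration, and it assumes auto-commit is already switched off on the connection. It deliberately leaves out backoff and error filtering so that only the rollback step is visible.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class RollbackRetry {
    /** Hypothetical callback type: one unit of transactional work. */
    @FunctionalInterface
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    /**
     * Runs the work and commits. After a failure it rolls back first, so the
     * connection leaves the aborted-transaction state, and only then retries.
     * Assumes auto-commit is disabled on conn (an assumption of this sketch).
     */
    static void runWithRollback(Connection conn, SqlWork work, int maxAttempts)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                work.run(conn);
                conn.commit();
                return;
            } catch (SQLException e) {
                try {
                    // Without this, every later command on the connection fails.
                    conn.rollback();
                } catch (SQLException rollbackFailure) {
                    // The connection is probably unusable; give up.
                    e.addSuppressed(rollbackFailure);
                    throw e;
                }
                if (attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

In your case the work passed in would re-add the rows to the batch and call executeBatch(), since the batch contents do not survive a failed execution.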

Your second problem is the lack of randomized exponential backoff.

Computers are reliable. Very reliable. Better than Swiss watches.

If two threads do a job, deadlock each other as part of that job, both notice, abort their transactions, and start over, then...

probably the exact same thing will happen again. And again. And again. And again. Computers can be that reliable in unlucky scenarios.

The solution is randomized exponential backoff. One way to ensure that the two threads don't keep doing things the same way, in the exact same order, with the exact same timing, is to literally start flipping coins to forcibly make it less stable. This sounds stupid, but without this concept the internet wouldn't exist. Classic Ethernet works precisely like this: every system on the network sends its data and then checks for spikes on the line that indicate multiple parties sent at the same time, leaving an unreadable mess. If they detect this, they wait a random amount of time, with exponential backoff, and then send again. This seemingly insane solution beat the pants off token ring networks.

The 'exponential' part means: As retries roll in, make the delays longer (and still random).
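
As a rough sketch of what "longer and still random" can look like: the helper below doubles a ceiling on every attempt, caps it, and then picks a random delay under that ceiling. The class name Backoff, the method delayMillis, and the base and cap parameters are all invented for this example; tune them for your workload.

```java
import java.util.Random;

final class Backoff {
    /**
     * Returns a random delay in [0, ceiling] milliseconds, where the ceiling
     * doubles with each attempt (base, 2*base, 4*base, ...) up to capMillis.
     * attempt is 1-based; baseMillis is assumed to be small and positive.
     */
    static long delayMillis(int attempt, long baseMillis, long capMillis, Random rnd) {
        if (attempt < 1 || baseMillis <= 0 || capMillis <= 0) {
            throw new IllegalArgumentException("attempt, baseMillis, capMillis must be positive");
        }
        // Limit the shift so the doubling cannot overflow for small bases.
        int shift = Math.min(attempt - 1, 20);
        long ceiling = Math.min(capMillis, baseMillis << shift);
        // nextDouble() is in [0, 1), so the result lands in [0, ceiling].
        return (long) (rnd.nextDouble() * (ceiling + 1));
    }
}
```

A retry loop would then call something like Thread.sleep(Backoff.delayMillis(retryCount, 10, 5000, rnd)) before the next attempt. Because each thread draws its own random delay, two deadlocked threads are unlikely to collide again with the same timing.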

Your final error is that you always retry, instead of only when that's sensible to do.
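
One way to decide "is this worth retrying" is to look at the SQLSTATE. Class 40 is the standard "transaction rollback" class; on PostgreSQL it covers 40P01 (deadlock detected) and 40001 (serialization failure). The sketch below uses a made-up class name, RetryPolicy, and also walks the getNextException() chain, on the assumption that a batch failure may carry the underlying error further down the chain.

```java
import java.sql.SQLException;

final class RetryPolicy {
    /**
     * True if this exception, or any exception chained to it through
     * getNextException(), has a five-character SQLSTATE in class 40.
     * Which states are safe to retry is database specific.
     */
    static boolean isRetryable(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            String state = cur.getSQLState();
            if (state != null && state.length() == 5 && state.startsWith("40")) {
                return true;
            }
        }
        return false;
    }
}
```

Anything else, such as a constraint violation (class 23), will fail the same way on every attempt, so retrying it only wastes time.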

Here's an example exponential randomized backoff that fixes all your problems except the part where you need to make your (Prepared)Statement objects anew and close the old ones; your snippet does not make clear where that happens.

    // Assumes retryCount (an int) and rnd (a java.util.Random) are declared
    // outside the retry loop that this catch block belongs to.
    } catch (SQLException e) { // catch SQLEx, not Ex
        String ps = e.getSQLState();
        if (ps != null && ps.length() == 5 && ps.startsWith("40")) {
            // For postgres, this means retry. It's DB specific!
            retryCount++;
            if (retryCount > 50) throw e;
            try {
                Thread.sleep((retryCount * 2) + rnd.nextInt(8 * retryCount));
                continue; // continue the retry loop.
            } catch (InterruptedException e2) {
                // Interrupted; restore the flag, stop retrying and just throw the exception.
                Thread.currentThread().interrupt();
                throw e;
            }
        }
        // it wasn't retry; just throw it.
        throw e;
    }

Or, do yourself a huge favour, ditch all this work and use a library. JDBC is designed to be incredibly annoying, inconsistent, and ugly for 'end users' - that's because the target audience for JDBC is not you. It's the DB vendors. It's the lowest level glue imaginable, with all sorts of weird hacks so that all DB vendors can expose their pet features.

Those using JDBC to access DBs are supposed to use an abstraction library built on top!

For example, JDBI is great, and supports retry very well, with lambdas.

