PreparedStatement Batch Insert/Update - DeadLock Issue
Question

I'm working on a system that supports multiple databases. All inserts and updates happen in bulk, achieved with PreparedStatement batches. However, with PostgreSQL there are quite a few cases where a batch update causes a deadlock. I would like to know if there is a way to avoid this.

> ERROR: deadlock detected
> Detail: Process 30655 waits for ExclusiveLock on relation 295507848 of database 17148; blocked by process 30662.

I have retry logic in place, but it gives:

> ERROR: current transaction is aborted, commands ignored until end of transaction block

I'm a bit confused about how to handle this situation.

public void InsertUpdateBatch(String auditPrefix, String _tableName, TableStructure<?> ts, 
        StringBuilder sb, String operation) throws Exception {
		
    boolean retry = true;
    boolean isInsert = "insert".equalsIgnoreCase(operation);
    int minTry = 0;
    int maxTries = 2;

    ThreadLocal<PreparedStatement> statement = isInsert ? pstmt : updateStmt;
    ThreadLocal<List<Object[]>> dataToProcess = isInsert ? insertBatchData : updateBatchData;
		
    while (retry) {
        try {
            long t1 = System.currentTimeMillis();
            int[] retCount = statement.get().executeBatch();
				
            // Clear the batch and the batch data
            statement.get().clearBatch();
            dataToProcess.get().clear();
				
            if (isInsert) {
                syncReport.addInsert(ts.getTableName(), retCount.length);
            } else {
                syncReport.addUpdate(ts.getTableName(), retCount.length);
            }
				
            this.syncReport.addDatabaseTime(t1, System.currentTimeMillis());

            retry = false;
				
        } catch (Exception e) {
            // Clear the batch explicitly
            statement.get().clearBatch();
				
            log.log(Level.INFO, "Thread " + Thread.currentThread().getName() + ": tried the operation " + operation + " for "  + (minTry + 1) + " time(s)");

            if (++minTry == maxTries) {
                retry = false;
                minTry = 0;
                e.printStackTrace();
                commitSynchException(auditPrefix, _tableName, ts, sb, operation, isInsert, e);
            } else {
                trackRecordCount(e, ts, !isInsert);
                // Rebuild the batch
                rebuildBatch(ts, dataToProcess.get(), e);
                // Clear the old batch data after rebuilding the batch
                dataToProcess.get().clear();
            }
				
        }
    }
}

Answer 1

Score: 0

Retry is the solution. But you haven't implemented it properly.

-- EDIT as suggested by @Mark Rotteveel --

You need to explicitly call .abort() on your connection, and then you can retry. You can probably get away with keeping your PreparedStatement / Statement objects, but if you still run into trouble, consider closing and recreating them.
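The "current transaction is aborted" error occurs because PostgreSQL rejects all further commands on a connection whose transaction has failed, until that transaction is ended. A minimal sketch of the reset step, assuming a hypothetical helper name (`resetForRetry`); the answer suggests `Connection.abort()`, while `Connection.rollback()` is the more common way to clear this state on a kept-alive connection:

```java
import java.sql.Connection;
import java.sql.SQLException;

public final class RetryReset {
    // Hypothetical helper: clear the aborted-transaction state before a retry.
    public static void resetForRetry(Connection conn) throws SQLException {
        // rollback() ends the failed transaction so the connection accepts
        // commands again; the next statement starts a fresh transaction.
        conn.rollback();
        // If problems persist, also close and recreate the PreparedStatement
        // objects here before re-adding the batch data.
    }
}
```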

-- END EDIT --

Your second problem is the lack of randomized exponential backoff.

Computers are reliable. Very reliable. Better than Swiss watches.

If two threads do a job, and as part of that job they deadlock each other, and they both see this, abort their transactions, and start over, then...

probably the __exact same thing__ will happen again. And again. And again. Computers can be that reliable, in unlucky scenarios.

The solution is randomized exponential backoff. One way to ensure that the two threads don't keep doing things the same way, in the exact same order, with the exact same timing, is to _literally_ start flipping coins to forcibly make them less predictable. This sounds stupid, but without this concept the internet wouldn't exist (Ethernet works precisely like this: all systems on an Ethernet network send data immediately, then check for spikes on the line indicating that multiple parties sent at the same time and the result was an unreadable mess. If they detect this, they wait a _random delay with exponential backoff_ and then resend. This seemingly insane solution beat the pants off token ring networks).

The 'exponential' part means: as retries roll in, make the delays longer (and still random).
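The jittered, growing delay described above can be sketched as a small helper (`Backoff` and `delayMillis` are illustrative names, not from the answer; the 10 ms base is an arbitrary choice):

```java
import java.util.Random;

public final class Backoff {
    private static final Random RND = new Random();

    // attempt is 1-based; the base delay doubles on every retry,
    // and up to 100% random jitter is added on top.
    public static long delayMillis(int attempt) {
        long base = 10L << (attempt - 1);       // 10, 20, 40, 80, ... ms
        return base + RND.nextInt((int) base);  // randomize to break lockstep
    }
}
```

A retry loop would then call `Thread.sleep(Backoff.delayMillis(retryCount))` before each new attempt, so two colliding threads almost never wake up in sync again.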

Your final mistake is that you always retry, instead of only when it makes sense to do so.

Here's an example of randomized exponential backoff that fixes all of these problems, except the part where you need to recreate your (Prepared)Statement objects and close the old ones; your snippet does not make clear where that happens.

} catch (SQLException e) { // catch SQLException, not Exception
    String ps = e.getSQLState();
    if (ps != null && ps.length() == 5 && ps.startsWith("40")) {
        // For Postgres this means: retry. It's DB-specific!
        retryCount++;
        if (retryCount > 50) throw e;
        try {
            Thread.sleep((retryCount * 2) + rnd.nextInt(8 * retryCount));
            continue; // continue the retry loop
        } catch (InterruptedException e2) {
            // Interrupted; stop retrying and just rethrow.
            throw e;
        }
    }
    // Not a retryable error; just rethrow it.
    throw e;
}
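Putting the SQLState check and the backoff together, a self-contained retry wrapper might look like the sketch below (`RetryRunner` and `withRetry` are hypothetical names; the 50-attempt cap and delay formula mirror the answer's snippet):

```java
import java.sql.SQLException;
import java.util.Random;
import java.util.concurrent.Callable;

public final class RetryRunner {
    private static final Random RND = new Random();

    // Run an action, retrying only on SQLState class "40"
    // (PostgreSQL serialization/deadlock failures), with randomized backoff.
    public static <T> T withRetry(Callable<T> action) throws Exception {
        int retryCount = 0;
        while (true) {
            try {
                return action.call();
            } catch (SQLException e) {
                String state = e.getSQLState();
                boolean retryable = state != null && state.startsWith("40");
                if (!retryable || ++retryCount > 50) throw e;
                // Randomized, growing delay so colliding threads desynchronize.
                Thread.sleep((retryCount * 2L) + RND.nextInt(8 * retryCount));
            }
        }
    }
}
```

In the question's `InsertUpdateBatch`, the `executeBatch()` call (preceded by a rollback of the aborted transaction) would be the action passed in.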

Or, do yourself a huge favour: ditch all this work and use a library. JDBC is designed to be incredibly annoying, inconsistent, and ugly for 'end users' - that's because the target audience for JDBC is not you; it's the DB vendors. It's the lowest-level glue imaginable, with all sorts of weird hacks so that every DB vendor can expose their pet features.

Those using JDBC to access databases are supposed to use an abstraction library built on top!

For example, JDBI is great, and supports retry very well, with lambdas.


huangapple
  • Posted on 2020-08-07 21:01:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63302438.html