SQLite best practices for dealing with multiple process access
Question
We have a single SQLite database accessed by many (~20) processes on Windows from C++ and C#.
The database is opened in WAL mode.
Each process opens the database once, then performs continuous read and read/write transactions a few times a second. Each transaction is very short: a single select query, a single insert, or a single overwrite (performed as a delete followed by an insert). We perform each operation in a separate transaction.
All fields and field combinations relevant to these queries have matching indexes.
We observe quite poor performance: we need a huge number of retries to get past SQLITE_BUSY. We went up to 10,000 retries, each followed by a random sleep of 1-10 ms, and we sometimes need thousands of retries before a query gets through. (Note that the servers this runs on can get very busy in general.)
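In outline, our retry loop looks roughly like the following sketch (the helper name, constants, and error handling are simplified for illustration; this is not our production code):

```c
#include <sqlite3.h>
#include <stdlib.h>
#include <windows.h>    /* Sleep() */

/* Simplified sketch of our retry-with-random-sleep pattern. */
static int exec_with_retry(sqlite3 *db, const char *sql)
{
    for (int attempt = 0; attempt < 10000; ++attempt) {
        int rc = sqlite3_exec(db, sql, NULL, NULL, NULL);
        if (rc != SQLITE_BUSY && rc != SQLITE_LOCKED)
            return rc;              /* success, or a non-busy error */
        Sleep(1 + rand() % 10);     /* random 1-10 ms sleep */
    }
    return SQLITE_BUSY;             /* still busy after every retry */
}
```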
We also observed the WAL file getting huge - occasionally gigabytes in size. To work around that, every hour we run PRAGMA wal_checkpoint(TRUNCATE) followed by VACUUM, and remove old entries. The database file itself stays reasonably steady (~100 MB).
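In outline, the hourly maintenance pass is roughly this (a simplified sketch; the deletion of old entries is application-specific and omitted):

```c
#include <sqlite3.h>

/* Hourly maintenance: truncate the WAL, then rebuild the database
 * file. Scheduling and the DELETE of old entries are not shown. */
static void hourly_maintenance(sqlite3 *db)
{
    sqlite3_exec(db, "PRAGMA wal_checkpoint(TRUNCATE);", NULL, NULL, NULL);
    sqlite3_exec(db, "VACUUM;", NULL, NULL, NULL);
}
```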
Are we doing anything wrong here, or is there a way to increase performance? Should we instead do a database open and close for each transaction, or are there other things we can try?
Answer 1
Score: 2
I see several different problems, and I would tackle them one at a time.
1. Handling SQLITE_BUSY
I assume you are handling the error from the code, waiting and retrying. You could try letting the SQLite library handle the retry by setting PRAGMA busy_timeout = milliseconds; at the start of the connection. Docs here.
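For example, the connection setup could look roughly like this (a sketch against the SQLite C API; the open flags and the use of sqlite3_busy_timeout() as the C-level equivalent of the pragma are my choice, and the timeout value is just an example):

```c
#include <sqlite3.h>

/* Open a connection and let SQLite itself wait on SQLITE_BUSY. */
sqlite3 *open_with_timeout(const char *path, int timeout_ms)
{
    sqlite3 *db = NULL;
    if (sqlite3_open_v2(path, &db,
                        SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,
                        NULL) != SQLITE_OK) {
        sqlite3_close(db);    /* db is allocated even on failure */
        return NULL;
    }

    /* Same effect as "PRAGMA busy_timeout = <ms>;": SQLite sleeps
     * and retries internally instead of returning SQLITE_BUSY. */
    sqlite3_busy_timeout(db, timeout_ms);
    return db;
}
```

With this in place, queries block and retry internally for up to timeout_ms before reporting SQLITE_BUSY, so the hand-rolled 10,000-retry loop can go away.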
2. Managing file size
The WAL file doesn't get automatically shrunk, because overwriting it is faster than appending to it. Instead of truncating it, you could set PRAGMA journal_size_limit = N; to keep it at a manageable size. Doc here.
Also beware that VACUUM does generate a large WAL size, so vacuuming the database just after truncating the WAL file will partially negate the truncation. Maybe after setting PRAGMA journal_size_limit you will discover that the db can be VACUUMed just once a day.
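A sketch of setting the limit from the C API (the 4 MB figure is an arbitrary example value, not a recommendation):

```c
#include <sqlite3.h>

/* Ask SQLite to trim the WAL back to ~4 MB after checkpoints
 * instead of letting it grow unbounded (4 MB is an example). */
static int cap_wal_size(sqlite3 *db)
{
    return sqlite3_exec(db, "PRAGMA journal_size_limit = 4194304;",
                        NULL, NULL, NULL);
}
```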
3. Improve concurrency
Since in WAL mode many readers and one writer can run concurrently, the conflicts come from multiple concurrent writers. You should therefore work to keep writes as infrequent and as fast as possible.
The first thing you can try is setting PRAGMA synchronous = NORMAL at the start of the connection. This should allow for faster writes without the risk of corrupting the database (see here).
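Concretely, a per-connection setup could be something like this sketch (the function name is mine; note that journal_mode = WAL persists in the database file, while synchronous is per-connection, so each of the ~20 processes should run it):

```c
#include <sqlite3.h>

/* Per-connection setup, run once right after opening. */
static int configure_connection(sqlite3 *db)
{
    int rc = sqlite3_exec(db, "PRAGMA journal_mode = WAL;",
                          NULL, NULL, NULL);
    if (rc != SQLITE_OK)
        return rc;
    /* NORMAL skips the fsync on every commit in WAL mode: the most
     * recent commits may be lost on power failure, but the database
     * file itself cannot be corrupted by it. */
    return sqlite3_exec(db, "PRAGMA synchronous = NORMAL;",
                        NULL, NULL, NULL);
}
```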
After that, you can look at the indexes in your db. Every write (insert, delete, or update) needs to write not only the corresponding pages of the table, but also the modified pages of every index defined on that table.
Check whether there are indexes which are not essential and can be removed.
Maybe you added some of them to speed up a SELECT query, but there is no point in speeding up a query from 10 ms to 5 ms if it has to wait 100 ms for a write transaction to complete. Better to leave it at 10 ms and have the write transaction complete in 80 ms.
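To take stock of what's there, the schema table can be queried directly; a sketch (the output format is mine):

```c
#include <sqlite3.h>
#include <stdio.h>

/* sqlite3_exec() callback: one row per index (name, tbl_name). */
static int print_index(void *unused, int ncols, char **vals, char **cols)
{
    (void)unused; (void)ncols; (void)cols;
    printf("index %s on table %s\n", vals[0], vals[1]);
    return 0;    /* returning non-zero would abort the query */
}

/* List every index so the non-essential ones can be reviewed and
 * dropped with "DROP INDEX <name>;". */
static void list_indexes(sqlite3 *db)
{
    sqlite3_exec(db,
                 "SELECT name, tbl_name FROM sqlite_master "
                 "WHERE type = 'index';",
                 print_index, NULL, NULL);
}
```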
All of these suggestions should be easy to test with slight modifications to your code. If they don't work, then you probably need to make bigger changes to your implementation. In that case, you could find that switching to a different RDBMS, as suggested by mason, is easier.