Is a prepared statement significantly slower than a single query for a large number of inserts?
Question
I'm inserting a few million rows into a MySQL table. I'm using a prepared statement as shown below.
Would creating a single insert string like the one directly below be expected to be substantially faster?
Single string approach from https://stackoverflow.com/questions/11976989/is-22-seconds-a-good-time-for-inserting-500-rows-in-mysql :
INSERT INTO example
(example_id, name, value, other_value)
VALUES
(100, 'Name 1', 'Value 1', 'Other 1'),
(101, 'Name 2', 'Value 2', 'Other 2'),
(102, 'Name 3', 'Value 3', 'Other 3'),
(103, 'Name 4', 'Value 4', 'Other 4');
What I'm currently doing:
//
// method to do upload
//
public static void doUpload(Connection conn) {
    log.info("Deleting existing data...");
    Database.update("truncate table attribute", conn);
    log.info("Doing inserts");
    String sqlString = "insert into attribute values (null,?,?,?)";
    int max = 1000000;
    PreparedStatement ps = Database.getPreparedStatement(sqlString, conn);
    for (int i = 0; i < max; i++) {
        // add params
        String subjectId = i+"";
        addParam(subjectId, "GENDER", getGender(), ps);
        addParam(subjectId, "AGE", getAge(), ps);
        addParam(subjectId, "CITY", getCity(), ps);
        addParam(subjectId, "FAVORITE_COLOR", getColor(), ps);
        addParam(subjectId, "PET", getPet(), ps);
        if (i % 1000 == 0) {
            log.info("Executing " + i + " of " + max);
            Database.execute(ps);
            log.info("Done with batch update");
            ps = Database.getPreparedStatement(sqlString, conn);
        }
    }
    if (Database.isClosed(ps) == false) {
        Database.execute(ps);
    }
}
//
// method to add param to the prepared statement
//
private static void addParam(String subjectId, String name, String val, PreparedStatement ps) {
    ArrayList<String> params;
    params = new ArrayList<String>();
    params.add(subjectId + "");
    params.add(name);
    params.add(val);
    Database.addToBatch(params, ps);
}
//
// addToBatch
//
public static void addToBatch(List<String> params, PreparedStatement ps) {
    try {
        for (int i = 0; i < params.size(); i++) {
            ps.setString((i + 1), params.get(i));
        }
        ps.addBatch();
    } catch (Exception exp) {
        throw new RuntimeException(exp);
    }
}
What is the fastest way to do this type of insert?
I'm currently inserting 1000 rows in about 5 seconds. Is it reasonable to expect much better than this?
I'm running locally and have already dropped all indexes on the table I'm inserting into.
Answer 1
Score: 1
The fastest way to do batch inserts with JDBC is to use addBatch / executeBatch, which you appear to be doing already.
For sample code, see
But that will only get you so far. For a real performance boost, add rewriteBatchedStatements=true to your JDBC URL. With that option, MySQL Connector/J can rewrite a batch of inserts into multi-row INSERT statements, which typically gives a significant improvement.
See https://stackoverflow.com/questions/26307760/mysql-and-jdbc-with-rewritebatchedstatements-true
Keep in mind that what you suggest in your "single string approach" is similar, but rewriteBatchedStatements=true also makes the network communication with the database more efficient.
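As a rough illustration of the URL change described above, here is a minimal sketch. The class name, the helper names, and the URL jdbc:mysql://localhost:3306/test are placeholders chosen for this example, not anything from the question; only the rewriteBatchedStatements=true property itself comes from the answer.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class, used only for illustration.
class RewriteBatchUrl {

    // Appends rewriteBatchedStatements=true to a JDBC URL unless the
    // property is already present. Uses '?' for the first property
    // and '&' for any further ones.
    static String withRewriteBatchedStatements(String jdbcUrl) {
        if (jdbcUrl.contains("rewriteBatchedStatements=")) {
            return jdbcUrl;
        }
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "rewriteBatchedStatements=true";
    }

    // Opens a connection with the property set. Requires a JDBC driver
    // for the given URL on the classpath and a reachable database.
    static Connection open(String jdbcUrl, String user, String password) throws SQLException {
        return DriverManager.getConnection(withRewriteBatchedStatements(jdbcUrl), user, password);
    }

    public static void main(String[] args) {
        // prints jdbc:mysql://localhost:3306/test?rewriteBatchedStatements=true
        System.out.println(withRewriteBatchedStatements("jdbc:mysql://localhost:3306/test"));
    }
}
```

The insert code itself does not need to change for this; the property only affects how the driver sends the batch when executeBatch is called.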
Answer 2
Score: 1
I'm not sure what Database.getPreparedStatement is doing, but you usually don't need to recreate the PreparedStatement object after each batch execution; you can keep reusing it.
Also, have you tried a larger batch size? Your batch size is currently 1000; have you tried making it bigger?
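A minimal sketch of what reusing one statement with a configurable batch size could look like, using plain JDBC instead of the question's Database helper (whose behavior is not shown). The class and method names are hypothetical, and each row is assumed to be a three-element String array matching the three placeholders in the question's insert statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical class, used only for illustration.
class ReusedStatementBatch {

    // True when the number of rows added so far fills a whole batch.
    static boolean shouldFlush(int rowsAdded, int batchSize) {
        return rowsAdded > 0 && rowsAdded % batchSize == 0;
    }

    // Inserts all rows with a single PreparedStatement, sending a batch
    // to the server every batchSize rows. Returns the number of rows queued.
    static int insertAll(Connection conn, List<String[]> rows, int batchSize) throws SQLException {
        String sql = "insert into attribute values (null,?,?,?)";
        int rowsAdded = 0;
        // The statement is created once and closed automatically at the end.
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.setString(3, row[2]);
                ps.addBatch();
                rowsAdded++;
                if (shouldFlush(rowsAdded, batchSize)) {
                    ps.executeBatch(); // the same statement stays usable afterwards
                }
            }
            // Send whatever is left over from the last partial batch.
            if (rowsAdded % batchSize != 0) {
                ps.executeBatch();
            }
        }
        return rowsAdded;
    }
}
```

Counting rows rather than loop iterations keeps the batch size explicit. In the question's loop, each iteration adds five rows and the check is i % 1000 == 0, so the first flush happens after five rows and later flushes after 5000.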
Answer 3
Score: 0
Prepared statements offer a security advantage: parameters are bound separately from the SQL text, which protects against SQL injection. In theory, a prepared statement is also precompiled, so it should still offer better performance.
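The two ideas are not mutually exclusive: a multi-row INSERT like the one in the question can still be built with placeholders and run through a PreparedStatement. The sketch below only builds the SQL text; the class and method names are made up for this example, and the table and column names are assumed to come from trusted code, since they are concatenated directly.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class, used only for illustration.
class MultiRowInsertSql {

    // Builds "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)" for rowCount rows.
    // Table and column names must come from trusted code, not user input;
    // only the values are bound later through the placeholders.
    static String build(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        String group = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, group));
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("example_id", "name", "value", "other_value");
        // prints INSERT INTO example (example_id, name, value, other_value) VALUES (?, ?, ?, ?), (?, ?, ?, ?)
        System.out.println(build("example", columns, 2));
    }
}
```

The values would then be bound with setString or similar, in row-major order, before executing the statement.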