Insert data into BigQuery using a native INSERT query from Java

Question

I insert rows into BigQuery from Java using the insertAll method, and that works reliably. But when I then try to update the same row from Java, I get the following error:

> com.google.cloud.bigquery.BigQueryException UPDATE or DELETE DML statements over table project123:mydataset.test would affect rows in the streaming buffer, which is not supported

So I tried the same thing from the BigQuery console.

I inserted a row with an INSERT query and then immediately ran an UPDATE on the same row. That worked fine.

The BigQuery articles I read say that both insertAll from Java and an INSERT query from the console go through the streaming buffer. If that were true, the console UPDATE should have failed as well.

Why does the console query work, while the update after a Java insertAll throws an exception?

It would be really helpful if someone could explain exactly what is going on.

Any suggestions for running a native INSERT query from Java instead of insertAll would also be a great help.

Here are the code snippets.

First, I insert the values into BigQuery with the following code:

// Row content: column name -> value
Map<String, Object> map = new HashMap<>();
map.put("name", "1");
map.put("age", "2");

BigQuery bQuery = BigQueryOptions.newBuilder().setCredentials(credentials).setProjectId(id)
        .build().getService();
InsertAllResponse response = bQuery.insertAll(InsertAllRequest.newBuilder(tableId).addRow(map).build());

Once the row is inserted, I try to update it with the following code:

// tableName holds the table name as a string
String updateQuery = String.format("UPDATE `%s` SET name = \"%s\" WHERE age = \"%s\"", tableName, name, age);
QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(updateQuery).build();
bQuery.query(queryConfig);

The insert works fine, but when I try to update the row I just inserted, I get the streaming buffer error.

Thanks in advance.

Answer 1

Score: 2

The documentation makes it clear that insertAll performs a streaming write into BigQuery.

When you use INSERT DML (INSERT INTO <table> [VALUES ... | SELECT ...]), you run a query, not a streaming write, so the data is not managed the same way: rows written by a query do not go through the streaming buffer, which is why your UPDATE from the console worked. Performance also differs: streaming can write up to 1 million rows per second, whereas DML runs one query at a time and takes longer for less data.

I don't know the rest of your code or what you want to achieve, but if you want to use regular statements (INSERT, UPDATE, DELETE), use the query API.

EDIT

I tried to adapt your code (it was wrong as posted, so I made some assumptions), and I can suggest the following. Simply run a query, not a load job or a streaming write.

        String tableName = "YOUR_TABLE_NAME";

        // Run the INSERT as a DML query job instead of a streaming insertAll
        String insertQuery = String.format("INSERT INTO `%s` (name, age) VALUES (1, 2)", tableName);
        QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(insertQuery).build();
        bQuery.query(queryConfig);

        // The row never entered the streaming buffer, so it can be updated right away
        String updateQuery = String.format("UPDATE `%s` SET name = \"%s\" WHERE age = \"%s\"", tableName, name, age);
        queryConfig = QueryJobConfiguration.newBuilder(updateQuery).build();
        bQuery.query(queryConfig);
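
As a variation on the snippet above, the statement text can use named query parameters instead of splicing values in with String.format, which avoids quoting mistakes. The sketch below only builds the SQL text. DmlStatements, its method names, and the table-name pattern are hypothetical helpers for illustration, and the pattern is deliberately stricter than what BigQuery actually accepts. It assumes a table with name and age columns, as in the question.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that builds parameterized DML text; not part of the BigQuery client. */
final class DmlStatements {
    // Assumed conservative shape: [project.]dataset.table. BigQuery itself allows more.
    private static final Pattern TABLE =
            Pattern.compile("[A-Za-z0-9_-]+(\\.[A-Za-z0-9_]+){1,2}");

    private DmlStatements() {}

    /** Wraps a validated table name in backticks so it can be placed in a statement. */
    static String quoteTable(String table) {
        if (table == null || !TABLE.matcher(table).matches()) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        return "`" + table + "`";
    }

    /** INSERT statement with @name and @age placeholders for the values. */
    static String insert(String table) {
        return "INSERT INTO " + quoteTable(table) + " (name, age) VALUES (@name, @age)";
    }

    /** UPDATE statement that sets name on the rows matching @age. */
    static String update(String table) {
        return "UPDATE " + quoteTable(table) + " SET name = @name WHERE age = @age";
    }
}
```

With the Java client, the text from DmlStatements.insert(tableName) would be passed to QueryJobConfiguration.newBuilder(...), with each value bound through addNamedParameter, for example addNamedParameter("name", QueryParameterValue.string(name)), before calling bQuery.query(...). Table names cannot be passed as query parameters, which is why the table is validated and quoted separately.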

Posted by huangapple on September 30, 2020, 11:04:37
Source: https://go.coder-hub.com/64130246.html