Inserting into BigQuery using Go updates/overwrites instead of inserting
Question
I have an AWS Lambda function written in Go that should insert into BigQuery.
It references the cloud.google.com/go/bigquery package.
client, err := bigquery.NewClient(ctx, projectID, gcpOption)
if err != nil {
	println(fmt.Sprintf("error creating new client, %v", err))
	return fmt.Errorf("bigquery.NewClient: %v", err)
}
defer client.Close()
inserter := client.Dataset(datasetID).Table(tableID).Inserter()
if err := inserter.Put(ctx, items); err != nil {
	println(fmt.Sprintf("error inserting, %v", err))
	if multiError, ok := err.(bigquery.PutMultiError); ok {
		for _, err1 := range multiError {
			for _, err2 := range err1.Errors {
				fmt.Println(err2)
			}
		}
	} else {
		fmt.Println(err)
	}
	return err
} else {
	println("Inserted record")
}
When run, a record will be inserted, but running again will result in the previously inserted row being updated. This is not the behaviour I was expecting.
I am relatively new to Golang and GCP, so perhaps I have the wrong expectations?
The table in BigQuery is not partitioned.
items is a slice of structs.
Answer 1
Score: 1
The Inserter can be used to achieve at-least-once data insertion semantics. The insert mechanism is not capable of upsert behavior, which is what you appear to be describing.
It's unclear to me how you're validating this behavior, but I'd take another look at that as a starting point.
More information about the tabledata.insertAll streaming API, which underlies the Go Inserter, can be found here: https://cloud.google.com/bigquery/streaming-data-into-bigquery