Bulk insert csv data using pgx.CopyFrom into a postgres database
Question
I'm once again trying to push lots of csv data into a postgres database.
In the past I've created a struct to hold the data and unpacked each column into the struct before bumping the lot into the database table, and that works fine. However, I've just found pgx.CopyFrom* and it seems as though I should be able to make it work better.
So far I've got the column headings for the table into a slice of strings and the csv data into another slice of strings, but I can't work out the syntax to push this into the database.
I've found this post which sort of does what I want, but it uses a [][]interface{} rather than a []string.
The code I have so far is:
// loop over the lines and find the first one with a timestamp
for {
    line, err := csvReader.Read()
    if err == io.EOF {
        break
    } else if err != nil {
        log.Error("Error reading csv data", "Loading Loop", err)
    }
    // see if we have a string starting with a timestamp
    _, err := time.Parse(timeFormat, line[0])
    if err == nil {
        // we have a data line
        _, err := db.CopyFrom(context.Background(), pgx.Identifier{"emms.scada_crwf.pwr_active"}, col_headings, pgx.CopyFromRows(line))
    }
}
But pgx.CopyFromRows expects [][]interface{} not []string.
What should the syntax be? Am I barking up the wrong tree?
Answer 1
Score: 2
I recommend reading your CSV and creating a []interface{} for each record you read, appending the []interface{} to a collection of rows ([][]interface{}), then passing rows on to pgx.
var rows [][]interface{}
// read header outside of CSV "body" loop
header, _ := reader.Read()
// inside your CSV reader "body" loop...
row := make([]interface{}, len(record))
// use your logic/gate-keeping from here
row[0] = record[0] // timestamp
// convert the floats
for i := 1; i < len(record); i++ {
    val, _ := strconv.ParseFloat(record[i], 64)
    row[i] = val
}
rows = append(rows, row)
...
copyCount, err := conn.CopyFrom(
    context.Background(),
    pgx.Identifier{"floaty-things"},
    header,
    pgx.CopyFromRows(rows),
)
I can't mock up the entire program, but here's a full demo of converting the CSV to [][]interface{}: <https://go.dev/play/p/efbiFN2FJMi>.
And check in with the documentation: <https://pkg.go.dev/github.com/jackc/pgx/v4>.
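For context, here is a minimal end-to-end sketch of the same approach. It assumes pgx v4, a table named readings, an RFC 3339 timestamp in the first CSV column, and floats in the remaining columns; the connection string and file name are placeholders, so adapt everything to your own schema.

package main

import (
    "context"
    "encoding/csv"
    "fmt"
    "io"
    "log"
    "os"
    "strconv"
    "time"

    "github.com/jackc/pgx/v4"
)

func main() {
    ctx := context.Background()

    // Placeholder connection string; point it at your own database.
    conn, err := pgx.Connect(ctx, "postgres://user:pass@localhost:5432/mydb")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close(ctx)

    f, err := os.Open("data.csv")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    reader := csv.NewReader(f)

    // First CSV line supplies the column names used for the COPY.
    header, err := reader.Read()
    if err != nil {
        log.Fatal(err)
    }

    // Build one []interface{} per data record, collecting them in rows.
    var rows [][]interface{}
    for {
        record, err := reader.Read()
        if err == io.EOF {
            break
        } else if err != nil {
            log.Fatal(err)
        }
        // Skip any line whose first field is not a timestamp (assumed RFC 3339 layout).
        ts, err := time.Parse(time.RFC3339, record[0])
        if err != nil {
            continue
        }
        row := make([]interface{}, len(record))
        row[0] = ts
        for i := 1; i < len(record); i++ {
            val, err := strconv.ParseFloat(record[i], 64)
            if err != nil {
                log.Fatal(err)
            }
            row[i] = val
        }
        rows = append(rows, row)
    }

    // One COPY moves all collected rows into the (assumed) readings table.
    copyCount, err := conn.CopyFrom(ctx, pgx.Identifier{"readings"}, header, pgx.CopyFromRows(rows))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("copied %d rows\n", copyCount)
}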