Converting parquet file to Golang struct with nested elements
Question
I am trying to read a Parquet file with nested arrays/structs in Go using the xitongsys/parquet-go library. The list data is not being read, and I do not see its values. Below are my structs in Go:
type Play struct {
	SID            string   `parquet:"name=si, type=BYTE_ARRAY, convertedtype=UTF8, encoding=PLAIN_DICTIONARY, repetitiontype=OPTIONAL" json:"si,omitempty"`
	TimeStamp      int      `parquet:"name=ts, type=INT64, repetitiontype=OPTIONAL" json:"ts,omitempty"`
	SingleID       int      `parquet:"name=sg, type=INT64, repetitiontype=OPTIONAL" json:"sg,omitempty"`
	PID            int      `parquet:"name=playid, type=INT64, repetitiontype=OPTIONAL" json:"playid,omitempty"`
	StartTimeStamp string   `parquet:"name=startts, type=BYTE_ARRAY,repetitiontype=OPTIONAL"`
	Price          []Price1 `parquet:"name=price, type=LIST, repetitiontype=REQUIRED" json:"price,omitempty"`
}

type Price1 struct {
	CurrID int    `parquet:"name=currId, type=INT64, repetitiontype=REQUIRED" json:"currId,omitempty"`
	LPTag  string `parquet:"name=lptag, type=BYTE_ARRAY,convertedtype=UTF8, repetitiontype=REQUIRED" json:"lptag,omitempty"`
	LPrice Money  `parquet:"name=lpmoney, type=STRUCT" json:"lpmoney,omitempty"`
}

type Money struct {
	AdmCurrCode  string `parquet:"name=admCC, type=BYTE_ARRAY, repetitiontype=OPTIONAL" json:"admCC,omitempty"`
	AdmCurrValue string `parquet:"name=admCV, type=BYTE_ARRAY" json:"admCV,omitempty"`
}
CurrID and LPTag come back empty even though the Parquet file contains valid values for them.
Answer 1
Score: 1
I found that the github.com/segmentio/parquet-go package reads the file correctly. Do you need to stick with the github.com/xitongsys/parquet-go package?
package main

import (
	"fmt"

	"github.com/segmentio/parquet-go"
)

// Play mirrors one top-level row; each tag names the Parquet column.
type Play struct {
	SID            string  `parquet:"si"`
	TimeStamp      int     `parquet:"ts"`
	SingleID       int     `parquet:"sg"`
	PID            int     `parquet:"playid"`
	StartTimeStamp string  `parquet:"startts"`
	Price          []Price `parquet:"price,list"` // "list" marks the column as a Parquet LIST
}

// Price is one element of the nested "price" list.
type Price struct {
	CurrID int    `parquet:"currId"`
	LPTag  string `parquet:"lptag"`
	LPrice Money  `parquet:"lpmoney"`
}

// Money is the struct nested inside each Price element.
type Money struct {
	AdmCurrCode  string `parquet:"admCC"`
	AdmCurrValue string `parquet:"admCV"`
}

func main() {
	// ReadFile decodes every row of the file into a []Play.
	rows, err := parquet.ReadFile[Play]("s3.parquet")
	if err != nil {
		panic(err)
	}
	for _, c := range rows {
		fmt.Printf("%+v\n", c)
	}
}