Using the data types defined in datatype.go of the Go Apache Arrow implementation to construct a schema
Question
I am learning Apache Arrow and want to learn more about how to create a schema and an Arrow record. I have consulted some material, but so far all of it uses only the primitive types to build a schema, like this:
schema := arrow.NewSchema(
	[]arrow.Field{
		{Name: "f1-i32", Type: arrow.PrimitiveTypes.Int32},
		{Name: "f2-f64", Type: arrow.PrimitiveTypes.Float64},
	},
	nil,
)
Some data types that I want to work with are not present in PrimitiveTypes, for example bool or decimal128. Looking through the Go Arrow library, I came across the file datatype.go, which lists all the data types I want to use. But the type there is not DataType, which is what constructing the schema requires.
So, I have the following three questions:
- How can I use these data types from datatype.go, if possible, to construct my schema?
- How can I specify a precision and scale if I want to use a decimal type?
- An example of using an extension type.
Answer 1
Score: 0
The named constants defined in datatype.go are type IDs, and they are already used by the struct types that represent the data types you want. Two of these are type Decimal128Type struct and type BooleanType struct. If you inspect the ID methods of these structs in the source code, they return the datatype.go constant whose name resembles the struct's name. These structs also implement the DataType interface, which means you can assign them to arrow.Field.Type, because that field's type is DataType.
Here is what I mean: the BOOL constant defined in datatype.go is the return value of the ID method of type BooleanType struct in datatype_fixedwidth.go.
func (t *BooleanType) ID() Type { return BOOL }
The same holds for type Decimal128Type struct:
func (*Decimal128Type) ID() Type { return DECIMAL128 }
The methods of one of these structs show that it implements the DataType interface:
func (*Decimal128Type) BitWidth() int
func (t *Decimal128Type) Fingerprint() string
func (*Decimal128Type) ID() Type
func (*Decimal128Type) Name() string
func (t *Decimal128Type) String() string
Those methods belong to type Decimal128Type struct.
And here is the definition of the DataType interface:
type DataType interface {
	ID() Type
	// Name is name of the data type.
	Name() string
	Fingerprint() string
}
type BooleanType struct implements it as well.
Hence you can use either of them for the Type field of:
type Field struct {
	Name     string   // Field name
	Type     DataType // The field's data type
	Nullable bool     // Fields can be nullable
	Metadata Metadata // The field's metadata, if any
}
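The mechanism described so far can be sketched without the Arrow library at all. The block below is only an illustration: Type, DataType, BooleanType, Decimal128Type and Field are simplified stand-ins that borrow the library's names, the Fingerprint bodies are made up for the sketch, and describe is a hypothetical helper. It is not the library's source code.

```go
package main

import "fmt"

// Type is a stand-in for the type-ID enum declared in datatype.go.
type Type int

const (
	BOOL Type = iota
	DECIMAL128
)

// DataType mirrors the interface quoted above.
type DataType interface {
	ID() Type
	Name() string
	Fingerprint() string
}

// BooleanType has no parameters, so it is an empty struct.
type BooleanType struct{}

func (*BooleanType) ID() Type            { return BOOL }
func (*BooleanType) Name() string        { return "bool" }
func (*BooleanType) Fingerprint() string { return "bool" } // illustrative body

// Decimal128Type is parameterized by precision and scale.
type Decimal128Type struct {
	Precision int32
	Scale     int32
}

func (*Decimal128Type) ID() Type     { return DECIMAL128 }
func (*Decimal128Type) Name() string { return "decimal" }
func (t *Decimal128Type) Fingerprint() string { // illustrative body
	return fmt.Sprintf("decimal(%d,%d)", t.Precision, t.Scale)
}

// Field is a stand-in for arrow.Field, reduced to two members.
type Field struct {
	Name string
	Type DataType
}

// describe (hypothetical helper) only touches the interface, so it
// accepts any struct that implements DataType.
func describe(f Field) string {
	return fmt.Sprintf("%s: id=%d name=%s", f.Name, f.Type.ID(), f.Type.Name())
}

func main() {
	fields := []Field{
		{Name: "f1-bool", Type: &BooleanType{}},
		{Name: "f2-decimal128", Type: &Decimal128Type{Precision: 10, Scale: 2}},
	}
	for _, f := range fields {
		fmt.Println(describe(f))
	}
}
```

Because the methods have pointer receivers, it is the pointer (&BooleanType{}) that satisfies the interface, which is why pointers are what get assigned to the Type field.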
A demonstrative example:
package main

import (
	"fmt"

	"github.com/apache/arrow/go/arrow"
)

func main() {
	booltype := &arrow.BooleanType{}
	decimal128type := &arrow.Decimal128Type{Precision: 1, Scale: 1}
	schema := arrow.NewSchema(
		[]arrow.Field{
			{Name: "f1-bool", Type: booltype},
			{Name: "f2-decimal128", Type: decimal128type},
		},
		nil,
	)
	fmt.Println(schema)
}
Output:
schema:
  fields: 2
    - f1-bool: type=bool
    - f2-decimal128: type=decimal(1, 1)
You can find them in the documentation.
There are also some things related to extension types, but I am not familiar with them, so I cannot show an example. If you are familiar with them, you should be able to work it out easily.