Best practice for unions in Go

Question

Go has no unions, yet many problems call for them. XML in particular makes heavy use of union (choice) types. I tried to find out the preferred way to work around the missing unions. As an example, I tried to write Go code for the nonterminal Misc in the XML standard, which can be a comment, a processing instruction, or white space.

Writing code for the three base types is quite simple. They map to character arrays and a struct.

type Comment Chars

type ProcessingInstruction struct {
	Target *Chars
	Data *Chars
}

type WhiteSpace Chars

But when I finished the code for the union, it got quite bloated with many redundant functions. Obviously there must be a container struct.

type Misc struct {
	value interface{}
}

To make sure that the container holds only the three allowed types, I made the value private and had to write a constructor for each type.

func MiscComment(c *Comment) *Misc {
	return &Misc{c}
}

func MiscProcessingInstruction(pi *ProcessingInstruction) *Misc {
	return &Misc{pi}
}

func MiscWhiteSpace(ws *WhiteSpace) *Misc {
	return &Misc{ws}
}

To be able to test the contents of the union, I had to write three predicates:

func (m Misc) IsComment() bool {
	_, itis := m.value.(*Comment)
	return itis
}

func (m Misc) IsProcessingInstruction() bool {
	_, itis := m.value.(*ProcessingInstruction)
	return itis
}

func (m Misc) IsWhiteSpace() bool {
	_, itis := m.value.(*WhiteSpace)
	return itis
}

And to get the correctly typed elements, I had to write three getters:

func (m Misc) Comment() *Comment {
	return m.value.(*Comment)
}

func (m Misc) ProcessingInstruction() *ProcessingInstruction {
	return m.value.(*ProcessingInstruction)
}

func (m Misc) WhiteSpace() *WhiteSpace {
	return m.value.(*WhiteSpace)
}

After this I was able to create a slice of Misc values and use it:

func main() {
	miscs := []*Misc{
		MiscComment((*Comment)(NewChars("comment"))),
		MiscProcessingInstruction(&ProcessingInstruction{
			NewChars("target"),
			NewChars("data")}),
		MiscWhiteSpace((*WhiteSpace)(NewChars(" \n")))}

	for _, misc := range miscs {
		if misc.IsComment() {
			fmt.Println((*Chars)(misc.Comment()))
		} else if misc.IsProcessingInstruction() {
			fmt.Println(*misc.ProcessingInstruction())
		} else if misc.IsWhiteSpace() {
			fmt.Println((*Chars)(misc.WhiteSpace()))
		} else {
			panic("invalid misc")
		}
	}
}

As you can see, much of this code looks almost the same, and it would be the same for any other union. So my question is: Is there any way to simplify dealing with unions in Go?

Go claims to simplify programming work by removing redundancy, but I think the above example shows the exact opposite. Did I miss anything?

Here is the complete example: http://play.golang.org/p/Zv8rYX-aFr

Answer 1

Score: 13

Since you seem to be asking because you want type safety, I would first argue that your initial solution is already unsafe, as you have

func (m Misc) Comment() *Comment {
	return m.value.(*Comment)
}

which will panic if you haven't checked IsComment beforehand. This solution therefore has no benefit over a type switch as proposed by Volker.
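
One way to avoid that panic, sketched here as an illustration and not taken from the original post, is to use the two-value form of the type assertion inside the getter. The name AsComment is hypothetical, and Chars is assumed to be a plain string because its real definition is not shown in the question.

```go
package main

import "fmt"

// Chars stands in for the question's character type. Its real definition
// is not shown in the post, so a string is assumed here.
type Chars string

type Comment Chars

// Misc is the container struct from the question.
type Misc struct {
	value interface{}
}

func MiscComment(c *Comment) *Misc { return &Misc{c} }

// AsComment is a hypothetical replacement for the Comment getter. It uses
// the two-value type assertion, so a wrong dynamic type yields ok == false
// instead of a panic.
func (m Misc) AsComment() (*Comment, bool) {
	c, ok := m.value.(*Comment)
	return c, ok
}

func main() {
	c := Comment("hello")
	if got, ok := MiscComment(&c).AsComment(); ok {
		fmt.Println("comment:", *got)
	}
	// A Misc holding some other type reports ok == false.
	if _, ok := (Misc{value: 42}).AsComment(); !ok {
		fmt.Println("not a comment")
	}
}
```

This folds the predicate and the getter into a single method, which removes one of the three method families from the question, though the check still happens at run time.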

If you want to group your code, you could write a function that determines whether a value is a Misc element:

func IsMisc(v interface{}) bool {
    switch v.(type) {
    case Comment:
        return true
        // ...
    }
    return false
}

That, however, would not give you compile-time type checking either.

If you want the compiler to be able to identify something as a Misc, consider creating an interface that marks it as such:

type Misc interface {
	ImplementsMisc()
}

type Comment Chars

func (c Comment) ImplementsMisc() {}

type ProcessingInstruction struct {
	Target *Chars
	Data   *Chars
}

func (p ProcessingInstruction) ImplementsMisc() {}

This way you could write functions that handle only Misc objects and decide later, inside these functions, what you really want to handle (comments, instructions, ...).
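
As a minimal sketch of how the marker interface and a type switch could fit together: the describe function is hypothetical, and Chars is assumed to be a string since the question does not show its definition.

```go
package main

import "fmt"

// Chars is assumed to be a string; the question does not show its definition.
type Chars string

// Misc is the marker interface from the answer: only types that declare
// ImplementsMisc can be passed where a Misc is expected.
type Misc interface {
	ImplementsMisc()
}

type Comment Chars

type ProcessingInstruction struct {
	Target Chars
	Data   Chars
}

type WhiteSpace Chars

func (Comment) ImplementsMisc()               {}
func (ProcessingInstruction) ImplementsMisc() {}
func (WhiteSpace) ImplementsMisc()            {}

// describe is a hypothetical consumer. The compiler rejects arguments that
// do not implement Misc; the type switch picks the concrete case at run time.
func describe(m Misc) string {
	switch v := m.(type) {
	case Comment:
		return "comment: " + string(v)
	case ProcessingInstruction:
		return "pi: " + string(v.Target) + " " + string(v.Data)
	case WhiteSpace:
		return fmt.Sprintf("whitespace: %q", string(v))
	default:
		return "unknown"
	}
}

func main() {
	items := []Misc{
		Comment("a comment"),
		ProcessingInstruction{Target: "target", Data: "data"},
		WhiteSpace(" \n"),
	}
	for _, m := range items {
		fmt.Println(describe(m))
	}
}
```

If the marker method were unexported (for example implementsMisc), only types in the same package could satisfy the interface, which brings the set of allowed types closer to a closed union. The compiler still does not check that the type switch covers every case.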

If you want to mimic unions, then the way you wrote it is the way to go, as far as I know.

Answer 2

Score: 6

I think this amount of code can be reduced. For example, I personally do not think that safeguarding type Misc against containing "illegal" values is really helpful: a simple type Misc interface{} would do, wouldn't it?

With that you can drop the constructors, and all the Is{Comment,ProcessingInstruction,WhiteSpace} methods boil down to a type switch:

switch m := misc.(type) {
case Comment:
    fmt.Println(m)
...
default:
    panic("invalid misc")
}

That's what package encoding/xml does with Token.
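
To illustrate the encoding/xml approach, here is a small sketch that walks the tokens of an XML fragment and switches on their concrete type. The helper name kinds and the sample input are made up for this example; xml.Comment, xml.ProcInst, xml.CharData, xml.StartElement and xml.EndElement are the standard library's token types.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// kinds is a hypothetical helper that returns a short label for each
// token the decoder produces for the given input.
func kinds(input string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(input))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		// xml.Token is an empty interface; the type switch recovers
		// the concrete token type.
		switch t := tok.(type) {
		case xml.Comment:
			out = append(out, "comment:"+string(t))
		case xml.ProcInst:
			out = append(out, "pi:"+t.Target)
		case xml.CharData:
			out = append(out, "chardata")
		case xml.StartElement:
			out = append(out, "start:"+t.Name.Local)
		case xml.EndElement:
			out = append(out, "end:"+t.Name.Local)
		}
	}
}

func main() {
	labels, err := kinds(`<?target data?><!--note--><a/>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range labels {
		fmt.Println(l)
	}
}
```

Nothing stops a caller from storing an unrelated type in such an interface value, so this trades the compile-time guard for less code, which is the trade-off this answer argues for.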

Answer 3

Score: 1

I am not sure I understand your issue. The 'easy' way to do it would be to follow the encoding/xml package and use interface{}. If you do not want to use interfaces, then you can do something like what you did.
However, as you stated, Go is a typed language and should therefore be used for typed needs.
If you have structured XML, Go can be a good fit, but you need to write your schema. If you want a variable schema (where a given field can have multiple types), then you might be better off with an untyped language.

A very useful tool for JSON that could easily be rewritten for XML:
http://mholt.github.io/json-to-go/

You give it JSON input and it gives you the exact Go struct. You can have multiple types, but you need to know which field has which type. If you don't, you need to use reflection, and then you lose much of the benefit of Go.

Answer 4

Score: -3

TL;DR: You don't need a union; interface{} solves this better.

Unions in C are used to access special memory or hardware. They also subvert the type system. Go does not have language primitives to access special memory or hardware, and it shunned volatile and bit-fields for the same reason.

In C/C++, unions can also be used for really low-level optimization and bit packing. The trade-off is to sacrifice the type system and increase complexity in favor of saving some bits. This of course comes with all the usual warnings about optimizations.

Imagine Go had a native union type. How would the code be better? Rewrite the code with this:

// pretend this struct was a union
type MiscUnion struct {
  c *Comment
  pi *ProcessingInstruction
  ws *WhiteSpace
}

Even with a built-in union, accessing the members of MiscUnion requires a runtime check of some kind, so using an interface is no worse. Arguably the interface is superior, as the runtime type checking is built in (impossible to get wrong) and has really nice syntax for dealing with it.

One advantage of a union type is the static type check that makes sure only proper concrete types were put in a Misc. The Go way of solving this is "New..." functions, e.g. MiscComment, MiscProcessingInstruction, MiscWhiteSpace.
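
To make the comparison concrete, here is a sketch of the check that the pretend-union struct would need if written by hand. The kind method is hypothetical, and Chars is assumed to be a string because the question does not show its definition.

```go
package main

import (
	"errors"
	"fmt"
)

// Chars is assumed to be a string; the question does not show its definition.
type Chars string

type Comment Chars

type ProcessingInstruction struct {
	Target Chars
	Data   Chars
}

type WhiteSpace Chars

// MiscUnion is the pretend union from the answer: by convention exactly
// one of the three pointers is non-nil.
type MiscUnion struct {
	c  *Comment
	pi *ProcessingInstruction
	ws *WhiteSpace
}

// kind is a hypothetical hand-written runtime check. It reports which
// member is set and returns an error unless exactly one member is set.
func (m MiscUnion) kind() (string, error) {
	set := 0
	name := ""
	if m.c != nil {
		set++
		name = "comment"
	}
	if m.pi != nil {
		set++
		name = "processing instruction"
	}
	if m.ws != nil {
		set++
		name = "whitespace"
	}
	if set != 1 {
		return "", errors.New("MiscUnion must hold exactly one value")
	}
	return name, nil
}

func main() {
	c := Comment("note")
	k, err := MiscUnion{c: &c}.kind()
	fmt.Println(k, err)

	_, err = MiscUnion{}.kind()
	fmt.Println(err)
}
```

The struct version has to be kept consistent by hand and can end up empty or with two members set. An interface value always holds at most one dynamic type, and the type switch does the equivalent of kind for free.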

Here is a cleaned up example using interface{} and New* functions: http://play.golang.org/p/d5bC8mZAB_

huangapple
  • Posted on 2014-02-04 21:06:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/21553398.html