Is there a way to order types quickly in Go?

Question

I have a package that has to order Go types quickly.

Currently I take the reflect.Type values, get their names with Name(), and order the names as strings:

if type1.Name() < type2.Name() { ...

However, this relies on string comparison. It works, but I am looking for a faster solution.

How exactly the comparison works is not important to me. I only need two things:

  1. The result of a comparison must stay the same for the lifetime of the process.
  2. Different types must compare as unequal, and the same type must compare as equal.

Comparing the reflect.Type variables directly with < doesn't work, as this operation is not defined for reflect.Types in Go.

It would be possible to hash the type names to 64- or 128-bit integers and then compare those integers. That is an option, but I am looking for an even faster solution.

Another possibility would be to compare the reflect.Type variables by their unsafe pointers (cast to int64), but as far as I know, Go does not guarantee that unsafe pointer addresses stay the same for the lifetime of the process. I would therefore lose requirement (1).

Answer 1

Score: 2

Using Type.Name() is bad

Using Type.Name() for comparison is a really bad idea, because there are many unnamed types for which Type.Name() returns the empty string, such as []int, *int etc. Also, the name returned by Type.Name() does not include the package name, so for example the names of time.Time and foo.Time are both "Time", and the two types would be considered equal. That alone rules it out. Check out https://stackoverflow.com/questions/36310538/identify-non-builtin-types-using-reflect/37292523#37292523 for more details.
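The two problems above can be reproduced with a few lines. This is only an illustrative sketch: the local type Time and the helpers describe and sameName are made up for the example and stand in for the foo.Time case.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// Time is a hypothetical local type that shares its bare name with time.Time.
type Time struct{}

// describe returns Name() and String() for the dynamic type of v.
func describe(v interface{}) (name, str string) {
	t := reflect.TypeOf(v)
	return t.Name(), t.String()
}

// sameName reports whether time.Time and the local Time type have the
// same Name() even though they are different types.
func sameName() bool {
	a, b := reflect.TypeOf(time.Time{}), reflect.TypeOf(Time{})
	return a.Name() == b.Name() && a != b
}

func main() {
	// []int and *int have no name, so Name() is "" for both.
	for _, v := range []interface{}{[]int{}, new(int), time.Time{}, Time{}} {
		name, str := describe(v)
		fmt.Printf("Name()=%q String()=%q\n", name, str)
	}
	fmt.Println(sameName()) // true
}
```

Ordering by Name() would therefore treat []int and *int as equal, and likewise time.Time and the local Time.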

With an internal registry

An easy and fast way is to assign a distinct integer (e.g. an int value) to each type that needs to be compared. You then need neither string comparison nor hashing; you just compare the assigned int values.

The only question left is how to assign and remember these int values. The assigned values may simply be consecutive, starting with 1, then 2, 3 and so on. The assignment and lookup can be automatic (hidden from the caller).

Essentially we want a function something like this:

// Compare compares 2 types, returns a negative number
// if t1 < t2, a positive number if t1 > t2, and 0 otherwise.
func Compare(t1, t2 reflect.Type) int {
}

And the implementation is very simple / compact:

var (
	registry = map[reflect.Type]int{}
	max      int
)

func value(t reflect.Type) int {
	v := registry[t]
	if v == 0 {
		max++
		v, registry[t] = max, max
	}
	return v
}

// Compare compares 2 types, returns a negative number
// if t1 < t2, a positive number if t1 > t2, and 0 otherwise.
func Compare(t1, t2 reflect.Type) int {
	v1, v2 := value(t1), value(t2)
	return v1 - v2
}

Testing it:

tstring := reflect.TypeOf("")
tint := reflect.TypeOf(int(1))
ttime := reflect.TypeOf(time.Time{})

fmt.Println(Compare(tstring, tstring)) // 0
fmt.Println(Compare(tint, ttime))      // -1, tint gets registered first
fmt.Println(Compare(ttime, tint))      // 1
fmt.Println(Compare(tint, tint))       // 0
fmt.Println(Compare(tstring, ttime))   // -2
fmt.Println(Compare(ttime, tstring))   // 2

Output:

0
-1
1
0
-2
2
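Since the original goal was to order types, here is a rough sketch of how a Compare() like the one above could drive sort.Slice. It repeats the non-concurrent registry so that it runs on its own. The names lastValue and sortedNames are choices made for this example, and the three types are registered up front only to make the resulting order predictable.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"time"
)

var (
	registry  = map[reflect.Type]int{}
	lastValue int // last value handed out (plays the role of max above)
)

// value returns the integer assigned to t, assigning the next one on first use.
// Not safe for concurrent use.
func value(t reflect.Type) int {
	v := registry[t]
	if v == 0 {
		lastValue++
		v, registry[t] = lastValue, lastValue
	}
	return v
}

// Compare returns a negative number if t1 < t2, a positive number
// if t1 > t2, and 0 otherwise.
func Compare(t1, t2 reflect.Type) int {
	return value(t1) - value(t2)
}

// sortedNames sorts a shuffled slice of types with Compare and
// returns the String() of each type in sorted order.
func sortedNames() []string {
	tstring := reflect.TypeOf("")
	tint := reflect.TypeOf(0)
	ttime := reflect.TypeOf(time.Time{})
	// Register in a known order: string=1, int=2, time.Time=3.
	value(tstring)
	value(tint)
	value(ttime)

	types := []reflect.Type{ttime, tstring, tint}
	sort.Slice(types, func(i, j int) bool {
		return Compare(types[i], types[j]) < 0
	})
	names := make([]string, len(types))
	for i, t := range types {
		names[i] = t.String()
	}
	return names
}

func main() {
	fmt.Println(sortedNames()) // [string int time.Time]
}
```

Without the up-front registration the order would still be consistent within one process, but it would depend on which types the sort happens to compare first.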

Making it safe for concurrent use

One weakness though: the implementation is not safe for concurrent use (by multiple goroutines). If we need safety for concurrent use, access to the registry map (and to the max variable) must be synchronized:

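var mu sync.RWMutex

func value(t reflect.Type) int {
    mu.RLock()
    v := registry[t]
    mu.RUnlock()
    if v == 0 {
        mu.Lock()
        // Re-check under the write lock: another goroutine may have
        // registered t between RUnlock() and Lock().
        if v = registry[t]; v == 0 {
            max++
            v, registry[t] = max, max
        }
        mu.Unlock()
    }
    return v
}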
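As an alternative to the mutex, and not part of the original answer, the same registry could be sketched with sync.Map and an atomic counter. sync.Map is documented as suited to caches that only grow, which matches this use. The helper distinctValues exists only to exercise the code from several goroutines. In this variant the assigned values are unique but may have gaps if two goroutines race on a new type.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
	"sync/atomic"
)

var (
	registry sync.Map // maps reflect.Type to its assigned int
	counter  int64    // last value handed out, updated atomically
)

// value returns the integer assigned to t, assigning a new one on first use.
// If two goroutines race on the same new type, LoadOrStore keeps exactly one
// value and the other counter value is simply skipped.
func value(t reflect.Type) int {
	if v, ok := registry.Load(t); ok {
		return v.(int)
	}
	v, _ := registry.LoadOrStore(t, int(atomic.AddInt64(&counter, 1)))
	return v.(int)
}

// distinctValues calls value for the type of sample from n goroutines
// and returns how many distinct results were observed.
func distinctValues(sample interface{}, n int) int {
	t := reflect.TypeOf(sample)
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = value(t)
		}(i)
	}
	wg.Wait()
	seen := map[int]bool{}
	for _, v := range results {
		seen[v] = true
	}
	return len(seen)
}

func main() {
	fmt.Println(distinctValues("", 100)) // 1
}
```

Every goroutine sees the same value for the same type, which is all the ordering needs.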

The rest is unchanged, and the output is the same.

Under the hood (looking closer)

Compare() now takes a lock under the hood. In addition, to get the integer assigned to a type, we have to index a map. The key type of this map is reflect.Type, which is an interface type. The current implementation of reflect.Type is *reflect.rtype, which is a pointer, so the lookup is still fast enough: only a pointer is hashed and then looked up in a bucket (the built-in map is a hash map implementation).
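The registry relies on reflect.Type values being comparable with == and therefore usable as map keys, which the reflect package documents. A small sketch of that property follows; sameType and lookup are helper names invented for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// sameType reports whether a and b have identical dynamic types.
func sameType(a, b interface{}) bool {
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

// lookup stores a label under the type of key and reads it back
// through the type of probe.
func lookup(key, probe interface{}) (string, bool) {
	m := map[reflect.Type]string{reflect.TypeOf(key): "registered"}
	label, ok := m[reflect.TypeOf(probe)]
	return label, ok
}

func main() {
	fmt.Println(sameType(1, 2))                       // true
	fmt.Println(sameType(1, int64(1)))                // false
	fmt.Println(sameType([]int{}, []int{1}))          // true
	fmt.Println(lookup([]string{"a"}, []string(nil))) // registered true
}
```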

The fastest solution (getting rid of locks and map indexing)

Compare() takes a lock and indexes a map on every call, which may still be an unacceptable "burden" if you really seek the fastest solution. You can avoid paying that cost on every comparison if you obtain the internal int value assigned to a type once and then use that value directly when comparing against another type's value.

For this, all you need to do is "publish" (export) the value() function:

func Value(t reflect.Type) int {
    // ...
}

Compare() may then even be removed. Libraries that need this kind of type comparison can query the integer value assigned to a type and "cache" it, so they never have to call Value() for that type again (the value won't change during the lifetime of the process). A cached value can be compared directly with any other value obtained from Value().

For example:

vstring := Value(reflect.TypeOf(""))
vint := Value(reflect.TypeOf(int(1)))
vtime := Value(reflect.TypeOf(time.Time{}))

fmt.Println(vstring - vstring) // 0
fmt.Println(vint - vtime)      // -1, int gets registered before time.Time
fmt.Println(vtime - vint)      // 1
fmt.Println(vint - vint)       // 0
fmt.Println(vstring - vtime)   // -2
fmt.Println(vtime - vstring)   // 2
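One way a library might cache the value is to store it next to the type when a record is created, so that later comparisons are plain int comparisons with no lock and no map access. The sketch below assumes a hypothetical entry struct and a simple mutex-protected Value(); neither is prescribed by the answer.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"sync"
	"time"
)

var (
	mu        sync.Mutex
	registry  = map[reflect.Type]int{}
	lastValue int
)

// Value returns the integer assigned to t, assigning the next one on first use.
func Value(t reflect.Type) int {
	mu.Lock()
	defer mu.Unlock()
	v := registry[t]
	if v == 0 {
		lastValue++
		v = lastValue
		registry[t] = v
	}
	return v
}

// entry is a hypothetical record that caches the value of its type.
type entry struct {
	typ reflect.Type
	key int // result of Value(typ), looked up once
}

// newEntry looks up the value for the type of sample exactly once.
func newEntry(sample interface{}) entry {
	t := reflect.TypeOf(sample)
	return entry{typ: t, key: Value(t)}
}

// sortedNames builds three entries, shuffles them, and sorts them using
// only the cached keys.
func sortedNames() []string {
	es := make([]entry, 0, 3)
	es = append(es, newEntry(""))          // string is registered first
	es = append(es, newEntry(0))           // then int
	es = append(es, newEntry(time.Time{})) // then time.Time
	es[0], es[2] = es[2], es[0]            // put them out of order

	// No lock and no map lookup here, just int comparisons.
	sort.Slice(es, func(i, j int) bool { return es[i].key < es[j].key })

	names := make([]string, len(es))
	for i, e := range es {
		names[i] = e.typ.String()
	}
	return names
}

func main() {
	fmt.Println(sortedNames()) // [string int time.Time]
}
```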

huangapple
  • Posted on September 3, 2017, 00:48:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/46016043.html