`sort.Slice` order is non-deterministic

Question

I'm trying to use sort.Slice from the Go standard library to sort a slice of strings. I want them sorted alphabetically, except I want the empty string to appear after all other strings (hence I can't just use sort.Strings).

For the less function, I thought this would work:

func(i, j int) bool {
    return s[j] == "" || s[i] < s[j]
}

However, I seem to be getting random answers depending on what the input order is. Here's a MWE:

package main

import (
	"fmt"
	"math/rand"
	"sort"
	"time"
)

func main() {
	s := []string{"", "foo", "bar", "baz"}

	rand.Seed(time.Now().Unix())
	rand.Shuffle(len(s), func(i, j int) {
		s[i], s[j] = s[j], s[i]
	})
	fmt.Printf("%q\n", s)

	sort.Slice(s, func(i, j int) bool {
		return s[j] == "" || s[i] < s[j]
	})
	fmt.Printf("%q\n", s)
}

and here's the output from running that a few times:

$ go run ./z
["" "foo" "baz" "bar"]
["bar" "baz" "foo" ""]
$ go run ./z
["baz" "" "foo" "bar"]
["bar" "" "baz" "foo"]
$ go run ./z
["bar" "foo" "" "baz"]
["" "bar" "baz" "foo"]
$ go run ./z
["bar" "foo" "baz" ""]
["" "bar" "baz" "foo"]

Answer 1

Score: 6

This is because your less() function isn't saying what you want it to say.

You said you want empty strings to be sorted after all non-empty strings. Your logic:

return s[j] == "" || s[i] < s[j]

This says that if the second string is "", the first is less. That is more or less correct (except when both are empty, where "is-less" does not really hold: they are equal). But what if the first is "" and the second isn't? Then your function should return false, but instead it returns s[i] < s[j]. Since the second isn't empty, this is true, telling the sort that "" is less than the other string, which is exactly the opposite of what you want. So for an empty and a non-empty string, the function reports each one as less than the other, and the resulting order depends on which comparisons the sort happens to make.
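To make the contradiction concrete, here is a minimal sketch that calls the question's comparison directly on two strings. The name brokenLess is made up for this illustration; it takes the strings as arguments instead of indexing into s, but the logic is the same.

```go
package main

import "fmt"

// brokenLess mirrors the less function from the question, but takes the two
// strings directly so it can be checked in isolation. (Hypothetical helper.)
func brokenLess(a, b string) bool {
	return b == "" || a < b
}

func main() {
	// A consistent "less" must never claim both a < b and b < a.
	fmt.Println(brokenLess("", "foo")) // true: "" sorts before "foo" byte-wise
	fmt.Println(brokenLess("foo", "")) // true: the second argument is empty
}
```

Both calls print true, so the function claims that each string is less than the other.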

The correct "is-less" relation is like this:

sort.Slice(s, func(i, j int) bool {
	if s[j] == "" && s[i] != "" {
		return true
	}
	if s[i] == "" && s[j] != "" {
		return false
	}
	return s[i] < s[j]
})

If only the second is "", you want the first to be less. If only the first is empty, you want it to be "not less". Otherwise use the normal order (which is byte-wise).

Try it on the Go Playground.

Note that if both the first and second values are empty, this function returns false, because "" is not less than "" (they are equal). This is the proper value to return. Returning true here would still result in the correct order (swapping two empty elements changes nothing), but returning false may result in fewer swaps.

Transforming the logic using XOR

Note that the custom logic deviates from the normal order only if exactly one of the strings is empty. This is the logical XOR (exclusive OR) relation: a XOR b is true if only a or only b is true. Go has no logical XOR operator, but for bool values a XOR b is equivalent to a != b.

If exactly one empty string is "detected", the result is true if the second one is the empty one (else false). So we can apply this equivalence to our logic:

sort.Slice(s, func(i, j int) bool {
	// Move empty elements to the end:
	if (s[i] == "") != (s[j] == "") { // If only one is empty
		return s[j] == ""
	}
	return s[i] < s[j]
})

This is shorter and probably more efficient, but as you can see, it's harder to understand. Use this only if performance does matter. Try this one on the Go Playground.
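If you want some reassurance that the two versions really are equivalent, a small exhaustive check over every ordered pair of a sample set is one way to get it. This is only a sketch: explicitLess, xorLess and agree are names made up here, and the sample values are the ones from the question.

```go
package main

import "fmt"

// explicitLess is the two-if version of the comparison.
func explicitLess(a, b string) bool {
	if b == "" && a != "" {
		return true
	}
	if a == "" && b != "" {
		return false
	}
	return a < b
}

// xorLess is the version that uses != on bools as a logical XOR.
func xorLess(a, b string) bool {
	if (a == "") != (b == "") { // exactly one of them is empty
		return b == ""
	}
	return a < b
}

// agree reports whether both functions return the same result for every
// ordered pair (a, b) drawn from values, including a == b.
func agree(values []string) bool {
	for _, a := range values {
		for _, b := range values {
			if explicitLess(a, b) != xorLess(a, b) {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(agree([]string{"", "bar", "baz", "foo"})) // true
}
```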

huangapple
  • Posted on November 25, 2022 at 19:22:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/74572074.html