`sort.Slice` order is non-deterministic
Question
I'm trying to use `sort.Slice` from the Go standard library to sort a slice of strings. I want them sorted alphabetically, except I want the empty string to appear after all other strings (hence I can't just use `sort.Strings`).

For the less function, I thought this would work:

    func(i, j int) bool {
        return s[j] == "" || s[i] < s[j]
    }
However, I seem to be getting random answers depending on what the input order is. Here's an MWE:

    package main

    import (
        "fmt"
        "math/rand"
        "sort"
        "time"
    )

    func main() {
        s := []string{"", "foo", "bar", "baz"}
        rand.Seed(time.Now().Unix())
        rand.Shuffle(len(s), func(i, j int) {
            s[i], s[j] = s[j], s[i]
        })
        fmt.Printf("%q\n", s)
        sort.Slice(s, func(i, j int) bool {
            return s[j] == "" || s[i] < s[j]
        })
        fmt.Printf("%q\n", s)
    }
And here's the output from running that a few times (each run prints the shuffled input, then the result):

    $ go run ./z
    ["" "foo" "baz" "bar"]
    ["bar" "baz" "foo" ""]
    $ go run ./z
    ["baz" "" "foo" "bar"]
    ["bar" "" "baz" "foo"]
    $ go run ./z
    ["bar" "foo" "" "baz"]
    ["" "bar" "baz" "foo"]
    $ go run ./z
    ["bar" "foo" "baz" ""]
    ["" "bar" "baz" "foo"]
Answer 1
Score: 6
This is because your `less()` function isn't saying what you want it to say.

You said you want empty strings to be sorted after all non-empty strings. Your logic:

    return s[j] == "" || s[i] < s[j]
This does say that if the second string is `""`, then the first is less. That is more or less correct (except when both are empty, where "is-less" does not really hold: they are equal). But what if the first is `""` and the second isn't? Then your function should return `false`, but instead it returns `s[i] < s[j]`. Since the second string isn't empty, this is `true`, claiming that `""` is less than the other string, which is exactly the opposite of what you want.
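To make the contradiction concrete, here is a minimal sketch that calls the question's comparison in both directions for the same pair. The name `brokenLess` is hypothetical, and the function takes two string values instead of slice indices so it can be called directly; the logic is otherwise the one from the question.

```go
package main

import "fmt"

// brokenLess mirrors the comparison from the question, rewritten over two
// string values (a hypothetical helper, not part of the original code).
func brokenLess(a, b string) bool {
	return b == "" || a < b
}

func main() {
	// A consistent "less" can never hold in both directions for one pair.
	fmt.Println(brokenLess("", "foo")) // true: "" < "foo" byte-wise
	fmt.Println(brokenLess("foo", "")) // true: the second argument is empty
}
```

Both calls report `true`, so the sort is told that each element belongs before the other, and the final position of the empty string then depends on which comparisons happen to be made for a given input order.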
The correct "is-less" relation is like this:

    sort.Slice(s, func(i, j int) bool {
        if s[j] == "" && s[i] != "" {
            return true
        }
        if s[i] == "" && s[j] != "" {
            return false
        }
        return s[i] < s[j]
    })
If only the second string is `""`, you want the first to be less. If only the first is empty, you want it to be "not less". Otherwise use the normal order (which is byte-wise). Try it on the Go Playground.
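As a self-contained check, the following sketch applies that comparison to the four shuffled inputs shown in the question. The helper names `emptyLastLess` and `sortEmptyLast` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// emptyLastLess is a hypothetical helper: empty strings order after all
// non-empty strings; everything else uses the normal byte-wise order.
func emptyLastLess(a, b string) bool {
	if b == "" && a != "" {
		return true
	}
	if a == "" && b != "" {
		return false
	}
	return a < b
}

// sortEmptyLast (also hypothetical) sorts s in place and returns it.
func sortEmptyLast(s []string) []string {
	sort.Slice(s, func(i, j int) bool {
		return emptyLastLess(s[i], s[j])
	})
	return s
}

func main() {
	// The shuffled inputs printed in the question's output.
	inputs := [][]string{
		{"", "foo", "baz", "bar"},
		{"baz", "", "foo", "bar"},
		{"bar", "foo", "", "baz"},
		{"bar", "foo", "baz", ""},
	}
	for _, in := range inputs {
		fmt.Printf("%q\n", sortEmptyLast(in))
	}
}
```

Every input should come out as `["bar" "baz" "foo" ""]`, regardless of the starting order.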
Note that if both the first and the second value are empty, this function returns `false`, because `""` is not less than `""` (they are equal). This is the proper value to return. Returning `true` here would still produce the correct order (swapping two empty elements changes nothing), but returning `false` may result in fewer swaps.
Transforming the logic using XOR
Note that the custom logic deviates from the normal order only if exactly one of the strings is empty. This is the logical XOR (exclusive OR) relation: `a XOR b` is `true` if only `a` or only `b` is `true`. Go has no logical XOR operator, but for boolean operands `a XOR b` is equivalent to `a != b`.
If exactly one empty string is "detected", the result is `true` if the second string is the empty one (otherwise `false`). So we can apply this equivalent transformation to our logic:

    sort.Slice(s, func(i, j int) bool {
        // Move empty elements to the end:
        if (s[i] == "") != (s[j] == "") { // If only one is empty
            return s[j] == ""
        }
        return s[i] < s[j]
    })
This is shorter and probably more efficient, but as you can see, it's harder to understand. Use this only if performance does matter. Try this one on the Go Playground.
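One way to gain confidence that the XOR form is the same relation as the explicit one is to compare the two over every ordered pair of a small sample. This is only a sketch: the names `explicitLess`, `xorLess`, and `agree` are hypothetical, and both comparisons are rewritten over string values rather than slice indices.

```go
package main

import "fmt"

// explicitLess is the two-branch version of the comparison.
func explicitLess(a, b string) bool {
	if b == "" && a != "" {
		return true
	}
	if a == "" && b != "" {
		return false
	}
	return a < b
}

// xorLess is the shorter version that uses != as a boolean XOR.
func xorLess(a, b string) bool {
	if (a == "") != (b == "") { // exactly one of the two is empty
		return b == ""
	}
	return a < b
}

// agree reports whether both comparisons return the same result for
// every ordered pair drawn from values.
func agree(values []string) bool {
	for _, a := range values {
		for _, b := range values {
			if explicitLess(a, b) != xorLess(a, b) {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(agree([]string{"", "bar", "baz", "foo"}))
}
```

The sample includes the empty string, so all four cases (both empty, only the first, only the second, neither) are exercised, and the program prints `true`.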