英文:
Which is faster in golang for finding intersection of two arrays?
问题
在golang中,使用map来查找两个数组的交集会更快一些。
第一个方法使用了一个map来存储目标数组的元素,然后遍历原始数组,在map中查找是否存在相同的元素。如果存在,则返回true。
第二个方法使用了两个嵌套的循环,分别遍历原始数组和目标数组,比较每个元素是否相同。如果存在相同的元素,则返回true。
总体而言,第一个方法的时间复杂度为O(n),而第二个方法的时间复杂度为O(n^2)。因此,第一个方法更快一些。
英文:
Which is faster in golang for finding intersection of two arrays?
Original can be a very large list, as can target
original := []string{"test", "test2", "test3"} // n amount of items
target := map[string]bool{
"test": true,
"test2": true,
}
for _, val := range original {
if target[val] {
return true
}
}
OR
original := []string{"test", "test2", "test3"} // n amount of items
target := []string{"test", "test2"}
for _, i := range original {
for _, x := range target {
if i == x {
return true
}
}
}
答案1
得分: 15
根据评论中指出的,你不是在寻找交集,而是在判断original中是否存在单个实体在target中。也就是说,你的第一个示例的时间复杂度是O(N),因为范围是O(N),而映射查找是O(1)。你的第二个示例的时间复杂度是O(N^2),因为有嵌套的范围循环。没有进行任何基准测试,但我可以告诉你,第一种方法在时间上会更优秀(在最坏情况下)。
我进行了基准测试来展示。在original中有5000个项,在target中有500个项,运行上述两个函数,并测试所有匹配和不匹配的元素:
BenchmarkMapLookup 50000 39756 ns/op
BenchmarkNestedRange 300 4508598 ns/op
BenchmarkMapLookupNoMatch 10000 103441 ns/op
BenchmarkNestRangeNoMatch 300 4528756 ns/op
ok so 7.072s
这是基准测试的代码:
package main
import (
"math/rand"
"testing"
"time"
)
var letters = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
func randSeq(n int) string {
b := make([]rune, n)
for i := range b {
b[i] = letters[rand.Intn(len(letters))]
}
return string(b)
}
var (
original = []string{}
target = []string{}
targetMap = map[string]bool{}
targetNoMatch = []string{}
targetMapNoMatch = map[string]bool{}
)
func init() {
rand.Seed(time.Now().UTC().UnixNano())
numItems := 5000
for i := 0; i < numItems; i++ {
original = append(original, randSeq(10))
}
i := rand.Intn(numItems)
if i >= 4500 {
i = 4499
}
stop := i + 500
for ; i < stop; i++ {
target = append(target, original[i])
targetMap[original[i]] = true
noMatch := randSeq(9)
targetNoMatch = append(target, noMatch)
targetMapNoMatch[noMatch] = true
}
}
func ON(orig []string, tgt map[string]bool) bool {
for _, val := range orig {
if tgt[val] {
return true
}
}
return false
}
func ON2(orig, tgt []string) bool {
for _, i := range orig {
for _, x := range tgt {
if i == x {
return true
}
}
}
return false
}
func BenchmarkMapLookup(b *testing.B) {
for i := 0; i < b.N; i++ {
ON(original, targetMap)
}
}
func BenchmarkNestedRange(b *testing.B) {
for i := 0; i < b.N; i++ {
ON2(original, target)
}
}
func BenchmarkMapLookupNoMatch(b *testing.B) {
for i := 0; i < b.N; i++ {
ON(original, targetMapNoMatch)
}
}
func BenchmarkNestRangeNoMatch(b *testing.B) {
for i := 0; i < b.N; i++ {
ON2(original, targetNoMatch)
}
}
英文:
As was pointed out in the comments, you are not finding an intersection rather you are finding if a single entity of original is present in target. That being said, your first example is O(N) because the range is O(N) and the map lookup is O(1). Your second example is O(N^2) because of the nested range loops. Without any benchmarking I can tell you the first method will be far superior time wise (in worst case.)
I benchmarked it just to show. With 5000 items in original, and 500 in target - running both functions above, and testing with all matching and no matching elements in target:
BenchmarkMapLookup 50000 39756 ns/op
BenchmarkNestedRange 300 4508598 ns/op
BenchmarkMapLookupNoMatch 10000 103441 ns/op
BenchmarkNestRangeNoMatch 300 4528756 ns/op
ok so 7.072s
This is the benchmarking code:
package main
import (
"math/rand"
"testing"
"time"
)
var letters = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
func randSeq(n int) string {
b := make([]rune, n)
for i := range b {
b[i] = letters[rand.Intn(len(letters))]
}
return string(b)
}
var (
original = []string{}
target = []string{}
targetMap = map[string]bool{}
targetNoMatch = []string{}
targetMapNoMatch = map[string]bool{}
)
func init() {
rand.Seed(time.Now().UTC().UnixNano())
numItems := 5000
for i := 0; i < numItems; i++ {
original = append(original, randSeq(10))
}
i := rand.Intn(numItems)
if i >= 4500 {
i = 4499
}
stop := i + 500
for ; i < stop; i++ {
target = append(target, original[i])
targetMap[original[i]] = true
noMatch := randSeq(9)
targetNoMatch = append(target, noMatch)
targetMapNoMatch[noMatch] = true
}
}
func ON(orig []string, tgt map[string]bool) bool {
for _, val := range orig {
if tgt[val] {
return true
}
}
return false
}
func ON2(orig, tgt []string) bool {
for _, i := range orig {
for _, x := range tgt {
if i == x {
return true
}
}
}
return false
}
func BenchmarkMapLookup(b *testing.B) {
for i := 0; i < b.N; i++ {
ON(original, targetMap)
}
}
func BenchmarkNestedRange(b *testing.B) {
for i := 0; i < b.N; i++ {
ON2(original, target)
}
}
func BenchmarkMapLookupNoMatch(b *testing.B) {
for i := 0; i < b.N; i++ {
ON(original, targetMapNoMatch)
}
}
func BenchmarkNestRangeNoMatch(b *testing.B) {
for i := 0; i < b.N; i++ {
ON2(original, targetNoMatch)
}
}
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。


评论