Which is faster in Go for finding the intersection of two arrays?

Question

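In Go, using a map is the faster way to check two arrays for common elements.

The first method stores the elements of the target array in a map, then iterates over the original array and looks each element up in the map. If a match is found, it returns true.

The second method uses two nested loops over the original and target arrays and compares every pair of elements. If a match is found, it returns true.

Overall, the first method runs in O(n) time for n elements in original, while the second runs in O(n·m) time for m elements in target, which is O(n^2) when the two are of similar size. The first method is therefore faster.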

Both original and target can be very large lists.

original := []string{"test", "test2", "test3"} // n amount of items

target := map[string]bool{
    "test": true,
    "test2": true,
}

for _, val := range original {
    if target[val] {
        return true
    }
}

OR

original := []string{"test", "test2", "test3"} // n amount of items
target := []string{"test", "test2"}

for _, i := range original {
    for _, x := range target {
        if i == x {
            return true
        }
    }
}

Answer 1

Score: 15

As was pointed out in the comments, you are not computing an intersection; you are checking whether any single element of original is present in target. That said, your first example is O(N), because the range loop is O(N) and each map lookup is O(1) on average. Your second example is O(N·M) for N items in original and M items in target, which is O(N^2) when the two are of similar size, because of the nested range loops. Even without benchmarking, I can tell you the first method will be far superior time-wise in the worst case.

I benchmarked it just to show. With 5000 items in original and 500 in target, I ran both functions above, testing once with all elements of target matching and once with no matching elements in target:

BenchmarkMapLookup	           50000	     39756 ns/op
BenchmarkNestedRange	         300	   4508598 ns/op
BenchmarkMapLookupNoMatch	   10000	    103441 ns/op
BenchmarkNestRangeNoMatch	     300	   4528756 ns/op
ok  	so	7.072s

This is the benchmarking code:

package main

import (
	"math/rand"
	"testing"
	"time"
)

var letters = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

func randSeq(n int) string {
	b := make([]rune, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

var (
	original         = []string{}
	target           = []string{}
	targetMap        = map[string]bool{}
	targetNoMatch    = []string{}
	targetMapNoMatch = map[string]bool{}
)

func init() {
	rand.Seed(time.Now().UTC().UnixNano())
	numItems := 5000
	for i := 0; i < numItems; i++ {
		original = append(original, randSeq(10))
	}

	i := rand.Intn(numItems)
	if i >= 4500 {
		i = 4499
	}
	stop := i + 500
	for ; i < stop; i++ {
		target = append(target, original[i])
		targetMap[original[i]] = true
		noMatch := randSeq(9)
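		// Grow targetNoMatch from itself so it holds only non-matching strings.
		// (Appending to target here would leave matching entries in the slice,
		// making the NoMatch nested-loop timing mirror the matching one.)
		targetNoMatch = append(targetNoMatch, noMatch)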
		targetMapNoMatch[noMatch] = true
	}

}

func ON(orig []string, tgt map[string]bool) bool {
	for _, val := range orig {
		if tgt[val] {
			return true
		}
	}
	return false
}

func ON2(orig, tgt []string) bool {
	for _, i := range orig {
		for _, x := range tgt {
			if i == x {
				return true
			}
		}
	}
	return false
}

func BenchmarkMapLookup(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ON(original, targetMap)
	}
}

func BenchmarkNestedRange(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ON2(original, target)
	}
}

func BenchmarkMapLookupNoMatch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ON(original, targetMapNoMatch)
	}
}

func BenchmarkNestRangeNoMatch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ON2(original, targetNoMatch)
	}
}
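
Note that the benchmark hands ON a map that was already built in init, so the cost of constructing the map is not part of the measured time. If target arrives as a slice, the map has to be built first, which adds an O(M) step. A minimal sketch of that variant follows; the names toSet and ContainsAny are hypothetical and not from the code above.

```go
package main

import "fmt"

// toSet copies the items of a slice into a set.
// Hypothetical helper for illustration.
func toSet(items []string) map[string]struct{} {
	set := make(map[string]struct{}, len(items))
	for _, s := range items {
		set[s] = struct{}{}
	}
	return set
}

// ContainsAny reports whether at least one element of orig is present in tgt.
// It builds the set on every call, so the total cost is O(N + M).
func ContainsAny(orig, tgt []string) bool {
	set := toSet(tgt)
	for _, v := range orig {
		if _, ok := set[v]; ok {
			return true
		}
	}
	return false
}

func main() {
	original := []string{"test", "test2", "test3"}
	fmt.Println(ContainsAny(original, []string{"test2"}))   // prints true
	fmt.Println(ContainsAny(original, []string{"missing"})) // prints false
}
```

If the same target is checked against many originals, building the set once and reusing it avoids paying the O(M) step on every call.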

huangapple
  • Posted on February 28, 2015, 05:26:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28774572.html