如何在golang中获取两个切片的交集?

huangapple go评论86阅读模式
英文:

How to get intersection of two slice in golang?

问题

有没有一种高效的方法在Go语言中获取两个切片的交集?

我想避免像嵌套循环那样的解决方案

slice1 := []string{"foo", "bar", "hello"}
slice2 := []string{"foo", "bar"}

intersection(slice1, slice2)
=> ["foo", "bar"]

字符串的顺序不重要

英文:

Is there any efficient way to get intersection of two slices in Go?

I want to avoid nested for loop like solution

slice1 := []string{"foo", "bar","hello"}
slice2 := []string{"foo", "bar"}

intersection(slice1, slice2)
=> ["foo", "bar"]

order of string does not matter

答案1

得分: 38

以下是要翻译的内容:

  • 简单交集:将A中的每个元素与B中的每个元素进行比较(O(n^2)
  • 哈希交集:将它们放入哈希表中(O(n)
  • 排序交集:对A进行排序并进行优化的交集操作(O(n*log(n))

这些方法在这里实现:

https://github.com/juliangruber/go-intersect/blob/master/intersect.go

英文:

https://stackoverflow.com/questions/13270491/how-do-i-get-the-intersection-between-two-arrays-as-a-new-array

  • Simple Intersection: Compare each element in A to each in B (O(n^2))
  • Hash Intersection: Put them into a hash table (O(n))
  • Sorted Intersection: Sort A and do an optimized intersection (O(n*log(n)))

All of which are implemented here

https://github.com/juliangruber/go-intersect/blob/master/intersect.go

答案2

得分: 5

简单、通用和多个切片!(Go 1.18)

时间复杂度:可能是线性的

func interSection[T constraints.Ordered](pS ...[]T) []T {
    hash := make(map[T]*int) // value, counter
    result := make([]T, 0)
    for _, slice := range pS {
        duplicationHash := make(map[T]bool) // duplication checking for individual slice
        for _, value := range slice {
            if _, isDup := duplicationHash[value]; !isDup { // is not duplicated in slice
                if counter := hash[value]; counter != nil { // is found in hash counter map
                    if *counter++; *counter >= len(pS) { // is found in every slice
                        result = append(result, value)
                    }
                } else { // not found in hash counter map
                    i := 1
                    hash[value] = &i
                }
                duplicationHash[value] = true
            }
        }
    }
    return result
}
func main() {
    slice1 := []string{"foo", "bar", "hello"}
    slice2 := []string{"foo", "bar"}
    fmt.Println(interSection(slice1, slice2))
    // [foo bar]

    ints1 := []int{1, 2, 3, 9, 8}
    ints2 := []int{10, 4, 2, 4, 8, 9} // have duplicated values
    ints3 := []int{2, 4, 8, 1}
    fmt.Println(interSection(ints1, ints2, ints3))
    // [2 8]
}

playground: https://go.dev/play/p/lE79D0kOznZ

请注意,这是一段使用Go语言编写的代码,用于计算多个切片的交集。它使用了哈希表和计数器来实现。在给定的切片中,如果一个元素在每个切片中都出现,则将其添加到结果切片中。这段代码可以在Go Playground上运行。

英文:

simple, generic and mutiple slices ! (Go 1.18)

Time Complexity : may be linear

func interSection[T constraints.Ordered](pS ...[]T) []T {
	hash := make(map[T]*int) // value, counter
	result := make([]T, 0)
	for _, slice := range pS {
		duplicationHash := make(map[T]bool) // duplication checking for individual slice
		for _, value := range slice {
			if _, isDup := duplicationHash[value]; !isDup { // is not duplicated in slice
				if counter := hash[value]; counter != nil { // is found in hash counter map
					if *counter++; *counter >= len(pS) { // is found in every slice
						result = append(result, value)
					}
				} else { // not found in hash counter map
					i := 1
					hash[value] = &i
				}
				duplicationHash[value] = true
			}
		}
	}
	return result
}
func main() {
	slice1 := []string{"foo", "bar", "hello"}
	slice2 := []string{"foo", "bar"}
	fmt.Println(interSection(slice1, slice2))
	// [foo bar]

	ints1 := []int{1, 2, 3, 9, 8}
	ints2 := []int{10, 4, 2, 4, 8, 9} // have duplicated values
	ints3 := []int{2, 4, 8, 1}
	fmt.Println(interSection(ints1, ints2, ints3))
	// [2 8]
}

playground : https://go.dev/play/p/lE79D0kOznZ

答案3

得分: 4

这是一个计算两个切片交集的最佳方法。时间复杂度非常低。

时间复杂度:O(m+n)

m = 第一个切片的长度。

n = 第二个切片的长度。

func intersection(s1, s2 []string) (inter []string) {
    hash := make(map[string]bool)
    for _, e := range s1 {
        hash[e] = true
    }
    for _, e := range s2 {
        // 如果元素存在于哈希表中,则将其添加到交集列表中。
        if hash[e] {
            inter = append(inter, e)
        }
    }
    // 从切片中删除重复项。
    inter = removeDups(inter)
    return
}

// 从切片中删除重复项。
func removeDups(elements []string)(nodups []string) {
    encountered := make(map[string]bool)
    for _, element := range elements {
        if !encountered[element] {
            nodups = append(nodups, element)
            encountered[element] = true
        }
    }
    return
}

希望对你有帮助!

英文:

It's a best method for intersection two slice. Time complexity is too low.

Time Complexity : O(m+n)

m = length of first slice.

n = length of second slice.

func intersection(s1, s2 []string) (inter []string) {
	hash := make(map[string]bool)
	for _, e := range s1 {
		hash[e] = true
	}
	for _, e := range s2 {
		// If elements present in the hashmap then append intersection list.
		if hash[e] {
			inter = append(inter, e)
		}
	}
	//Remove dups from slice.
	inter = removeDups(inter)
	return
}

//Remove dups from slice.
func removeDups(elements []string)(nodups []string) {
	encountered := make(map[string]bool)
	for _, element := range elements {
		if !encountered[element] {
			nodups = append(nodups, element)
			encountered[element] = true
		}
	}
	return
}

答案4

得分: 2

如果在你的[]string中不存在空白,也许你需要这段简单的代码:

func filter(src []string) (res []string) {
    for _, s := range src {
        newStr := strings.Join(res, " ")
        if !strings.Contains(newStr, s) {
            res = append(res, s)
        }
    }
    return
}

func intersections(section1, section2 []string) (intersection []string) {
    str1 := strings.Join(filter(section1), " ")
    for _, s := range filter(section2) {
        if strings.Contains(str1, s) {
            intersection = append(intersection, s)
        }
    }
    return
}
英文:

if there exists no blank in your []string, maybe you need this simple code:

func filter(src []string) (res []string) {
	for _, s := range src {
		newStr := strings.Join(res, " ")
		if !strings.Contains(newStr, s) {
			res = append(res, s)
		}
	}
	return
}

func intersections(section1, section2 []string) (intersection []string) {
	str1 := strings.Join(filter(section1), " ")
	for _, s := range filter(section2) {
		if strings.Contains(str1, s) {
			intersection = append(intersection, s)
		}
	}
	return
}

答案5

得分: 1

另一个时间复杂度为O(m+n)的解决方案,使用了哈希映射。与此处讨论的其他解决方案相比,它有两个不同之处:

  • 将目标切片作为参数传递,而不是返回一个新的切片
  • 对于常用类型(如字符串/整数),使用速度更快,而不是对所有类型进行反射操作

你可以在以下链接中找到代码实现的详细内容:
https://github.com/viant/toolbox/blob/a46fd679bbc5d07294b1d1b646aeacd44e2c7d50/collections.go#L869-L920

英文:

https://github.com/viant/toolbox/blob/a46fd679bbc5d07294b1d1b646aeacd44e2c7d50/collections.go#L869-L920

Another O(m+n) Time Complexity solution that uses a hashmap.
It has two differences compared to the other solutions discussed here.

  • Passing the target slice as a parameter instead of new slice returned
  • Faster to use for commonly used types like string/int instead of reflection for all

答案6

得分: 1

尝试一下

https://go.dev/play/p/eGGcyIlZD6y

first := []string{"one", "two", "three", "four"}
second := []string{"two", "four"}

result := intersection(first, second) // 或者 intersection(second, first)

func intersection(first, second []string) []string {
	out := []string{}
	bucket := map[string]bool{}

	for _, i := range first {
		for _, j := range second {
			if i == j && !bucket[i] {
				out = append(out, i)
				bucket[i] = true
			}
		}
	}

	return out
}
英文:

Try it

https://go.dev/play/p/eGGcyIlZD6y

first := []string{"one", "two", "three", "four"}
second := []string{"two", "four"}

result := intersection(first, second) // or intersection(second, first)

func intersection(first, second []string) []string {
	out := []string{}
	bucket := map[string]bool{}

	for _, i := range first {
		for _, j := range second {
			if i == j && !bucket[i] {
				out = append(out, i)
				bucket[i] = true
			}
		}
	}

	return out
}

答案7

得分: -8

是的,有几种不同的方法可以实现这个功能。这里有一个可以进行优化的示例。

package main

import "fmt"

func intersection(a []string, b []string) (inter []string) {
    // 首先交互较小的列表可能会更快...但差别不大,最坏情况下是相同的
    low, high := a, b
    if len(a) > len(b) {
        low = b
        high = a
    }

    done := false
    for i, l := range low {
        for j, h := range high {
            // 获取未来的索引值
            f1 := i + 1
            f2 := j + 1
            if l == h {
                inter = append(inter, h)
                if f1 < len(low) && f2 < len(high) {
                    // 如果未来的值不相同,那就是交集的结束
                    if low[f1] != high[f2] {
                        done = true
                    }
                }
                // 我们不想每次都遍历整个列表,所以删除已经遍历过的部分会使每次遍历更快
                high = high[:j+copy(high[j:], high[j+1:])]
                break
            }
        }
        // 没有未来的值,所以我们完成了
        if done {
            break
        }
    }
    return
}

func main() {
    slice1 := []string{"foo", "bar", "hello", "bar"}
    slice2 := []string{"foo", "bar"}
    fmt.Printf("%+v\n", intersection(slice1, slice2))
}

现在上面定义的intersection方法只能操作[]string类型的切片,就像你的示例一样。理论上,你可以创建一个定义如下的方法:func intersection(a []interface{}, b []interface{}) (inter []interface{}),但是这样你将依赖反射和类型转换来进行比较,这会增加延迟并使你的代码更难阅读。更容易维护和阅读的方法是为你关心的每种类型编写单独的函数。

func intersectionString(a []string, b []string) (inter []string)

func intersectionInt(a []int, b []int) (inter []int)

func intersectionFloat64(a []float64, b []float64) (inter []float64)

然后,你可以创建自己的包,并在确定了如何实现后进行重用。

package intersection

func String(a []string, b []string) (inter []string)

func Int(a []int, b []int) (inter []int)

func Float64(a []float64, b []float64) (inter []float64)

希望这可以帮助到你!

英文:

Yes there are a few different ways to go about it.. Here's an example that can be optimized.

package main

import &quot;fmt&quot;

func intersection(a []string, b []string) (inter []string) {
	// interacting on the smallest list first can potentailly be faster...but not by much, worse case is the same
	low, high := a, b
	if len(a) &gt; len(b) {
		low = b
		high = a
	}

	done := false
	for i, l := range low {
		for j, h := range high {
			// get future index values
			f1 := i + 1
			f2 := j + 1
			if l == h {
				inter = append(inter, h)
				if f1 &lt; len(low) &amp;&amp; f2 &lt; len(high) {
					// if the future values aren&#39;t the same then that&#39;s the end of the intersection
					if low[f1] != high[f2] {
						done = true
					}
				}
				// we don&#39;t want to interate on the entire list everytime, so remove the parts we already looped on will make it faster each pass
				high = high[:j+copy(high[j:], high[j+1:])]
				break
			}
		}
		// nothing in the future so we are done
		if done {
			break
		}
	}
	return
}

func main() {
	slice1 := []string{&quot;foo&quot;, &quot;bar&quot;, &quot;hello&quot;, &quot;bar&quot;}
	slice2 := []string{&quot;foo&quot;, &quot;bar&quot;}
	fmt.Printf(&quot;%+v\n&quot;, intersection(slice1, slice2))
}

Now the intersection method defined above will only operate on slices of strings, like your example.. You can in theory create a definition that looks like this func intersection(a []interface, b []interface) (inter []interface), however you would be relying on reflection and type casting so that you can compare, which will add latency and make your code harder to read. It's probably easier to maintain and read to write a separate function for each type you care about.

func intersectionString(a []string, b []string) (inter []string),

func intersectionInt(a []int, b []int) (inter []int),

func intersectionFloat64(a []Float64, b []Float64) (inter []Float64), ..ect

You can then create your own package and reuse once you settle how you want to implement it.

package intersection

func String(a []string, b []string) (inter []string)

func Int(a []int, b []int) (inter []int)

func Float64(a []Float64, b []Float64) (inter []Float64)

huangapple
  • 本文由 发表于 2017年7月7日 02:04:03
  • 转载请务必保留本文链接:https://go.coder-hub.com/44956031.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定