Why is my sorting function returning more values than its input?

Question

The code I have posted below is a minimal reproducible version, as I have been trying to isolate the problem. I am coming from Python and need to rewrite this script in Go for performance reasons, particularly to use parallelization, which I have removed from the example.

The problem is that I pass N values to the sorting function and get more than N values back. It creates a new slice on every iteration of the outer loop and seems to ignore the if !message1.Grouped condition. I do not have much experience with Go, and I have this working in Python. I assume it has something to do with message2.Grouped = true not being seen by the outer loop for some reason.

Ultimately, I am trying to ignore 'messages' that have already been grouped earlier in the loop.

Side note: I know the random number generation in this script is not working because I have not set a new seed, but that is beside the point and is not part of my actual script.

package main

import (
	"fmt"
	"math/rand"
)

type (
	BoolInt struct {
		Val int
		Grouped bool
	}
)


func sort_chunk_no_p(chunk []BoolInt) [][]BoolInt {
	COSINE_THRESHOLD := 0.90
	allGroups := [][]BoolInt{}
	for i, message1 := range chunk {
		if !message1.Grouped {
			message1.Grouped = true
			tempGroup := []BoolInt{message1}
			for _, message2 := range chunk[i+1:] {
				if !message2.Grouped {
					if rand.Float64() >= COSINE_THRESHOLD {
						message2.Grouped = true
						tempGroup = append(tempGroup, message2)
					}
				}
			}
			allGroups = append(allGroups, tempGroup)
		}
	}
	return allGroups
}

func main() {
	lo, hi := 1, 100
	allMessages := make([]BoolInt, hi-lo+1)
	for i := range allMessages {
		allMessages[i].Val = i + lo
		allMessages[i].Grouped = false
	}

	sorted_chunk := sort_chunk_no_p(allMessages)


	fmt.Println(sorted_chunk)
	sum := 0
	for _, res := range sorted_chunk {
		sum += len(res)
	}
	fmt.Println(sum)
}

Answer 1

Score: 2

When you iterate over a slice with for ... range, each element is copied into the loop variable (a single reused variable before Go 1.22, a new variable per iteration since then). Either way, modifying fields of this copy does not affect the elements in the slice.
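A minimal sketch of that copy behavior, separate from the question's code. The item type and the helper names (markViaRangeValue, markViaIndex, countGrouped) are made up for illustration and are not part of the original post:

```go
package main

import "fmt"

// item is a hypothetical stand-in for the question's BoolInt type.
type item struct {
	Val     int
	Grouped bool
}

// markViaRangeValue sets Grouped on the range value. That value is a
// copy of the element, so the slice itself is left unchanged.
func markViaRangeValue(items []item) {
	for _, it := range items {
		it.Grouped = true // modifies the copy only
		_ = it
	}
}

// markViaIndex sets Grouped through an index expression, which writes
// to the element stored in the slice.
func markViaIndex(items []item) {
	for i := range items {
		items[i].Grouped = true
	}
}

// countGrouped reports how many elements have Grouped set.
func countGrouped(items []item) int {
	n := 0
	for i := range items {
		if items[i].Grouped {
			n++
		}
	}
	return n
}

func main() {
	items := []item{{Val: 1}, {Val: 2}, {Val: 3}}

	markViaRangeValue(items)
	fmt.Println(countGrouped(items)) // prints 0

	markViaIndex(items)
	fmt.Println(countGrouped(items)) // prints 3
}
```

This is why the outer loop in the question never sees Grouped as true: every write went to a copy.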

Either store pointers in the slice (the elements are still copied, but they are now pointers to the same struct values), or modify elements through an index expression such as chunk[i].Grouped = true.

Using pointers, it would look like this:

func sort_chunk_no_p(chunk []*BoolInt) [][]*BoolInt {
	COSINE_THRESHOLD := 0.90
	allGroups := [][]*BoolInt{}
	for i, message1 := range chunk {
		if !message1.Grouped {
			message1.Grouped = true
			tempGroup := []*BoolInt{message1}
			for _, message2 := range chunk[i+1:] {
				if !message2.Grouped {
					if rand.Float64() >= COSINE_THRESHOLD {
						message2.Grouped = true
						tempGroup = append(tempGroup, message2)
					}
				}

			}
			allGroups = append(allGroups, tempGroup)
		}
	}
	return allGroups
}

And calling it:

allMessages := make([]*BoolInt, hi-lo+1)
for i := range allMessages {
	allMessages[i] = &BoolInt{Val: i + lo}
}

sorted_chunk := sort_chunk_no_p(allMessages)

Try it on the Go Playground.
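For comparison, the index-expression alternative mentioned above could be sketched as follows, keeping []BoolInt as the element type. This is only an illustration under assumptions: the name groupByIndex and its similar parameter are hypothetical, and similar replaces the question's rand.Float64() >= COSINE_THRESHOLD check so that the result is deterministic.

```go
package main

import "fmt"

type BoolInt struct {
	Val     int
	Grouped bool
}

// groupByIndex is a sketch of the index-expression fix. The similar
// parameter is a hypothetical stand-in for the question's random
// threshold check, used here to keep the output deterministic.
func groupByIndex(chunk []BoolInt, similar func(a, b BoolInt) bool) [][]BoolInt {
	allGroups := [][]BoolInt{}
	for i := range chunk {
		if chunk[i].Grouped {
			continue // already placed in an earlier group
		}
		chunk[i].Grouped = true // writes to the slice element itself
		group := []BoolInt{chunk[i]}
		for j := i + 1; j < len(chunk); j++ {
			if !chunk[j].Grouped && similar(chunk[i], chunk[j]) {
				chunk[j].Grouped = true
				group = append(group, chunk[j])
			}
		}
		allGroups = append(allGroups, group)
	}
	return allGroups
}

func main() {
	msgs := make([]BoolInt, 100)
	for i := range msgs {
		msgs[i].Val = i + 1
	}

	// Hypothetical similarity rule: same remainder modulo 10.
	sameBucket := func(a, b BoolInt) bool { return a.Val%10 == b.Val%10 }
	groups := groupByIndex(msgs, sameBucket)

	sum := 0
	for _, g := range groups {
		sum += len(g)
	}
	fmt.Println(len(groups), sum) // prints 10 100
}
```

With this version the total number of grouped values equals the input length. Note that the function marks elements in the caller's slice, because the slice passed in shares its backing array with the caller's.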

See related:

https://stackoverflow.com/questions/48826460/using-pointers-in-a-for-loop/48826629#48826629

https://stackoverflow.com/questions/44044245/register-multiple-routes-using-range-for-loop-slices-map/44045012#44045012

https://stackoverflow.com/questions/44715882/why-do-these-two-for-loop-variations-give-me-different-behavior/44716068#44716068

Posted by huangapple on 2022-11-16 23:28:58
Source: https://go.coder-hub.com/74463131.html