Go maps - pulling records based off key value. A question of search speed and behavior

Question

I'm trying to understand how Go maps work via a simple speed test.

Pulling ONLY the value stored under a key (the slice index) is really fast: about 50µs on average, according to my code below.

When I pull the entire record, which contains that unique key value and a handful of other keys, the time fluctuates around 500µs. I understand this is already fast, less than a millisecond. But why would pulling the entire record, if it's already indexed, take almost 10x as long?

Is it the extra keys in the record that have to be processed, which requires more run time?

Am I using maps wrong in my code? Is this something to be worried about with more data over time?

I even sort my slice of structs by the unique key that I'm searching for.

type TravelItenariesCSV struct {
	...
	FlightNum int `csv:"flight_num"`
	...
}

// PrintExecutionTime prints the time elapsed since t.
func PrintExecutionTime(t time.Time) {
	fmt.Println("Execution time: ", time.Since(t))
}

flightNum := vars["flight"] // unique int value that I need to find across all TravelItenariesCSV.FlightNum keys (assumed to already be an int here)

startTime := time.Now()
itenariesMap := map[int]int{}

for i, v := range s.TravelItenariesCSV { // s.TravelItenariesCSV is a slice of the struct above
	itenariesMap[v.FlightNum] = i // flight number -> index into the slice
}

if v, ok := itenariesMap[flightNum]; ok {
	w.WriteHeader(http.StatusOK)
	fmt.Fprintf(w, utils.PrettyPrint(s.TravelItenariesCSV[v])) // this takes ~500.0µs
	//fmt.Fprintf(w, utils.PrettyPrint(v)) // this takes ~50.0µs
}
defer utils.PrintExecutionTime(startTime)

Would appreciate clarity on the subject matter.

Answer 1

Score: 2

Your measurement of performance contains other elements:

if v, ok := itenariesMap[flightNum]; ok {
	w.WriteHeader(http.StatusOK)
	fmt.Fprintf(w, utils.PrettyPrint(s.TravelItenariesCSV[v])) // this takes ~500.0µs
	//fmt.Fprintf(w, utils.PrettyPrint(v)) // this takes ~50.0µs
}
defer utils.PrintExecutionTime(startTime)

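This does two things. First, fmt.Fprintf (together with utils.PrettyPrint) takes time: if formatting one value costs x, formatting a record with roughly ten times as many fields will cost roughly ten times as much. The map lookup itself is the same in both cases.

Second, when measuring throughput, do not use defer: a deferred call runs only when the surrounding function returns (hence the name), so the timer also covers everything executed before that return.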
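To illustrate the point about defer, here is a minimal sketch. The names measure and slowStep are made up for this example and do not come from the original code; slowStep simply sleeps to stand in for formatting and writing a response. The direct measurement is taken right after the first step, while the deferred one is taken when the function returns, so it also includes the second step.

```go
package main

import (
	"fmt"
	"time"
)

// slowStep is a hypothetical stand-in for expensive work such as
// formatting a record and writing it to the response.
func slowStep() {
	time.Sleep(5 * time.Millisecond)
}

// measure runs two steps and reports two elapsed times:
// direct is taken immediately after step one,
// deferred is taken by a deferred closure, which runs at return.
func measure(stepOne, stepTwo func()) (direct, deferred time.Duration) {
	start := time.Now()
	// The deferred closure executes after the return statement,
	// so it sees the time spent in both steps.
	defer func() { deferred = time.Since(start) }()

	stepOne()
	direct = time.Since(start) // covers step one only

	stepTwo()
	return
}

func main() {
	direct, deferred := measure(func() {}, slowStep)
	fmt.Println("measured right after step one:", direct)
	fmt.Println("measured by the deferred call:", deferred)
}
```

The deferred figure should come out larger by at least the sleep duration, which is the same effect that inflates the timing in the handler.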

So to measure the map performance you could do:

for i, v := range s.TravelItenariesCSV { // s.TravelItenariesCSV is a slice of the struct above
	itenariesMap[v.FlightNum] = i
}
utils.PrintExecutionTime(startTime) // called directly, so it times only the code above

// ... rest of the code

That should give you a better throughput estimate.
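As a rough sketch of how the three costs could be timed separately (building the index, looking up a key, and formatting the record), something like the following may help. The Itinerary struct, its Origin and Destination fields, and the prettyPrint helper are assumptions made for this example: only FlightNum appears in the original post, and json.MarshalIndent is used here merely as a stand-in for utils.PrettyPrint, whose implementation is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Itinerary is a hypothetical stand-in for TravelItenariesCSV.
// Only FlightNum comes from the original post.
type Itinerary struct {
	FlightNum   int    `json:"flight_num"`
	Origin      string `json:"origin"`
	Destination string `json:"destination"`
}

// buildIndex maps each flight number to its position in the slice.
func buildIndex(records []Itinerary) map[int]int {
	index := make(map[int]int, len(records))
	for i, r := range records {
		index[r.FlightNum] = i
	}
	return index
}

// prettyPrint is an assumed stand-in for utils.PrettyPrint.
func prettyPrint(v interface{}) string {
	b, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return err.Error()
	}
	return string(b)
}

func main() {
	// Synthetic data, generated only for the timing demonstration.
	records := make([]Itinerary, 100000)
	for i := range records {
		records[i] = Itinerary{FlightNum: 1000 + i, Origin: "AAA", Destination: "BBB"}
	}

	start := time.Now()
	index := buildIndex(records)
	fmt.Println("build index:", time.Since(start))

	start = time.Now()
	pos, ok := index[1000+4242]
	fmt.Println("lookup:", time.Since(start), "found:", ok)

	start = time.Now()
	out := prettyPrint(records[pos])
	fmt.Println("format record:", time.Since(start))
	fmt.Println(out)
}
```

Single-shot timings like these are noisy. For numbers you intend to rely on, Go's testing package benchmarks (run with go test -bench) repeat the operation many times and are likely a better fit. Note also that rebuilding the map on every request costs far more than the lookup, so building the index once and reusing it is usually the larger win.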

huangapple
  • Posted on July 19, 2022 at 03:43:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73027780.html