How do I initialize test data for a benchmark test in Go?

Question

While writing a benchmark test for my algorithm, I ran into a confusing problem.

The full test code is on GitHub; I have copied it here and added some comments.

https://github.com/hidstarshine/Algorithm/blob/master/leet/problem24_test.go

var TDBenchmarkSwapPairs1 *leet.ListNode

// This function may not be a good idea; should it be init() instead?
func FTDBenchmarkSwapPairs1() {
	TDBenchmarkSwapPairs1 = &leet.ListNode{
		Val:  0,
		Next: nil,
	}
	changeNode := TDBenchmarkSwapPairs1
	for i := 1; i < 100; i++ {
		changeNode.Next = &leet.ListNode{
			Val:  i,
			Next: nil,
		}
		changeNode = changeNode.Next
	}
}

func BenchmarkSwapPairs1(b *testing.B) {
	FTDBenchmarkSwapPairs1() // problem is here
	for i := 0; i < b.N; i++ {
		leet.SwapPairs1(TDBenchmarkSwapPairs1)
	}
}

On the line marked as the problem, I call FTDBenchmarkSwapPairs1 (FTD stands for "fill test data") to initialize the data.

Then something surprising happens: BenchmarkSwapPairs1 seems to run in many goroutines.

So the concurrency introduces a data race, and because of the particular logic of SwapPairs1, debugging turned into a mess.
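
A note on the "many goroutines" impression: the testing package invokes a benchmark function several times with an increasing b.N until the run lasts long enough to be timed reliably, so any setup at the top of the function runs once per invocation. For an ordinary (non-parallel) benchmark these invocations happen one after another. The sketch below records b.N on every invocation; it uses testing.Benchmark so that it can run as a standalone program, and countCalls is a name invented here for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// countCalls (hypothetical helper) runs a trivial benchmark and returns the
// b.N value observed on each invocation of the benchmark function.
func countCalls() []int {
	testing.Init() // registers the testing flags; needed outside "go test"
	var seen []int
	testing.Benchmark(func(b *testing.B) {
		// Anything placed here, such as test-data setup, runs once per invocation.
		seen = append(seen, b.N)
		sum := 0
		for i := 0; i < b.N; i++ {
			sum += i
		}
		_ = sum
	})
	return seen
}

func main() {
	// The slice has more than one entry, and the entries grow.
	fmt.Println(countCalls())
}
```

Seeing the setup output repeated is therefore expected and does not by itself indicate a data race.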

SwapPairs1 changes the Next pointers of the ListNode values.

I then tried moving the FTDBenchmarkSwapPairs1 call into the body of the for loop to solve this.

But the data race still seems to be there, and the benchmark becomes meaningless because it now includes the initialization time.
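
If every iteration really needs fresh input, one option is to pause the timer around the per-iteration setup with b.StopTimer and b.StartTimer, so that the setup is excluded from the measurement. This is only a sketch under assumptions: ListNode, newList and swapPairs are stand-ins written here for leet.ListNode, the data-filling function and leet.SwapPairs1, whose real source is not shown. The iteration count is pinned through the test.benchtime flag purely to keep the standalone run short; under go test you would pass -benchtime on the command line instead.

```go
package main

import (
	"flag"
	"fmt"
	"testing"
)

// ListNode stands in for leet.ListNode (assumed to be the usual LeetCode shape).
type ListNode struct {
	Val  int
	Next *ListNode
}

// newList (hypothetical helper) builds 0 -> 1 -> ... -> n-1.
func newList(n int) *ListNode {
	head := &ListNode{}
	cur := head
	for i := 1; i < n; i++ {
		cur.Next = &ListNode{Val: i}
		cur = cur.Next
	}
	return head
}

// swapPairs stands in for leet.SwapPairs1: it swaps adjacent nodes in place
// and returns the new head.
func swapPairs(head *ListNode) *ListNode {
	dummy := &ListNode{Next: head}
	prev := dummy
	for prev.Next != nil && prev.Next.Next != nil {
		a, b := prev.Next, prev.Next.Next
		a.Next, b.Next, prev.Next = b.Next, a, b
		prev = a
	}
	return dummy.Next
}

// benchFreshInput rebuilds the list for every iteration with the timer paused.
func benchFreshInput(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b.StopTimer() // setup below is not timed
		head := newList(100)
		b.StartTimer()
		swapPairs(head)
	}
}

// runBench (hypothetical helper) runs the benchmark for exactly 200 iterations.
func runBench() testing.BenchmarkResult {
	testing.Init() // registers the testing flags; needed outside "go test"
	if err := flag.Set("test.benchtime", "200x"); err != nil {
		panic(err)
	}
	return testing.Benchmark(benchFreshInput)
}

func main() {
	fmt.Println(runBench())
}
```

Stopping and restarting the timer has a cost of its own, so for very cheap operations this pattern can make the run noticeably slower in wall-clock terms. When the input can be reused, building it once and resetting the timer is usually simpler.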

I submitted the algorithm on LeetCode and it was accepted!

Q: How can I solve this elegantly? I need a good idea!


NEW @Jimb

I just added some debug output, and then it panicked. At first I also thought there would be no data race.

I made that assumption when I saw the panic!

package leet_test

import (
	"fmt"
	"testing"

	"github.com/hidstarshine/Algorithm/leet"
)

var TDBenchmarkSwapPairs1 *leet.ListNode

func FTDBenchmarkSwapPairs1() {
	TDBenchmarkSwapPairs1 = &leet.ListNode{
		Val:  0,
		Next: nil,
	}
	changeNode := TDBenchmarkSwapPairs1
	for i := 1; i < 100; i++ {
		changeNode.Next = &leet.ListNode{
			Val:  i,
			Next: nil,
		}
		changeNode = changeNode.Next
	}
	AnotherChangeNode := TDBenchmarkSwapPairs1
	for AnotherChangeNode != nil {
		fmt.Println(AnotherChangeNode)
		AnotherChangeNode = AnotherChangeNode.Next
	}
}

func BenchmarkSwapPairs1(b *testing.B) {
	FTDBenchmarkSwapPairs1()
	for i := 0; i < b.N; i++ {
		fmt.Println(TDBenchmarkSwapPairs1.Next)
		fmt.Println(TDBenchmarkSwapPairs1.Next.Next)
		fmt.Println(TDBenchmarkSwapPairs1.Next.Next.Next)
		fmt.Println(TDBenchmarkSwapPairs1.Next.Next.Next.Next)
		leet.SwapPairs1(TDBenchmarkSwapPairs1)
	}
}


Panic info (important)

more...

&{98 0xc000044ac0}
&{99 <nil>}
&{1 0xc000044270}
&{2 0xc0000444a0}
&{3 0xc0000444c0}
&{4 0xc0000444d0}

Some system info

&{15 0xc000044ae0}
&{2 0xc000044bd0}
&{17 0xc000044b00}
&{4 0xc000044bf0}
&{17 0xc000044ae0}

Unordered messages

<nil>
&{4 0xc000044ae0}
&{2 <nil>}
<nil>
panic: runtime error: invalid memory address or nil pointer dereference // why
[signal 0xc0000005 code=0x0 addr=0x8 pc=0xbefb20]
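
One possible reading of this trace that needs no data race, assuming SwapPairs1 is a conventional in-place pair swap that returns the new head (its source is not shown here): after each call the package-level variable still points at the original first node, which is no longer the head, so the list reachable from it gets one node shorter per call. The final lines of the trace, &{4 ...}, &{2 <nil>} and <nil>, are consistent with a three-node list 0 -> 4 -> 2, at which point the fourth debug print dereferences nil. The sketch below reproduces the shrinking on a six-node list; ListNode, newList, swapPairs, length and staleHeadLengths are all stand-ins invented for illustration.

```go
package main

import "fmt"

// ListNode stands in for leet.ListNode (assumed to be the usual LeetCode shape).
type ListNode struct {
	Val  int
	Next *ListNode
}

// newList (hypothetical helper) builds 0 -> 1 -> ... -> n-1.
func newList(n int) *ListNode {
	head := &ListNode{}
	cur := head
	for i := 1; i < n; i++ {
		cur.Next = &ListNode{Val: i}
		cur = cur.Next
	}
	return head
}

// swapPairs stands in for leet.SwapPairs1: it swaps adjacent nodes in place
// and returns the new head.
func swapPairs(head *ListNode) *ListNode {
	dummy := &ListNode{Next: head}
	prev := dummy
	for prev.Next != nil && prev.Next.Next != nil {
		a, b := prev.Next, prev.Next.Next
		a.Next, b.Next, prev.Next = b.Next, a, b
		prev = a
	}
	return dummy.Next
}

// length counts the nodes reachable from head.
func length(head *ListNode) int {
	n := 0
	for ; head != nil; head = head.Next {
		n++
	}
	return n
}

// staleHeadLengths calls swapPairs repeatedly on the same original node,
// discarding the returned head, and records how many nodes stay reachable.
func staleHeadLengths(n, calls int) []int {
	head := newList(n)
	out := make([]int, 0, calls)
	for i := 0; i < calls; i++ {
		swapPairs(head) // return value ignored, as in the benchmark loop above
		out = append(out, length(head))
	}
	return out
}

func main() {
	fmt.Println(staleHeadLengths(6, 6)) // prints [5 4 3 2 1 1]
}
```

Under that assumption, keeping the returned head, for example root = leet.SwapPairs1(root), keeps the whole list reachable on every iteration.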

Answer 1

Score: 5

If you have multiple benchmark functions you probably don't want them to interfere with each other's data, so use a local variable instead of a (shared) package-level variable.

You can use *B.ResetTimer to remove the setup time from the overall benchmark running time.

func BenchmarkSwapPairs1(b *testing.B) {
	root := &leet.ListNode{
		Val:  0,
		Next: nil,
	}
	changeNode := root
	for i := 1; i < 10000; i++ {
		changeNode.Next = &leet.ListNode{
			Val:  i,
			Next: nil,
		}
		changeNode = changeNode.Next
	}
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		root = leet.SwapPairs1(root)
	}
}
