SoFunction
Updated on 2025-03-04

Performance testing Go code with benchmarks

Using benchmarks

When writing high-performance code or optimizing existing code, you first need to know how the current code performs. In Go, you can measure this with the benchmark support in the testing package. As a starting point, here is a simple function that returns a random string.

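// Requires imports: "math/rand" and "time".
func randomStr(length int) string {
  rand.Seed(time.Now().UnixNano()) // reseed on every call
  letters := "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
  b := make([]byte, length)
  for i := range b {
    b[i] = letters[rand.Intn(len(letters))] // pick a random letter
  }
  return string(b)
}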

To benchmark the code above, first create a test file, for example main_test.go, and add a benchmark function named BenchmarkRandomStr. Just as an ordinary test function starts with Test and takes a t *testing.T parameter, a benchmark function must start with Benchmark and take a b *testing.B parameter. b.N is the number of iterations for the current run. Its value is not fixed: for each benchmark it starts at 1 and is increased until the run lasts long enough. The mechanism is described in the section on the underlying implementation.

func BenchmarkRandomStr(b *testing.B) {
  for i := 0; i < b.N; i++ {
    randomStr(10000)
  }
}
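
To make the changing value of b.N visible, the following minimal sketch runs a benchmark programmatically with testing.Benchmark and records every value b.N takes. The names recordN and sink and the strings.Repeat workload are illustrative assumptions, not part of the article's code.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

var sink string // keeps the result alive so the work is not optimized away

// recordN runs a small benchmark and returns every value b.N took.
func recordN() []int {
	var seen []int
	testing.Benchmark(func(b *testing.B) {
		seen = append(seen, b.N) // the function is invoked once per chosen N
		for i := 0; i < b.N; i++ {
			sink = strings.Repeat("x", 64)
		}
	})
	return seen
}

func main() {
	// The sequence starts at 1 and grows; the exact values vary by machine.
	fmt.Println(recordN())
}
```

The benchmark function is called several times, each time with a larger b.N, which is why the loop body must do the same work on every iteration.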

Run Benchmark

The command go test -bench . runs all benchmarks in the current directory. The value after -bench is a regular expression, so you can also select specific benchmarks by name.

$  go test -bench='Str$'
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-12               6692            181262 ns/op
PASS
ok      learn/learn_test        2.142s

A few parts of this output need explaining. The -12 suffix in BenchmarkRandomStr-12 is the value of GOMAXPROCS, which defaults to the number of logical CPU cores on the machine. The -cpu flag sets this value for the run and accepts a comma-separated list, in which case the benchmark runs once per value:

$  go test -bench='Str$' -cpu=2,4,8 .
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-2        6715            181197 ns/op
BenchmarkRandomStr-4        6471            180249 ns/op
BenchmarkRandomStr-8        6616            179510 ns/op
PASS
ok      learn/learn_test        4.516s

In the first result line, 6715 and 181197 ns/op mean that the benchmark body was executed 6715 times and each execution took about 0.0001812 s, for a total of about 1.2 s (1 s = 1,000,000,000 ns).

Specifying the test duration or iteration count

-benchtime=3s sets the duration of each benchmark

-benchtime=100000x sets the number of iterations

-count=3 sets the number of rounds

$  go test -bench='Str$' -benchtime=3s .
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-12              19988            177572 ns/op
PASS
ok      learn/learn_test        5.384s
​
$ go test -bench='Str$' -benchtime=10000x .
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-12              10000            184832 ns/op
PASS
ok      learn/learn_test        1.870s
​
$ go test -bench='Str$' -count=2 . 
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-12               6702            177048 ns/op
BenchmarkRandomStr-12               6482            177861 ns/op
PASS
ok      learn/learn_test        3.269s

Resetting and pausing the timer

Sometimes a benchmark needs time-consuming preparation before the measured code runs, and that preparation distorts the result. In that case the timer has to be reset after the preparation is done. The following code simulates the situation with a sleep:

func BenchmarkRandomStr(b *testing.B) {
  time.Sleep(time.Second * 2) // simulate a time-consuming setup step
  for i := 0; i < b.N; i++ {
    randomStr(10000)
  }
}

Now run the benchmark:

$ go test -bench='Str$' .
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-12                  1        2001588866 ns/op
PASS
ok      learn/learn_test        2.009s

The benchmark ran only once and took more than 2 s per operation, which is clearly not what we want to measure. Call b.ResetTimer() after the setup to reset the timer:

func BenchmarkRandomStr(b *testing.B) {
  time.Sleep(time.Second * 2) // simulate a time-consuming setup step
  b.ResetTimer()              // discard the time spent on setup
  for i := 0; i < b.N; i++ {
    randomStr(10000)
  }
}

Run the benchmark again:

$ go test -bench='Str$' .
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkRandomStr-12               6506            183098 ns/op
PASS
ok      learn/learn_test        10.030s

The iteration count and per-operation time are back to their earlier values. b.StopTimer() and b.StartTimer() serve the same purpose: stop the timer before an operation that should not be measured, and start it again once that operation is done.
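
As a minimal sketch of the stop/start pattern, the benchmark below pauses the timer while it builds fresh input and resumes it for the measured call. The workload (sorting a random permutation of 10000 ints) and the name benchmarkSort are assumptions made for illustration; testing.Benchmark is used only so the example can run outside go test.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"testing"
)

// benchmarkSort measures sort.Ints only; building the input is excluded.
func benchmarkSort(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b.StopTimer() // pause: per-iteration setup should not be measured
		data := rand.Perm(10000)
		b.StartTimer() // resume: only the sort below is timed
		sort.Ints(data)
	}
}

func main() {
	res := testing.Benchmark(benchmarkSort)
	fmt.Println(res.String())
}
```

Stopping and starting the timer has a cost of its own, so this pattern is best suited to iterations whose measured work is not extremely short.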

Check memory usage

Besides speed, memory usage is another important indicator of code performance. Pass -benchmem to include memory statistics in the benchmark output. To see the difference, compare two functions that return an int slice, one created without a capacity and one created with it:

// The uint64 seed matches rand.Seed in golang.org/x/exp/rand; with math/rand, pass the int64 value directly.
func getIntArr(n int) []int {
  rand.Seed(uint64(time.Now().UnixNano()))
  arr := make([]int, 0) // no capacity: append has to grow the slice repeatedly
  for i := 0; i < n; i++ {
    arr = append(arr, rand.Int())
  }

  return arr
}

func getIntArrWithCap(n int) []int {
  rand.Seed(uint64(time.Now().UnixNano()))
  arr := make([]int, 0, n) // capacity n: one allocation up front
  for i := 0; i < n; i++ {
    arr = append(arr, rand.Int())
  }

  return arr
}

//------------------------------------------
// Benchmark code
//------------------------------------------
func BenchmarkGetIntArr(b *testing.B) {
  for i := 0; i < b.N; i++ {
    getIntArr(100000)
  }
}

func BenchmarkGetIntArrWithCap(b *testing.B) {
  for i := 0; i < b.N; i++ {
    getIntArrWithCap(100000)
  }
}

Run the benchmarks:

$ go test -bench='Arr' -benchmem .
goos: darwin
goarch: amd64
pkg: learn/learn_test
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkGetIntArr-12                        598           1928991 ns/op         4101389 B/op         28 allocs/op
BenchmarkGetIntArrWithCap-12                 742           1556204 ns/op          802817 B/op          1 allocs/op
PASS
ok      learn/learn_test        2.688s

The version with a preset capacity runs about 20% faster and allocates about 80% less memory per operation. 802817 B/op is the number of bytes allocated per operation, and 1 allocs/op is the number of memory allocations per operation.
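
Allocation counts can also be cross-checked without -benchmem by using testing.AllocsPerRun. The sketch below is an assumption-laden stand-in for the article's functions: fill and sink are hypothetical names, and the slice is filled with loop indices instead of random numbers so that only the slice growth allocates.

```go
package main

import (
	"fmt"
	"testing"
)

var sink []int // package-level sink so the slices are really heap-allocated

// fill appends n ints to a slice created with the given capacity.
func fill(n, capacity int) []int {
	arr := make([]int, 0, capacity)
	for i := 0; i < n; i++ {
		arr = append(arr, i)
	}
	return arr
}

func main() {
	const n = 100000
	grow := testing.AllocsPerRun(10, func() { sink = fill(n, 0) })
	prealloc := testing.AllocsPerRun(10, func() { sink = fill(n, n) })
	fmt.Printf("no cap: %.0f allocs/run, with cap: %.0f allocs/run\n", grow, prealloc)
}
```

The exact count for the growing slice depends on the Go version's growth policy, so only the relative difference should be relied on.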

The underlying implementation

What puzzled me most when writing benchmarks was how the number of iterations is adjusted automatically for each benchmark. The source code of the testing package gives the answer. First, look at the underlying data structure of a benchmark:

type B struct {
  common
  importPath       string
  context          *benchContext
  N                int // the field of interest: the number of iterations to run
  previousN        int
  previousDuration time.Duration
  benchFunc        func(b *B)          // the benchmark function
  benchTime        durationOrCountFlag // run time, 1s by default; set with -benchtime
  bytes            int64
  missingBytes     bool
  timerOn          bool
  showAllocResult  bool
  result           BenchmarkResult
  parallelism      int

  startAllocs uint64
  startBytes  uint64

  netAllocs uint64
  netBytes  uint64

  extra map[string]float64
}

Following the N field through the source leads to several key methods. runN() is called for every run and sets the value of N. run1() performs the first iteration and, based on its result, decides whether more runs are needed. run() is called if run1() returns true; it calls doBench(), which in turn calls launch(), the function that determines the number of iterations.

// Run benchmarks f as a subbenchmark with the given name. It reports
// whether there were any failures.
//
// A subbenchmark is like any other benchmark. A benchmark that calls Run at
// least once will not be measured itself and will be called once with N=1.
func (b *B) Run(name string, f func(b *B)) bool {
  // ... some code omitted
  // Run() is the entry point of a benchmark; it creates a new sub-benchmark.
  sub := &B{
    common: common{
      signal:  make(chan bool),
      name:    benchName,
      parent:  &b.common,
      level:   b.level + 1,
      creator: pc[:n],
      w:       b.w,
      chatty:  b.chatty,
      bench:   true,
    },
    importPath: b.importPath,
    benchFunc:  f,
    benchTime:  b.benchTime,
    context:    b.context,
  }
  // ... some code omitted
  if sub.run1() { // run the sub-benchmark once; if it did not fail, call run()
    sub.run() // eventually calls launch(), which decides how many times runN() is executed
  }
  b.add(sub.result)
  return !sub.failed
}

// runN runs a single benchmark for the specified number of iterations.
func (b *B) runN(n int) {
  // ... some code omitted
  b.N = n // set N
  // ...
}

// launch launches the benchmark function. It gradually increases the number
// of benchmark iterations until the benchmark runs for the requested benchtime.
// launch is run by the doBench function as a separate goroutine.
// run1 must have been called on b.
func (b *B) launch() {
  // ... some code omitted
  d := b.benchTime.d
  // Run until the requested benchtime (1s by default) has elapsed or n reaches 1e9.
  for n := int64(1); !b.failed && b.duration < d && n < 1e9; {
    last := n
    // Predict the number of iterations required.
    goalns := d.Nanoseconds()
    prevIters := int64(b.N)
    prevns := b.duration.Nanoseconds()
    if prevns <= 0 {
      // Round up to avoid division by zero.
      prevns = 1
    }
    n = goalns * prevIters / prevns
    // Run 1.2x the prediction, grow by at most 100x per step,
    // and always run at least one more iteration than last time.
    n += n / 5
    n = min(n, 100*last)
    n = max(n, last+1)
    // Never run more than 1e9 iterations.
    n = min(n, 1e9)
    b.runN(int(n))
  }
  // ... some code omitted
}
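
The growth rule in launch() can be traced by hand. The sketch below replays it for a hypothetical, noise-free benchmark whose body always costs 180000 ns, close to the randomStr results shown earlier. The function names nextN and simulate are illustrative and do not exist in the testing package.

```go
package main

import "fmt"

// nextN applies one step of the growth rule from the launch() excerpt.
// goalns is the target duration, prevIters and prevns describe the previous
// run, and last is the previous iteration count.
func nextN(goalns, prevIters, prevns, last int64) int64 {
	if prevns <= 0 {
		prevns = 1 // avoid division by zero
	}
	n := goalns * prevIters / prevns // predicted iterations to fill goalns
	n += n / 5                       // run 1.2x the prediction
	if n > 100*last {                // grow by at most 100x per step
		n = 100 * last
	}
	if n < last+1 { // always run at least one more iteration
		n = last + 1
	}
	if n > 1e9 { // hard cap on the iteration count
		n = 1e9
	}
	return n
}

// simulate returns the successive values of N for a benchmark whose body
// always costs nsPerOp nanoseconds.
func simulate(goalns, nsPerOp int64) []int64 {
	n := int64(1)
	duration := n * nsPerOp // run1() has already executed one iteration
	seq := []int64{n}
	for duration < goalns && n < 1e9 {
		n = nextN(goalns, n, duration, n)
		duration = n * nsPerOp
		seq = append(seq, n)
	}
	return seq
}

func main() {
	fmt.Println(simulate(1e9, 180000)) // prints [1 100 6666]
}
```

Under these assumptions the final N of 6666 is close to the roughly 6700 iterations reported in the earlier runs, which is what the 1.2x over-provisioning of a 1 s target would suggest.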

Summary

1. The name of a benchmark function must start with Benchmark.

2. Run go test -bench . to execute all benchmarks in the current directory. -bench accepts a regular expression that selects the benchmarks to run.

3. The -cpu parameter sets the number of CPU cores (GOMAXPROCS) used to run the benchmarks.

4. The -benchtime parameter sets either the duration or the number of iterations.

5. The -count parameter sets the number of rounds.

6. b.ResetTimer(), b.StopTimer(), and b.StartTimer() reset or pause the timer to exclude time-consuming operations that should not be measured.
