I put both versions into a benchmark. Their performance is almost equal: append is slower, but the difference is nearly negligible.
package main_test

import "testing"

func BenchmarkMerge1(b *testing.B) {
	for i := 0; i < b.N; i++ {
		num1 := []int{1, 2, 3}
		num2 := []int{4, 5, 6}
		merge1(num1, len(num1), num2, len(num2))
	}
}

// merge1 merges the sorted values nums1[:m] and nums2[:n] through a
// temporary slice of length m+n that is filled by index assignment.
func merge1(nums1 []int, m int, nums2 []int, n int) {
	tmpSlice := make([]int, m+n)
	tmpIndex := 0
	index1 := 0
	index2 := 0
	for index1 < m {
		value1 := nums1[index1]
		// Emit every remaining nums2 value that is smaller than value1.
		for index2 < n {
			value2 := nums2[index2]
			if value1 <= value2 {
				break
			} else {
				tmpSlice[tmpIndex] = value2 // <-- Assign
				index2++
				tmpIndex++
			}
		}
		tmpSlice[tmpIndex] = value1 // <-- Assign
		index1++
		tmpIndex++
	}
	copy(nums1, tmpSlice[:tmpIndex])
	copy(nums1[tmpIndex:], nums2[index2:])
}

func BenchmarkMerge2(b *testing.B) {
	for i := 0; i < b.N; i++ {
		num1 := []int{1, 2, 3}
		num2 := []int{4, 5, 6}
		merge2(num1, len(num1), num2, len(num2))
	}
}

// merge2 is the same merge as merge1, but the temporary slice starts with
// length 0 and capacity m+n and is filled with append.
func merge2(nums1 []int, m int, nums2 []int, n int) {
	tmpSlice := make([]int, 0, m+n)
	tmpIndex := 0
	index1 := 0
	index2 := 0
	for index1 < m {
		value1 := nums1[index1]
		// Emit every remaining nums2 value that is smaller than value1.
		for index2 < n {
			value2 := nums2[index2]
			if value1 <= value2 {
				break
			} else {
				tmpSlice = append(tmpSlice, value2) // <-- Append
				index2++
				tmpIndex++
			}
		}
		tmpSlice = append(tmpSlice, value1) // <-- Append
		index1++
		tmpIndex++
	}
	copy(nums1, tmpSlice[:tmpIndex])
	copy(nums1[tmpIndex:], nums2[index2:])
}

Running tool: /usr/local/go/bin/go test -benchmem -run=^$ -bench ^(BenchmarkMerge1|BenchmarkMerge2)$ example.com/m
goos: linux
goarch: amd64
pkg: example.com/m
cpu: Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz
BenchmarkMerge1-16    34586568    36.40 ns/op    48 B/op    1 allocs/op
BenchmarkMerge2-16    32561293    36.77 ns/op    48 B/op    1 allocs/op
PASS
ok example.com/m 2.533s
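This is expected, because append essentially performs an assignment as long as the slice has spare capacity. append also increments the len field in the slice header (thanks to @rustyx for that hint), which explains the small difference.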
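A minimal sketch of this behaviour, separate from the benchmark: appendWithinCapacity is a hypothetical helper introduced only for illustration. It appends into a slice that was preallocated with enough capacity and checks whether the backing array ever moves. Under that assumption append should only write the element and bump len.

```go
package main

import "fmt"

// appendWithinCapacity (hypothetical helper, not part of the benchmark)
// appends n values to a slice preallocated with capacity n and reports
// the final length and whether the backing array was ever replaced.
func appendWithinCapacity(n int) (length int, moved bool) {
	s := make([]int, 0, n)
	if n == 0 {
		return 0, false
	}
	// Reslice within capacity to take the address of the first slot.
	first := &s[:1][0]
	for i := 0; i < n; i++ {
		s = append(s, i)
		if &s[0] != first {
			moved = true
		}
	}
	return len(s), moved
}

func main() {
	length, moved := appendWithinCapacity(6)
	fmt.Println(length, moved) // prints "6 false"
}
```

Because the capacity is never exceeded, every append reuses the same backing array, which is why merge2 reports the same 48 B/op and 1 allocs/op as merge1.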
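You will see a larger difference when you use append without setting an initial capacity on the slice, because append then has to grow the underlying array, which takes time.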
If we change tmpSlice := make([]int, 0, m+n) to tmpSlice := make([]int, 0) in merge2, we get the following results:
Running tool: /usr/local/go/bin/go test -benchmem -run=^$ -bench ^(BenchmarkMerge1|BenchmarkMerge2)$ example.com/m
goos: linux
goarch: amd64
pkg: example.com/m
cpu: Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz
BenchmarkMerge1-16    37319397    32.34 ns/op    48 B/op    1 allocs/op
BenchmarkMerge2-16    14543529    87.75 ns/op    56 B/op    3 allocs/op
PASS
ok example.com/m 2.604s
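The extra allocations in the second run can be made visible with a small sketch. countGrowths is a hypothetical helper, not part of the benchmark: it appends to a slice created with make([]int, 0) and records every capacity change. The exact growth sequence is an implementation detail of the Go runtime, so the code reports it rather than assuming it.

```go
package main

import "fmt"

// countGrowths (hypothetical helper) appends n values to a slice with no
// initial capacity. It returns how many times the capacity changed and
// the capacity observed after each change.
func countGrowths(n int) (growths int, caps []int) {
	s := make([]int, 0)
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			// append had to allocate a larger backing array and copy.
			growths++
			caps = append(caps, cap(s))
		}
	}
	return growths, caps
}

func main() {
	growths, caps := countGrowths(3)
	fmt.Println(growths, caps)
}
```

With the standard gc toolchain on a 64-bit platform, three appends typically grow the capacity to 1, 2, and then 4 elements. That would be 8 + 16 + 32 = 56 bytes in 3 allocations, which is consistent with the 56 B/op and 3 allocs/op reported for BenchmarkMerge2 above.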
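TL;DR: as long as the slice has enough capacity, append is slower than index assignment only by a nearly negligible amount (because it also increments the len field in the slice header).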