Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark for the queue #5765

Merged
merged 2 commits into from
Nov 14, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion pkg/flow/internal/worker/worker_pool.go
Original file line number Diff line number Diff line change
Expand Up @@ -166,7 +166,10 @@ func (w *workQueue) emitNextTask() {
return
}

// Remove the task from waiting and add it to running set
// Remove the task from waiting and add it to running set.
// NOTE: Even though we remove an element from the middle of a collection, we use a slice instead of a linked list.
// This code is NOT identified as a performance hot spot and given that in large agents we observe max number of
// tasks queued to be ~10, the slice is actually faster because it does not allocate memory. See BenchmarkQueue.
w.waitingOrder = append(w.waitingOrder[:index], w.waitingOrder[index+1:]...)
task = w.waiting[key]
delete(w.waiting, key)
Expand Down
68 changes: 68 additions & 0 deletions pkg/flow/internal/worker/worker_pool_test.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package worker

import (
"container/list"
"fmt"
"testing"
"time"
Expand Down Expand Up @@ -279,3 +280,70 @@ func TestWorkerPool(t *testing.T) {
})
})
}

func BenchmarkQueue(b *testing.B) {
/* The slice-based implementation is faster when queue size is less than 100 elements, as it doesn't allocate:

BenchmarkQueue/slice_10_elements-8 179793877 7.119 ns/op 0 B/op 0 allocs/op
BenchmarkQueue/list_10_elements-8 48129548 24.75 ns/op 48 B/op 1 allocs/op
BenchmarkQueue/slice_50_elements-8 100000000 11.52 ns/op 0 B/op 0 allocs/op
BenchmarkQueue/list_50_elements-8 52150766 22.81 ns/op 48 B/op 1 allocs/op
BenchmarkQueue/slice_100_elements-8 57941505 20.71 ns/op 0 B/op 0 allocs/op
BenchmarkQueue/list_100_elements-8 51232423 22.95 ns/op 48 B/op 1 allocs/op
BenchmarkQueue/slice_500_elements-8 12610141 95.32 ns/op 0 B/op 0 allocs/op
BenchmarkQueue/list_500_elements-8 51342938 22.84 ns/op 48 B/op 1 allocs/op
BenchmarkQueue/slice_1000_elements-8 6416760 187.6 ns/op 0 B/op 0 allocs/op
BenchmarkQueue/list_1000_elements-8 51942148 22.92 ns/op 48 B/op 1 allocs/op
*/

queueSizes := []int{10, 50, 100, 500, 1000}
for _, queueSize := range queueSizes {
elementsStr := fmt.Sprintf("%d elements", queueSize)
b.Run("slice "+elementsStr, func(b *testing.B) {
var slice []string
for i := 0; i < queueSize; i++ {
slice = append(slice, fmt.Sprintf("some.component.name.%d", i))
}

b.ResetTimer()
for i := 0; i < b.N; i++ {
// simulate what the `workQueue` does with its `waitingOrder` field.

// add to queue
slice = append(slice, "some.component.name")

// iterate to an arbitrary element
ind := 0
for ; ind < 5; ind++ {
_ = slice[ind]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious; wondering if storing this in a package-level variable (without having to use it) has any effect. I'd guess not, but again I'm not putting my hand in the fire.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're referring to the whole slice here? not sure I follow. Note that before the benchmark loop I call b.ResetTimer().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, literally just doing somePackageLvlVariable = slice[ind] instead.

I'm guessing the compiler can completely eliminate the statement here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha! Yeah, it's possible that it removes this. However, I wouldn't expect this to change the results of benchmark in the slightest, as reaching into an indexed array is cheap. I tried this and getting results as following:

BenchmarkQueue/slice_10_elements-8         	261364618	         4.596 ns/op	       0 B/op	       0 allocs/op
BenchmarkQueue/list_10_elements-8          	38173401	        32.00 ns/op	      56 B/op	       1 allocs/op
BenchmarkQueue/slice_100_elements-8        	81230875	        14.80 ns/op	       0 B/op	       0 allocs/op
BenchmarkQueue/list_100_elements-8         	37829430	        31.11 ns/op	      56 B/op	       1 allocs/op
BenchmarkQueue/slice_300_elements-8        	39452000	        30.43 ns/op	       0 B/op	       0 allocs/op
BenchmarkQueue/list_300_elements-8         	38020203	        31.12 ns/op	      56 B/op	       1 allocs/op
BenchmarkQueue/slice_1000_elements-8       	12476460	        96.10 ns/op	       0 B/op	       0 allocs/op
BenchmarkQueue/list_1000_elements-8        	38070512	        31.05 ns/op	      56 B/op	       1 allocs/op
BenchmarkQueue/slice_10000_elements-8      	 1000000	      1026 ns/op	       0 B/op	       0 allocs/op
BenchmarkQueue/list_10000_elements-8       	34904437	        34.29 ns/op	      56 B/op	       1 allocs/op

Which don't change the conclusion.

I noticed that I'm using int instead of string though, which may change the conclusion. I'll quickly change this so the benchmark matches closer the implementation.

I don't think it's worth spending more time here though, as it is definitely not a bottleneck.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't expect the benchmark to change, was just curious to see if there was any impact between runs of the same slice_* case.

In any case, let's go ahead and merge this.

}

// remove it from the queue
slice = append(slice[:ind], slice[ind+1:]...)
}
})

b.Run("list "+elementsStr, func(b *testing.B) {
l := list.New()
for i := 0; i < queueSize; i++ {
l.PushBack(fmt.Sprintf("some.component.name.%d", i))
}

b.ResetTimer()
for i := 0; i < b.N; i++ {
// simulate what the `workQueue` does with its `waitingOrder` field.

// add to queue
l.PushBack("some.component.name")

// iterate to an arbitrary element
toRemove := l.Front()
for j := 0; j < 5; j++ {
toRemove = toRemove.Next()
}
// remove it from the queue using copy
l.Remove(toRemove)
}
})
}
}