Describe the bug
I finally tracked down the cause of a fairly severe performance penalty when reading buffers back from the GPU on macOS:
gfx-rs/wgpu#8119
How to reproduce
Call gpu.Device.Poll or WaitDone(), e.g., after reading a buffer. On macOS each call takes at least 1 ms, which hurts high-bandwidth compute jobs.
Example code
Relevant output
Platform
macOS