Conversation
|
Local benchmark run on this branch (M4 Max) |
|
Local benchmark run on develop (M4 Max) |
Merging this PR will degrade performance by 16.92%
Performance Changes
Comparing Footnotes
|
There was a problem hiding this comment.
Pull Request Overview
This PR adds a new safe compression pathway that avoids intermediate allocations by working directly with uninitialized memory. The change allows users to pre-allocate a buffer and compress data directly into it without requiring two separate allocations and a copy operation.
- Adds
compress_into_uninitmethod that takes&mut [MaybeUninit<u8>]instead of&mut Vec<u8> - Implements a safe version of the compression hot loop using
compress_word_safe - Updates benchmarks to use the new safe API while maintaining performance
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/lib.rs | Adds new safe compression methods and updates existing compress method to use the new pathway |
| benches/micro.rs | Updates micro benchmarks to use the new safe compression API |
| benches/compress.rs | Updates compression benchmarks to use the new safe API |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
Wow codspeed did not like that |
Using the FSST library currently requires you to make two allocations 1. Allocate a buffer to compress_into 2. Allocate a larger packed buffer We shouldn't take a &mut Vec<u8> directly, since Vec isn't splittable. We should instead be taking &mut [MaybeUninit<u8>], which can be backed by Vec, Bytes, Buffer<u8> or whatever other memory allocation we happen to get our hands on. This PR adds a new safe compression pathway that exposes `compress_into_uninit` and implements the hot loop using only safe code to boot. Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
This reverts commit 01ddd72. Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
5f10dd4 to
12c05ff
Compare
12c05ff to
6d19571
Compare
Using the FSST library currently requires you to make two allocations:
compress_intoThis is kind of annoying. What I really want to do is pre-allocate a buffer large enough to hold all of the compressed data, then do something like
This lets me compress a bunch of values directly into a single packed byte buffer, without an intermediate copy.
The Fix
We shouldn't take a
&mut Vec<u8>directly, instead we should take&mut [MaybeUninit<u8>], which can be backed by Vec, Bytes, Buffer or whatever other memory allocation we happen to get our hands on.This PR adds a new safe compression pathway that exposes
compress_into_uninitand implements the hot loop using only safe code to boot.I'm leaving the old
unsafe compress_intohere to allow existing projects to keep using that interface, but have updated the docs to indicate that the compress_into_uninit is the new preferred pathway.Performance
Performance measured on M4 Max on the micro and compress benches seems to be more or less identical.
I have a long term goal of eliminating most of the unsafe code in this repo, see #87. This brings us one step closer to this.