-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shorter blocks and consistent file #112
Conversation
Makes Lepton file consistent between runs and helps for decoding of files transferred over slow channels
Carouseling effectively mixes blocks, so we can exclude old mixing mechanism that checks length after each stream bit pushing - ~3 % faster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to separate out the multiplexer changes and work on the bitwriter changes separately.
src/structs/multiplexer.rs
Outdated
} | ||
|
||
let mut curr_write_thread: usize = 0; | ||
while threads_left > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if let Some(a) = packets[curr_write_thread].pop_front()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied.
src/structs/vpx_bool_writer.rs
Outdated
} else { | ||
*tmp_range = split; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a cleaner version of this in the cabac project
https://github.com/mcroomp/cabac/blob/77be08aadf92b8fc6d9cd37d17cec0fcef55605b/src/vp8.rs#L366
completely eliminates adjusting the carry in the buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I'll apply this thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be applied in the next PR on bit writer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok if you could remove the bool writer changes here and I can approve. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed - only changes about buffer flashing remain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buffer flushing is unnecessary with the new bool_writer changes since it doesn't require a separate buffer anymore at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this will be in the next PR :)
Separate into 2 PRs? Or leave like this? |
I think in 2 PRs would be better since they aren't really related. Trying to be extra careful with the write codepath since corruptions on the write side are much more dangerous. Thanks! |
This version f655f27 is slower but still faster than main branch:
I'll prepare a separate bit writer PR after merge of this one. |
Lepton format allows to write blocks of standard length (65536 bytes) shorter, with 1-byte header instead of 3-bytes. Additionally, current implementation writes down encoded blocks into the file in the order of their receiving from different threads - that leads to different output files across runs.
This PR makes the output file shorter and the order of blocks in it predefined by carouseling. Performance is increased by ~3 % due to omission of buffer size check at each stream bit pushing.
Current best of 3 runs:
This PR c7e0f7c best of 3 runs: