Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit tests for huffman::build_tables. #44

Merged
merged 1 commit into from
Dec 29, 2024

Conversation

anforowicz
Copy link
Contributor

@anforowicz anforowicz commented Dec 27, 2024

@fintelia, can you PTAL?

I lack a good understanding of how the fdeflate crate works, so I thought that I can improve this by trying to add some unit tests. I would need your help with this, because there are a few things that I don't quite understand (some of which manifest in test failures in the new tests added by this PR - i.e. this PR is not ready to be merged as-is and so I marked it as a draft PR).


Edit on 2024-12-27 @ 16:43PST - I've actually figured out why single-vs-double entries were getting created. So let me hide the no-longer-relevant question below.

1 stale snippet below

Unexpected single (not double) literal entry

Unexpectedly to me, the code/tables doesn't decode/map the input below as two "B" symbols (instead mapping it somewhat inefficiently to just a single "B" symbol).

QUESTION: Is this also unexpected to you? If this is expected, then can you please help me understand why?

$ cat src/huffman.rs
...
#[test]
fn test_rfc1951_example1() {
    // https://datatracker.ietf.org/doc/html/rfc1951 gives the following example
    // on page 8:
    //
    //    Symbol  Code                                                              
    //    ------  ----
    //    A       10
    //    B       0
    //    C       110                                                               
    //    D       111
    //
    // The code is completely defined by the sequence of bit lengths (2, 1, 3, 3).
    let t = LitlenTables::new(12, &[2, 1, 3, 3]).unwrap();
    assert_eq!(
        t.decode(0b_0_0_0000000_u8.reverse_bits() as u64),
        LitlenResult::DoubleLiteral {
            s1: 1,
            s2: 1,                                                                  
            input_bits: 2
        },
    );
...
$ cargo test test_rfc1951_example1
...
---- huffman::test::test_rfc1951_example1 stdout ----
thread 'huffman::test::test_rfc1951_example1' panicked at src/huffman.rs:342:9:
assertion `left == right` failed
  left: SingleLiteral { symbol: 1, input_bits: 1 }
 right: DoubleLiteral { s1: 1, s2: 1, input_bits: 2 }
...

Edit on 2024-12-27 @ 16:22PST - I've actually figured out my last 2 questions on my own - let me hide them in the collapsible section below. Sorry for the noise...

2 more (stale) questions

Format of SECONDARY_TABLE_ENTRY

https://github.com/image-rs/fdeflate/blob/07d3e744b7a9f65ea822a33c8eec6958694ea4c4/src/tables.rs#L90C44-L97C33 says in a comment that long (> 12 bits) codewords will have the following primary table entry:

0000xxxx_xxxxxxxx_01100000_00000000  x = secondary_table_index

I note that the least significant byte above is filled with zeros.

But... it seems that in practice some entries do not follow the format above:

$ cat src/huffman.rs
...
} else if 0 != entry & SECONDARY_TABLE_ENTRY {
    // Expected format: 0000xxxx_xxxxxxxx_01100000_00000000
    // TODO: In some tests we get: ......_01100000_00000111                          
    if 0 != entry & 0xf0009fff {
        panic!("Unexpected format of entry: index={i} ({i:b}), entry={entry:b}");
    }
    let index2_base = (entry >> 16) as usize;                                        
    assert!(index2_base < secondary_table.len());
} else {
...
$ cargo test test_secondary_table
...
---- huffman::test::test_secondary_table stdout ----
thread 'huffman::test::test_secondary_table' panicked at src/huffman.rs:233:21:
Unexpected format of entry: index=4095 (111111111111), entry=110000000000111
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This is probably related to how the secondary_table_index is computed here:

fdeflate/src/decompress.rs

Lines 539 to 540 in 07d3e74

let secondary_table_index =
(litlen_entry >> 16) + ((bits >> 12) as u32 & (litlen_entry & 0xff));

I think I understand that

  • litlen_entry >> 16 is the base index (i.e. the 0000xxxx_xxxxxxxx from the doc comment).
  • bits >> 12 discards bits that were covered by the primary table (let's call this shifted_bits)

But I don't understand why there is shifted_bits & (litlen_entry 0xff). I am guessing that the format really is something like:

0000xxxx_xxxxxxxx_01100000_mmmmmmmm  x = secondary_table_index   m = mask of overflow inputs bits

(not sure what to call this).

Does that make sense? If so, then I can tweak the comment in this PR.

advance_input_bits greater than 15

This is probably a P3 (I feel that I haven't done enough homework to ask for help here).

This is a bit of a heads-up that I have a a WIP branch (currently with a single commit) to add support for different table sizes via const generics (i.e. pub type Decompressor = GenericDecompressor<2048, 512>). I hope this will enable further experimentation with improving performance of png for small images.

At any rate, this WIP branch is what motivated addition of huffman.rs unit tests (i.e. this PR). The WIP code passes some tests, but the new fuzzer found an input where a literal entry (in a 512-bits primary litlen table) says to advance 60+ input bits - the unexpected entry is created here and I feel that I need to build a better understanding of how build_tables works to figure out what goes wrong. I hope that I can do this by working on the unit tests :-).

@anforowicz anforowicz force-pushed the more-unittests branch 2 times, most recently from d88f599 to 8b3ae5c Compare December 28, 2024 00:37
This commit adds a few unit tests for `huffman::build_table`.  While
adding the tests, the following issues were discovered and fixed:

* In `double_literal` case, `len1` didn't go as far as possible.
  This meant that some table entries covering 2 or more codewords
  unnecessarily set `output_advanced_bytes` to 1 (instead of 2).
* Doc comments describing the format of table entries were tweaked
  to reflect how they are used in practice.
@anforowicz anforowicz marked this pull request as ready for review December 28, 2024 00:39
@fintelia
Copy link
Contributor

Thanks for working on this!

I took a look at the advance bits issue, and it is actually not a problem with build_tables. Rather, it is a horrible hack we're using in the literal decoding fast path: If the first table lookup indicates a literal, we speculatively do 3 more table lookups assuming that more literals will follow. Only once all the lookups have completed do we check whether subsequent symbols actually are literals. Since we mask the table index by 0xfff (or LITLEN_TABLE_MASK in your branch) the reads might be garbage, but they can never be out-of-bounds.

If the symbols aren't literals then we mistakenly use overflow_bits_mask as input_advance_bits. This will be harmless as long as the arithmetic doesn't panic, because we'll discard the subsequent table reads that relied on that assumption. However, the maximum value for overflow_bits_mask depends on the primary table size (it is 2^(15 - primary_table_bits) - 1) so a small primary table can cause the shift amount to exceed 63 and trigger a panic. The easiest fix is probably just switching to wrapping_shr.

@anforowicz
Copy link
Contributor Author

[...] The easiest fix is probably just switching to wrapping_shr.

Thanks for the suggestion - I haven't thought about this approach, but now I agree that wrapping_shr indeed does look simplest/best (for preserving existing behavior).

  • I've also considered bitwise-and with 0xf, but it would add 3 instructions on the hot path, which seems undesirable: https://godbolt.org/z/eYfM6E5rf. (This godbolt also shows that wrapping_shr generates the same assembly the regular >>... at least on x86... not sure about ARM which AFAIR is the one with the funky behavior for rhs>64)
  • And I also considered replacing bits >> x, bits >> (x + y), bits >> (x + y + z) with let bits2 = bits >> x, let bits3 = bits2 >> y, let bits4 = bits3 >> z. This seems like it should be performance positive or neutral, but wrapping_shr still seems preferable for simplicity.
  • And I also considered changing the primary table format of secondary entries so that the lowest byte doesn't store the overflow_bits_mask, but overflow_bits_count (costing more to process secondary entry, but having zero impact on hot path)

@fintelia fintelia merged commit 6e5884b into image-rs:main Dec 29, 2024
10 checks passed
@anforowicz anforowicz deleted the more-unittests branch January 2, 2025 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants