Description
So, while I was working on the hotkey stuff, Key and KeyCode became very relevant.
And simplified logic that follows the Unicode rules resulted in a small number of test failures, yet the binary math said the tests were wrong - not the code.
So I took a look at the values.
KeyCode.CharMask is defined as 0x000F_FFFF, which is 20 bits.
KeyCode.MaxCodePoint is 0x0010_FFFF, which is correct per the Unicode spec, but that's 21 bits.
KeyCode.CharMask, however, is not the correct mask for that. It drops the highest bit, which makes a 16-bit range of code points (0x10_0000 through 0x10_FFFF - 65,536 values) unreachable.
Here's the binary explanation:

```
// Bit positions are labeled in columns below each binary value, with the first
// row being the decimal 10s place of the position and the second row the 1s place.
// For example:
// 0
// 1
// means bit 1,
// and
// 1
// 5
// means bit 15.
//
// Now for the breakdown...
//
// KeyCode.MaxCodePoint:
// 0x0010_FFFF = 0b_0000_0000_0001_0000_1111_1111_1111_1111
//          10s                   2 2111 1111 1110 0000 0000
//           1s                   1 0987 6543 2109 8765 4321
// To mask that, you must have a 1 in every column. Columns with 0 are dropped.
// This is a 21-bit value.
//
// But our value for KeyCode.CharMask is:
// 0x000F_FFFF = 0b_0000_0000_0000_1111_1111_1111_1111_1111
//          10s                     2111 1111 1110 0000 0000
//           1s                     0987 6543 2109 8765 4321
// This is a 20-bit value, which means a max value of 1_048_575.
//
// But the highest bit is dropped.
//
// In mask terms, that drops half the possible 21-bit values.
// Values above 0x10_FFFF (up to 0x1F_FFFF) aren't defined, so bits 17-20 still
// need to be masked, even though we just won't care about them later and they
// CANNOT be used for any other purpose.
//
// 0x10_FFFF = 1_114_111 in decimal. Thus, it can represent 1_114_112 code points (0 counts).
// 0x0F_FFFF = 1_048_575 in decimal. Thus, it can represent 1_048_576 code points.
// Thus, 0x1_0000 or 65_536 values are missing, which CharMask should be able to cover.
//
// You can also simply look at it from the left (most significant bits), which is how we do it in
// networking (subnet masks are 32-bit binary masks against the address).
//
// The missing bit is the 12th from the left of the 32-bit value, i.e. bit 21.
// With bit 21 set, the valid code points span only the low 16 bits (0x0000-0xFFFF),
// so 2^16 = 65_536 values are missing.
//
// The correct mask for 21 bits is 0x1F_FFFF, because bits 17-20 still matter when bit 21 is 0.
//
// While the MaxCodePoint value is correct, because the standard defines it to be so,
// the values from 0x10_0000 to 0x10_FFFF (inclusive) are missing.
// Those are the 65_536 missing values.
//
// Then, we have KeyCode.SpecialMask, which is 0xFFF0_0000:
// 0xFFF0_0000 = 0b_1111_1111_1111_0000_0000_0000_0000_0000
//          10s     3332 2222 2222 2111 1111 1110 0000 0000
//           1s     2109 8765 4321 0987 6543 2109 8765 4321
//
// That's the correct complement of 0x0F_FFFF.
// But 0x0F_FFFF isn't correct, so this is also not correct.
//
// The correct value is:
// 0xFFE0_0000 = 0b_1111_1111_1110_0000_0000_0000_0000_0000
//          10s     3332 2222 2222 2111 1111 1110 0000 0000
//           1s     2109 8765 4321 0987 6543 2109 8765 4321
//
// In network parlance, that'd be a /11 (an IPv4 subnet mask of 255.224.0.0).
```

So it's an off-by-one error, but it's off by one bit - in the 21st position.
As a result, anything that uses CharMask or SpecialMask will lose (or steal, for SpecialMask) the highest-order bit of the character value, if that bit is set.
That's a data corruption bug. Treating the binary values correctly causes 2 KeyTests test cases to fail, along with almost 200 other tests in other fixtures, because either the test case values or the code under test depends on the broken values. I have not yet begun that investigation, but I imagine it's a little of both, and a fairly small number of changes in a small number of places will probably fix them.
Fixing only the values for CharMask and SpecialMask to be CharMask = 0x1F_FFFF and SpecialMask = 0xFFE0_0000 (so, moving the 21st bit to CharMask, where it belongs, and dropping it from SpecialMask) resolves the KeyTests failures and quite a few of the other failures, leaving a few dozen more to track down.
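A minimal standalone repro of both the corruption and the fix (the constants here mirror the enum members, but the program itself is just a sketch):

```csharp
using System;

class CharMaskRepro
{
    const uint BrokenCharMask = 0x000F_FFFF; // 20 bits - the current value
    const uint FixedCharMask  = 0x001F_FFFF; // 21 bits - the proposed value

    static void Main ()
    {
        uint cp = 0x0010_FFFF; // U+10FFFF, the highest valid code point

        // The broken mask silently corrupts U+10FFFF into U+FFFF:
        Console.WriteLine ($"{cp & BrokenCharMask:X}"); // FFFF
        // The fixed mask round-trips it intact:
        Console.WriteLine ($"{cp & FixedCharMask:X}");  // 10FFFF
    }
}
```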
Now I just need to diagnose the root cause of the other test failures. Most seem to involve the F keys, which account for a fair amount of the non-printable keyboard input test cases.
The fixes for this are important to the work I'm already doing in TextFormatter, so they're just going in that branch for now.
dodexahedron commented on Mar 3, 2024
I have other problems with that enum that probably won't matter until a future version of Unicode, but I'm ignoring those for now because this is fine for v2. So long as the masks are defined correctly, shifting things left by 1 shouldn't be terribly difficult as a break-fix, if that happens before some other solution is implemented or Microsoft switches to UTF-8 and back-ports it to all Windows versions (hey - we all have dreams, right?).
Although......
That being said, .NET 8 is UTF-8 by default for console encoding. So... has anyone tried treating stuff as UTF-8 natively yet?
You have to work with `Span<byte>` more than `string`, but not having to do all that damn conversion all over the place would kill like half the work the library has to do.
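For what it's worth, a minimal sketch of what treating input as UTF-8 natively could look like - raw bytes from stdin into stack memory, decoding only at an API boundary. The buffer size and flow are illustrative, not a proposal for the driver code:

```csharp
using System;
using System.Text;

class Utf8InputSketch
{
    static void Main ()
    {
        Console.InputEncoding = Encoding.UTF8;
        using var stdin = Console.OpenStandardInput ();

        // Read raw UTF-8 bytes straight into stack memory - no string yet.
        Span<byte> buffer = stackalloc byte[64];
        int read = stdin.Read (buffer);

        // Decode only at the boundary where a string is actually required.
        string text = Encoding.UTF8.GetString (buffer[..read]);
        Console.WriteLine (text);
    }
}
```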
dodexahedron commented on Mar 3, 2024
I think this belongs in this issue more than #3275, but it's still related to both.
So, here's what I did on this end of things and what the result is:
As with any time I've touched multi-type files, types in CursesDriver and ConsoleDriver got split out to their own files.
That's of course a non-change to the program.
But I ended up looking at CursesDriver.ProcessInput (I don't remember the original reason, but it doesn't matter because the problem is fixed).
There were lots of points where cleanup was possible, but there was also a ton of copying going on.
It looks like the HandleEscSeqResponse method had well-intentioned performance thinking applied, given that everything was passed by ref for it (kudos to whoever did that in the first place).
Unfortunately, that's not how the compiler actually compiled that code, and there were still plenty of places where value types were being copied just for temporary use and then discarded.
So, I created a `private ref struct` to encapsulate the data relevant to those code paths.
A ref struct is a stack-only type. Neither an instance of a ref struct nor any of its properties or fields can ever escape its ref-safe context, which basically means it can never outlive the original referent.
This is how `Span<T>` is defined, so if you are familiar with that at all, the restrictions are the same. Here's a doc for some details: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/ref-struct
Anyway...
This allows the method and the methods it calls to operate on the same references to stack memory, resulting in a serious speedup and reduction in allocations.
A ref struct can never exist on the heap, so it's a guarantee that there are no heap allocations if you're working with a ref struct.
They can't even be referenced BY things on the heap (which is why they cannot be elements of an array, for example).
So...
I turned the 5 values that ProcessInput handles and passes to that HandleEscSeqResponse method into this ref struct:
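The actual struct isn't reproduced here, so this is a hypothetical sketch of its shape - every field name below is invented - along with the handler signature described next:

```csharp
using System;
using System.Text;

internal class CursesDriverSketch
{
    // Hypothetical reconstruction: stand-ins for the five locals that
    // ProcessInput tracked, now living in one stack-only struct.
    private ref struct EscSeqState
    {
        public int Code;                     // raw input code
        public int WideChar;                 // wch
        public ConsoleModifiers Modifiers;
        public ConsoleKeyInfo[] CkiBuffer;   // reference type - no extra ref layer needed
        public StringBuilder Sequence;       // reference type - same
    }

    // Takes one ref to the struct, works in place on the caller's stack
    // memory, and hands the same ref back - nothing is ever copied.
    private static ref EscSeqState HandleEscSeqResponse (ref EscSeqState state)
    {
        // ... decode the escape sequence, writing results into state ...
        return ref state;
    }
}
```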
It has two reference types in it, which is why those two aren't also `ref`. That would be an unnecessary extra layer of indirection.
Then I refactored HandleEscSeqResponse into a method that takes one ref to that type and returns the same ref at the end, doing all its work in-place on that struct.
Just to show progression, I also kept the same variable names in the ProcessInput method initially, but made them `ref`s to their corresponding fields in the struct, meaning the old code was already using the new struct.
Once that was done and verified passing, I removed those now-redundant references, one by one, until all that was left was the struct and one strange place where a separate int (wch2) is defined. I didn't want to mess with that in the moment, so it was left alone. I'm sure it can probably get nuked too, but whatev.
Then I did some more cleanup in the ProcessInput, mostly around using switches instead of large if/else constructs.
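Roughly this shape of cleanup, shown here with a reduced stand-in rather than the actual driver code:

```csharp
using System;

class SwitchCleanupDemo
{
    enum KeyCode : uint { Esc = 27, Backspace = 127 }   // reduced stand-in

    // A switch expression replacing an if/else chain over the same value.
    static KeyCode Translate (int code) => code switch
    {
        27  => KeyCode.Esc,
        127 => KeyCode.Backspace,
        _   => (KeyCode)code,
    };

    static void Main () => Console.WriteLine (Translate (27));
}
```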
The end result is that, versus the code before those changes, the tests covering them show somewhere between a 10x and 20x speedup, plus a reduction in heap allocations and copies during those methods significant enough to be visible in the profiler, even though the values involved are all small and seemingly innocent. This basically means the compiler must have been generating wrapper classes for the old handler that took 5 refs.
With the ref struct and the arrays being collection expressions, the compiler can very aggressively optimize that, now.
Now...
Not to oversell that speedup, though... On the test machine, we're talking those tests taking 0.004 seconds instead of 0.07 seconds, for example. It's within the realm of human perception, especially for a fast typist, but it isn't like this made the whole program noticeably faster beyond a few dozen milliseconds of input lag.
But hey, every bit helps, right?
I have plenty more to do in there before I'm actually finished with it and go back to the TextFormatter class.
I've already reduced the number of warnings, suggestions, and tips from the analyzer for CursesDriver from an almost full scroll bar (when I started) to this, at this point:
[screenshot: remaining analyzer messages]
After I finish with this, though, I think I'll put in a PR at a working commit so others can rebase and whatnot, if it isn't too difficult to separate (it might be).
tig commented on Mar 3, 2024
Re `HandleEscSeqResponse`, see `EscSeqUtils` and #2803.
That code is fragile, not well tested, and hard to understand. We need a robust ANSI esc sequence parsing engine...
tig commented on Mar 3, 2024
Oh, as I did all the driver refactoring work in the past 6 months, CursesDriver has been the most neglected. Note that it still does not support TrueColor.
dodexahedron commented on Mar 3, 2024
Ha, yeah, there's a lot of original code in there. It's some of the most obviously different code in the project.
And the PInvoke mechanism with pre-caching the delegates and whatnot did make sense back then, but is now just needless complexity. I'm finding it hard not to just go ahead and address whatever I run into in there. 😅
dodexahedron commented on Mar 3, 2024
Honestly, some of the old generated code in there makes some sense, and the intent behind the KeyCode enum also makes some sense (more than the old stuff). Obviously, the class full of constants is a C-ism; the code generator that was used didn't understand that it makes more sense in C# as an enum. But oddly enough there's value both ways, so I'm torn on that one.
But, with the way the compiler is treating the uint, it's actually treating literals of the enum as values to be cast at run-time, not as constants, and that's not something I expected to see.
I think the best solution lies somewhere in between, honestly.
The way we use that enum all over the place, we need to be able to treat it as a numeric value. But uint enums can't do that without a run-time cast (oddly unless they are subjected to an arithmetic operation, in which case they ARE implicitly numeric).
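A quick standalone illustration of that cast behavior, using a reduced stand-in for the real enum:

```csharp
using System;

[Flags]
enum KeyCode : uint { CharMask = 0x001F_FFFF, SpecialMask = 0xFFE0_0000 }

class EnumCastDemo
{
    static void Main ()
    {
        KeyCode key = KeyCode.CharMask;

        // uint raw = key;            // CS0266: no implicit numeric conversion
        uint raw = (uint)key;         // an explicit cast is always required

        // ...yet arithmetic and bitwise operators treat it numerically:
        KeyCode next   = key + 1;                    // E + int -> E is built in
        KeyCode masked = key & KeyCode.SpecialMask;  // flags-style AND

        Console.WriteLine ($"{raw:X} {next} {masked}");
    }
}
```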
I'm wondering if it's a caveat of uint (seems likely), so I'm gonna run to the docs to check the spec on that. I know uint and ulong aren't CLS compliant, at least, so I wouldn't be surprised at all if they're less capable than their signed brethren.
If that turns out to be the case, we might be able to just turn it into a long, for now, and isolate both 32-bit halves for each purpose.
But...
I think the best ultimate solution is in a well-defined struct that keeps things as more primitive structures and exposes an enum (with implicit cast operators) for seamless use both as a numeric value (also using cast operators) and as an enumerated value, for the convenience it gives for things like configuration with the built-in behaviors of enums.
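A hypothetical sketch of that "in between" design - a thin struct over the raw bits with implicit casts both ways, so call sites get seamless numeric and enum behavior. All names here are invented:

```csharp
using System;

enum KeyCode : uint { Esc = 27 }   // reduced stand-in for the real enum

readonly struct Key32
{
    private readonly uint _value;
    private Key32 (uint value) => _value = value;

    public static implicit operator Key32 (KeyCode k) => new ((uint)k);
    public static implicit operator KeyCode (Key32 k) => (KeyCode)k._value;
    public static implicit operator uint (Key32 k) => k._value;
    public static implicit operator Key32 (uint v) => new (v);

    public override string ToString () => $"0x{_value:X}";
}

class Demo
{
    static void Main ()
    {
        Key32 k = KeyCode.Esc;   // enum -> struct, implicit
        uint raw = k;            // struct -> uint, implicit
        Console.WriteLine ($"{k} ({raw})");
    }
}
```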
Now... Since it's all just 32-bit binary values in the current code, I'm using `Unsafe.As<int, KeyCode>`, `Unsafe.As<int, uint>`, etc. to do the equivalent of a C-style static cast of one pointer type to another, which is pretty much the cheapest way to convert a type in dotnet (or anywhere else, really), and is perfectly fine so long as you are sure the values are compatible. Since we know they're compatible, being 32-bit values, it's not "unsafe" at all. :)
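As a sketch, the pattern looks like this (reduced stand-in enum; only the Unsafe.As calls are the point):

```csharp
using System;
using System.Runtime.CompilerServices;

enum KeyCode : uint { A = 65 }   // reduced stand-in for the real enum

class UnsafeAsDemo
{
    static void Main ()
    {
        KeyCode key = KeyCode.A;

        // Reinterpret the same 32 bits in place - no conversion code emitted.
        // This is only sound because the underlying type is exactly 32 bits.
        ref uint raw = ref Unsafe.As<KeyCode, uint> (ref key);
        raw |= 0x20;                   // writes flow straight back into `key`

        Console.WriteLine ((uint)key); // 97
    }
}
```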
dodexahedron commented on Mar 3, 2024
Hell, it may even be advantageous to store two 32-bit values inside a Vector64 or a wider type, to enable some wicked fast operations over sequences of values (which would end up getting SSE-ified by the compiler at that point). Not to mention we have direct access to all those wide instructions via those classes, which are great for a lot of that bitwise math.
However the data is stored, we can expose it in any way we like, for the public API.
It's probably a good idea for us to abstract away the implementation a little bit, anyway, to protect against breakage around version changes.
Actually...
Thinking about it a little more...
That's actually very likely a case where the strategy of doing multiple possible cases and just selecting the right one is significantly faster than anything else.
What I mean is that, if there's a method like the one I'm working on right now - one with a bunch of conditionals based on the same or slightly different values, which then do more math based on that decision - you load the values and constants into a Vector128 or something and apply all of those operations at once, up front. Then each conditional is just a compare and a jump, and the value is already computed, all in fewer clock cycles, even though the processor threw away 3/4 of the values it computed.
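A sketch of that compute-everything-then-select idea (the masks are illustrative, and the vector operators need .NET 7 or later):

```csharp
using System;
using System.Runtime.Intrinsics;

class BranchlessSelectDemo
{
    // Compute every candidate result up front in one vector op, then let the
    // "conditional" collapse to a plain element select.
    static uint MaskFor (uint value, int strategy)
    {
        Vector128<uint> candidates =
            Vector128.Create (value) & Vector128.Create (
                0x001F_FFFFu,   // char bits
                0xFFE0_0000u,   // special bits
                0xFFFF_FFFFu,   // pass-through
                0x0000_0000u);  // zero

        // Three of the four lanes get thrown away, but all four were computed
        // in fewer cycles than a chain of compares and scalar ANDs.
        return candidates.GetElement (strategy);
    }

    static void Main () => Console.WriteLine ($"{MaskFor (0x0010_FFFF, 0):X}");
}
```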
dodexahedron commented on Mar 3, 2024
Damn. The compiler behavior around those enums is exactly as the first line describing enum conversions states (emphasis mine):
That seriously sucks! :(
That makes things like that Curses class with the switch to convert from one enum to another even more expensive than it seems. Extremely minor, of course, in the grand scheme of things, but I'm just wondering how I didn't know this (or, more likely, how I forgot it, lol) in the 23+ years I've been using C#. 😆
I suppose I do tend to avoid large enums in favor of formal structs or classes, because of the lack of flexibility in enums such as not being able to write operators. Extension methods are a pity prize. I want operators, damn it!