@konard commented Sep 14, 2025

🎯 Issue Analysis

This PR addresses issue #19 by comparing the performance of two approaches:

  1. Current approach: Precomputed lookup table _bitsSetIn16Bits (1024KB)
  2. Alternative approach: On-demand bit position calculation using modern CPU instructions

🧪 Methodology

Correctness Verification

  • ✅ Tested all 65,536 possible 16-bit values
  • ✅ Verified both approaches produce identical results
  • ✅ Tested realistic GetBits(long word, ...) usage patterns
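
The exhaustive check described above could look roughly like the following. This is a minimal sketch, not the contents of CorrectnessTest.cs: the class name `CorrectnessSketch` and the methods `BuildTable`, `CalculateOnDemand`, and `CountMismatches` are hypothetical, and the table layout (one `byte[]` of ascending bit positions per 16-bit value) is assumed from the description of `_bitsSetIn16Bits`.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class CorrectnessSketch
{
    // Hypothetical table builder: tests each of the 16 bit positions in turn.
    public static byte[][] BuildTable()
    {
        var table = new byte[65536][];
        for (int value = 0; value < table.Length; value++)
        {
            var positions = new List<byte>();
            for (byte bit = 0; bit < 16; bit++)
            {
                if ((value & (1 << bit)) != 0)
                {
                    positions.Add(bit);
                }
            }
            table[value] = positions.ToArray();
        }
        return table;
    }

    // Hypothetical on-demand variant: reads the lowest set bit, then clears it.
    public static byte[] CalculateOnDemand(ushort value)
    {
        uint remaining = value;
        var positions = new byte[BitOperations.PopCount(remaining)];
        for (int i = 0; i < positions.Length; i++)
        {
            positions[i] = (byte)BitOperations.TrailingZeroCount(remaining);
            remaining &= remaining - 1; // clear the lowest set bit
        }
        return positions;
    }

    // Returns the number of 16-bit values on which the two approaches disagree.
    public static int CountMismatches()
    {
        byte[][] table = BuildTable();
        int mismatches = 0;
        for (int value = 0; value <= ushort.MaxValue; value++)
        {
            byte[] onDemand = CalculateOnDemand((ushort)value);
            if (!table[value].AsSpan().SequenceEqual(onDemand))
            {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

A result of zero from `CountMismatches()` means the two approaches agree on every possible 16-bit input.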

Performance Benchmarks

Created comprehensive benchmarks in BitsSetIn16BitsBenchmark.cs:

  • Multiple test scenarios with various bit patterns
  • Both simple operations and realistic BitString usage
  • BenchmarkDotNet integration for precise measurements

📊 Results

Performance Comparison

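| Operation                             | Lookup Table | On-Demand | Speedup (lookup table vs on-demand) |
|---------------------------------------|--------------|-----------|-------------------------------------|
| Simple bit extraction (1M iterations) | 197ms        | 4,790ms   | 24.31×                              |
| GetBits pattern (100K iterations)     | 3ms          | 125ms     | 41.67×                              |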

Memory & Initialization

| Metric              | Lookup Table    | On-Demand     |
|---------------------|-----------------|---------------|
| Memory usage        | 1,024 KB        | 0 KB          |
| Initialization time | 97ms (one-time) | 0ms           |
| Runtime computation | None            | Per operation |

🎯 Recommendation

Keep the current lookup table approach for these reasons:

✅ Performance Critical

  • BitString is designed for high-performance bit operations
  • In these benchmarks the lookup table is 24-42× faster, which outweighs its 1MB memory cost
  • Consistent with codebase's performance-first design (vectorization, parallelization)

✅ Memory Cost is Reasonable

  • 1MB is minimal for modern systems
  • One-time 97ms initialization cost
  • Used extensively in performance-critical operations:
    • CountSetBitsForWord()
    • AppendAllSetBitIndices()
    • GetFirstSetBitForWord()
    • GetLastSetBitForWord()
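
As an illustration of why a 16-bit table can serve 64-bit words, a word can be split into four 16-bit chunks, each used as a table index. The sketch below is written under assumptions: `WordLookupSketch`, `CountSetBits`, and `AppendSetBitIndices` are hypothetical names with hypothetical signatures, not the repository's methods, and the table layout is assumed to be one `byte[]` of ascending bit positions per 16-bit value.

```csharp
using System.Collections.Generic;

public static class WordLookupSketch
{
    // Assumed layout: Table[v] lists the positions (0..15) of the set bits of v, ascending.
    private static readonly byte[][] Table = BuildTable();

    private static byte[][] BuildTable()
    {
        var table = new byte[65536][];
        for (int value = 0; value < table.Length; value++)
        {
            var positions = new List<byte>();
            for (byte bit = 0; bit < 16; bit++)
            {
                if ((value & (1 << bit)) != 0)
                {
                    positions.Add(bit);
                }
            }
            table[value] = positions.ToArray();
        }
        return table;
    }

    // Counts set bits by summing the lengths of the four table entries for the word.
    public static int CountSetBits(long word)
    {
        int count = 0;
        for (int chunk = 0; chunk < 4; chunk++)
        {
            // Masking with 0xFFFF keeps only the current 16-bit chunk,
            // which also discards sign extension from the arithmetic shift.
            count += Table[(int)((word >> (chunk * 16)) & 0xFFFF)].Length;
        }
        return count;
    }

    // Appends the positions (0..63) of all set bits of the word, in ascending order.
    public static void AppendSetBitIndices(long word, List<int> output)
    {
        for (int chunk = 0; chunk < 4; chunk++)
        {
            int shift = chunk * 16;
            foreach (byte position in Table[(int)((word >> shift) & 0xFFFF)])
            {
                output.Add(shift + position);
            }
        }
    }
}
```

Each call performs four array lookups and no per-bit computation, which is the property the benchmarks in this PR measure against the on-demand alternative.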

✅ Real-World Usage Patterns

The GetBits method is called frequently during:

  • Bit counting operations
  • Index enumeration
  • BitString arithmetic operations

Because the 97ms initialization is paid only once, the lookup table's cost is amortized over these repeated calls.

🛠 Files Added

Benchmarks

  • BitsSetIn16BitsBenchmark.cs - Comprehensive BenchmarkDotNet tests
  • Updated Program.cs to run new benchmarks

Experiments

  • BitSetComparison.cs - Simple performance comparison
  • CorrectnessTest.cs - Exhaustive correctness verification
  • BenchmarkResults.md - Detailed analysis and recommendations

🔍 Technical Details

Current Lookup Table Implementation

private static readonly byte[][] _bitsSetIn16Bits; // 65536 entries
// Usage: _bitsSetIn16Bits[value] returns byte[] of set bit positions

Alternative On-Demand Implementation

// Uses .NET methods that map to modern CPU instructions: BitOperations.PopCount, BitOperations.TrailingZeroCount
private static byte[] CalculateBitsOnDemand(ushort value) { ... }

Both approaches are functionally equivalent but have vastly different performance characteristics.

🏁 Conclusion

The benchmarks show that the current lookup table approach is the better fit for BitString's performance-critical use case. The lookup table was 24-42× faster than on-demand calculation in the measured scenarios, which justifies its 1MB memory cost given BitString's role in high-performance bit manipulation.

Fixes #19


🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

Adding CLAUDE.md with task information for AI processing.
This file will be removed when the task is complete.

Issue: #19
@konard konard self-assigned this Sep 14, 2025
…s on-demand calculation

- Created BitsSetIn16BitsBenchmark.cs with multiple benchmark scenarios
- Added experimental scripts to test performance and correctness
- Verified both approaches produce identical results for all 65,536 test cases
- Performance results show lookup table is 24-42x faster than on-demand calculation
- Lookup table cost: 1024KB of memory plus a one-time ~97ms initialization
- Recommendation: Keep current lookup table approach for performance-critical BitString operations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@konard changed the title from "[WIP] Compare what is faster to store BitsSetIn16Bits or to calculate it on demand" to "Performance analysis: BitsSetIn16Bits lookup table vs on-demand calculation" Sep 14, 2025
@konard konard marked this pull request as ready for review September 14, 2025 11:11