⚡️ Speed up function `histogram_equalization` by 22,436% #109

codeflash-ai · 2025-09-18T21:44:04Z

📄 22,436% (224.36x) speedup for `histogram_equalization` in `src/numpy_pandas/signal_processing.py`

⏱️ Runtime : 8.55 seconds → 37.9 milliseconds (best of 99 runs)
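The headline numbers can be cross-checked from the two reported runtimes. The sketch below assumes the percentage is computed as (old - new) / new. That convention is an assumption, not something the PR states, and the variable names are illustrative.

```python
# Reported runtimes from the line above.
old_runtime_s = 8.55       # original implementation, seconds
new_runtime_s = 37.9e-3    # optimized implementation, seconds

# Assumed convention: speedup = (old - new) / new.
speedup = (old_runtime_s - new_runtime_s) / new_runtime_s
print(f"{speedup:.1f}x, {speedup * 100:,.0f}%")
```

Under that assumption the result is roughly 224.6x, close to the reported 224.36x. The small gap is consistent with the displayed runtimes being rounded to three significant figures.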

📝 Explanation and details

The optimized code achieves a 224x speedup by replacing inefficient nested loops with vectorized NumPy operations:

**Key optimizations:**

1. **Histogram computation**: Replaced nested loops over every pixel with `np.bincount(image.ravel(), minlength=256)`. This counts pixel intensities in a single call into NumPy's compiled C implementation instead of ~5 million individual array accesses.
2. **CDF calculation**: Used `np.cumsum(histogram) / total_pixels` instead of a Python loop that accumulated the sum one bin at a time, eliminating 255 iterations.
3. **Pixel mapping**: The most critical optimization. Replaced a second set of nested loops with the vectorized lookup `cdf[image]`. Instead of 5+ million individual pixel assignments, the entire transformation runs as one operation using NumPy integer-array (advanced) indexing.
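The three steps above can be sketched as follows. This is a minimal illustration, not the code from the PR branch (the diff is not shown here). The function name `histogram_equalization_sketch`, the `* 255` rescaling, and the truncating cast to `uint8` are assumptions, chosen because they reproduce the `expected` array in `test_gradient_image`.

```python
import numpy as np

def histogram_equalization_sketch(image: np.ndarray) -> np.ndarray:
    """Vectorized histogram equalization for a uint8 image (illustrative only)."""
    # Step 1: count how often each intensity 0..255 occurs, in one call.
    histogram = np.bincount(image.ravel(), minlength=256)
    # Step 2: normalized cumulative distribution; cdf[v] is the fraction of pixels <= v.
    cdf = np.cumsum(histogram) / image.size
    # Step 3: look up every pixel's CDF value at once, then rescale to 0..255.
    # Assumption: values are truncated when cast to uint8; the real function may round.
    return (cdf[image] * 255).astype(np.uint8)

img = np.array([[0, 64], [128, 192]], dtype=np.uint8)
print(histogram_equalization_sketch(img))
```

For this 2x2 input the CDF values are 0.25, 0.5, 0.75, and 1.0, so the sketch returns 63, 127, 191, and 255.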

**Why this is so much faster:**

- The original code spent 77% of its time in the final nested loop doing individual pixel assignments.
- Python loops carry significant per-iteration interpreter overhead, whereas NumPy's vectorized operations run in compiled C.
- Memory access patterns are much more efficient with vectorized operations.
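To make the comparison concrete, here is a hedged sketch that puts a loop-based version next to a vectorized one and checks that they agree. Both functions are reconstructions from the description above, not the repository's code, and the names `equalize_loops` and `equalize_vectorized` are hypothetical. Timings depend on the machine, so none are quoted.

```python
import time
import numpy as np

def equalize_loops(image):
    """Loop-based reference mirroring the description of the original (assumed)."""
    height, width = image.shape
    histogram = [0] * 256
    for y in range(height):            # one Python-level access per pixel
        for x in range(width):
            histogram[int(image[y, x])] += 1
    cdf, running = [0.0] * 256, 0
    for level in range(256):           # cumulative sum, one bin at a time
        running += histogram[level]
        cdf[level] = running / (height * width)
    out = np.zeros_like(image)
    for y in range(height):            # one Python-level assignment per pixel
        for x in range(width):
            out[y, x] = int(cdf[int(image[y, x])] * 255)
    return out

def equalize_vectorized(image):
    """Same computation expressed as three NumPy calls (assumed scaling)."""
    histogram = np.bincount(image.ravel(), minlength=256)
    cdf = np.cumsum(histogram) / image.size
    return (cdf[image] * 255).astype(np.uint8)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(200, 200), dtype=np.uint8)

start = time.perf_counter()
slow = equalize_loops(img)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = equalize_vectorized(img)
vector_seconds = time.perf_counter() - start

print(f"loops: {loop_seconds:.4f}s, vectorized: {vector_seconds:.4f}s")
print("identical output:", np.array_equal(slow, fast))
```

The image is kept small (200x200) so the loop version finishes quickly. The gap widens with image size because the per-pixel interpreter overhead grows with the pixel count.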

**Performance characteristics:**

The optimization is particularly effective for:

- Large images (the 1000x1000 tests show dramatic speedups)
- Images with any intensity distribution (uniform, gradient, sparse, and random inputs all benefit equally)

All test cases produce output identical to the original implementation while gaining these performance improvements.

The vectorized approach scales much better with image size, making it suitable for real-world image processing applications.

✅ Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 36 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

import numpy as np
# imports
import pytest  # used for our unit tests
from src.numpy_pandas.signal_processing import histogram_equalization

# unit tests

# ----------------
# Basic Test Cases
# ----------------

def test_single_color_image():
    # All pixels are the same value; histogram equalization should result in all pixels being 255
    img = np.full((4, 4), 100, dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_two_color_image():
    # Two unique values, half and half
    img = np.array([[0, 0, 255, 255],
                    [0, 0, 255, 255],
                    [0, 0, 255, 255],
                    [0, 0, 255, 255]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_linear_gradient():
    # A 4x4 image with values from 0 to 15
    img = np.arange(16, dtype=np.uint8).reshape((4, 4))
    codeflash_output = histogram_equalization(img); result = codeflash_output
    # Each value should be mapped to a unique value from 16 to 255, spaced evenly
    unique_vals = np.unique(result)

def test_uniform_distribution():
    # All values from 0 to 255 appear exactly once in a 16x16 image
    img = np.arange(256, dtype=np.uint8).reshape((16, 16))
    codeflash_output = histogram_equalization(img); result = codeflash_output
    # Output should be a linear ramp from 1 to 255 (since CDF is linear)
    unique_vals = np.unique(result)

# ----------------
# Edge Test Cases
# ----------------

def test_minimal_image_size():
    # 1x1 image, should always be 255 after equalization
    img = np.array([[42]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_all_zeros():
    # All pixels are zero; output should be all 255
    img = np.zeros((8, 8), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_all_max():
    # All pixels are 255; output should be all 255
    img = np.full((8, 8), 255, dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_alternating_pattern():
    # Checkerboard of 0 and 255
    img = np.indices((4, 4)).sum(axis=0) % 2 * 255
    img = img.astype(np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_sparse_image():
    # Only two pixels are nonzero
    img = np.zeros((8, 8), dtype=np.uint8)
    img[0, 0] = 50
    img[7, 7] = 200
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_non_contiguous_values():
    # Only values 10, 100, 200 appear
    img = np.array([[10, 10, 100, 100],
                    [200, 200, 10, 100]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output
    # The mapping should be strictly increasing
    vals = sorted(set(img.flatten()))
    mapped = [result[img == v][0] for v in vals]

# ------------------------
# Large Scale Test Cases
# ------------------------

def test_large_uniform_image():
    # Large image, all values the same
    img = np.full((1000, 1), 123, dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_large_gradient():
    # 4x256 image: the ramp 0 to 255 repeated on each of 4 rows
    img = np.tile(np.arange(256, dtype=np.uint8), (4, 1))
    codeflash_output = histogram_equalization(img); result = codeflash_output
    unique_vals = np.unique(result)

def test_large_random_image():
    # Large random image
    rng = np.random.default_rng(42)
    img = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_large_sparse_image():
    # Large image with only a few unique values
    img = np.zeros((100, 10), dtype=np.uint8)
    img[0:50, :] = 50
    img[50:, :] = 200
    codeflash_output = histogram_equalization(img); result = codeflash_output
    vals = sorted(set(img.flatten()))
    mapped = [result[img == v][0] for v in vals]

def test_performance_large_image():
    # Test that function runs efficiently on a reasonably large image
    img = np.random.randint(0, 256, size=(1000, 1), dtype=np.uint8)
    # Not measuring time, but ensuring no crash or excessive delay
    codeflash_output = histogram_equalization(img); result = codeflash_output

# -------------
# Miscellaneous
# -------------

def test_input_not_modified():
    # Ensure the input array is not modified in-place
    img = np.arange(16, dtype=np.uint8).reshape((4, 4))
    img_copy = img.copy()
    codeflash_output = histogram_equalization(img); _ = codeflash_output

def test_dtype_preserved():
    # Output dtype should match input dtype
    img = np.random.randint(0, 256, size=(5, 5), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output

def test_shape_preserved():
    # Output shape should match input shape
    img = np.random.randint(0, 256, size=(7, 3), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import numpy as np
# imports
import pytest  # used for our unit tests
from src.numpy_pandas.signal_processing import histogram_equalization

# unit tests

# 1. Basic Test Cases

def test_uniform_image():
    # All pixels have the same value: output should be uniform
    # (all 255 if output = cdf * 255, as in test_single_color_image)
    img = np.full((4, 4), 128, dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_bi_level_image():
    # Image with two intensity levels
    img = np.array([[0, 0, 255, 255],
                    [0, 0, 255, 255],
                    [0, 0, 255, 255],
                    [0, 0, 255, 255]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_gradient_image():
    # Simple gradient image: [0, 64, 128, 192]
    img = np.array([[0, 64], [128, 192]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Each value should be equally spaced in output
    expected = np.array([[63, 127], [191, 255]], dtype=np.uint8)

def test_small_random_image():
    # Random small image
    img = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_already_equalized_image():
    # Already spread-out histogram (linear ramp)
    img = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16
    codeflash_output = histogram_equalization(img); out = codeflash_output

# 2. Edge Test Cases

def test_single_pixel_image():
    # Single pixel image
    img = np.array([[42]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output


def test_all_zero_image():
    # All pixels zero
    img = np.zeros((5,5), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_all_max_image():
    # All pixels 255
    img = np.full((5,5), 255, dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_image_with_missing_levels():
    # Image with only a few levels, others missing
    img = np.array([[10, 10, 10, 200, 200, 200],
                    [10, 10, 10, 200, 200, 200]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Should have only two values, 127 and 255
    unique = np.unique(out)

def test_image_with_noncontiguous_levels():
    # Only some levels present
    img = np.array([[0, 50, 100, 150, 200, 250]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_image_with_min_max_only():
    # Only min and max present
    img = np.array([[0, 255, 0, 255]], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Should map to 127 and 255
    unique = np.unique(out)

def test_image_with_large_gap():
    # Two clusters: low and high
    img = np.array([[10]*10 + [240]*10], dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Should map to two values, one near 127, one near 255
    unique = np.unique(out)

def test_dtype_preservation():
    # Output should have same dtype as input
    img = np.random.randint(0, 256, size=(4,4), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

# 3. Large Scale Test Cases

def test_large_uniform_image():
    # Large image, all pixels the same
    img = np.full((1000, 1000), 77, dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_large_gradient_image():
    # Large image with a gradient
    img = np.tile(np.linspace(0, 255, 1000, dtype=np.uint8), (1000,1))
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Output should be monotonic along the gradient axis
    for i in range(1000):
        pass

def test_large_random_image():
    # Large random image
    np.random.seed(0)
    img = np.random.randint(0, 256, size=(1000, 1000), dtype=np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output

def test_large_sparse_levels():
    # Large image, only a few levels used
    img = np.random.choice([0, 128, 255], size=(1000, 1000), p=[0.1, 0.8, 0.1]).astype(np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Should have at most 3 unique values
    unique = np.unique(out)

def test_large_checkerboard():
    # Large checkerboard pattern
    img = np.indices((1000, 1000)).sum(axis=0) % 2 * 255
    img = img.astype(np.uint8)
    codeflash_output = histogram_equalization(img); out = codeflash_output
    # Should have only two unique values
    unique = np.unique(out)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from src.numpy_pandas.signal_processing import histogram_equalization

To edit these changes, `git checkout codeflash/optimize-histogram_equalization-mfpxu9i4` and push.


codeflash-ai bot requested a review from KRRT7 September 18, 2025 21:44

codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Sep 18, 2025
