From cec40aa4e5e33bd8137c8fe48ea7f7785f314eb4 Mon Sep 17 00:00:00 2001
From: "codeflash-ai[bot]" <148906541+codeflash-ai[bot]@users.noreply.github.com>
Date: Thu, 18 Sep 2025 21:58:15 +0000
Subject: [PATCH] Optimize histogram_equalization

The optimized code achieves a **284x speedup** by replacing nested Python loops with vectorized NumPy operations, eliminating the primary performance bottlenecks.

**Key optimizations:**

1. **Histogram computation**: Replaced nested loops with `np.add.at(histogram, image.ravel(), 1)` - this single vectorized operation eliminates ~4 million loop iterations that were consuming 14.4% of runtime.

2. **Pixel mapping**: Replaced the second set of nested loops with `np.round(cdf[image] * 255)` - uses advanced NumPy indexing to apply the CDF transformation to the entire image at once. This eliminates another ~4 million loop iterations that were consuming 79.2% of runtime.

3. **Memory allocation**: Removed `np.zeros_like(image)` allocation since the result is computed directly from the CDF operation.

**Why this works so well:**
- Python loops have significant per-iteration overhead (~500-600ns per iteration based on profiler data)
- NumPy's vectorized operations run in optimized C code with minimal Python overhead
- Advanced indexing (`cdf[image]`) efficiently broadcasts the CDF lookup across the entire array
- `np.add.at` handles the histogram binning in a single pass without Python loop overhead

**Test case performance patterns:**
- Small images (16 pixels): 24-69% faster due to reduced loop overhead
- Medium images (1K pixels): 1600-4700% faster as vectorization benefits compound
- Large images (1M pixels): 24000-34000% faster where the optimization truly shines

The CDF calculation loop remains unchanged since it's only 256 iterations and represents minimal runtime impact.
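
**Illustrative equivalence check (not part of the patch):**

The sketch below is one way to confirm that the vectorized steps described above produce the same output as the nested loops they replace. The function names `equalize_loops` and `equalize_vectorized`, the random 32x32 `uint8` sample and the fixed seed are assumptions made for this example; the function bodies mirror the pre-patch and post-patch code in the diff. It assumes a 2-D integer image with values in 0..255.

```python
import numpy as np


def equalize_loops(image: np.ndarray) -> np.ndarray:
    """Pre-patch behaviour: per-pixel Python loops (hypothetical name)."""
    height, width = image.shape
    total_pixels = height * width
    histogram = np.zeros(256, dtype=int)
    for y in range(height):
        for x in range(width):
            histogram[image[y, x]] += 1
    cdf = np.zeros(256, dtype=float)
    cdf[0] = histogram[0] / total_pixels
    for i in range(1, 256):
        cdf[i] = cdf[i - 1] + histogram[i] / total_pixels
    equalized = np.zeros_like(image)
    for y in range(height):
        for x in range(width):
            equalized[y, x] = np.round(cdf[image[y, x]] * 255)
    return equalized


def equalize_vectorized(image: np.ndarray) -> np.ndarray:
    """Post-patch behaviour: vectorized histogram and lookup (hypothetical name)."""
    height, width = image.shape
    total_pixels = height * width
    histogram = np.zeros(256, dtype=int)
    # Unbuffered in-place add: repeated pixel values each increment their bin.
    np.add.at(histogram, image.ravel(), 1)
    cdf = np.zeros(256, dtype=float)
    cdf[0] = histogram[0] / total_pixels
    for i in range(1, 256):
        cdf[i] = cdf[i - 1] + histogram[i] / total_pixels
    # Integer-array indexing looks up the CDF value for every pixel at once.
    equalized = np.round(cdf[image] * 255)
    return equalized.astype(image.dtype)


# Assumed test input: a small random 8-bit image with a fixed seed.
rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
match = np.array_equal(equalize_loops(sample), equalize_vectorized(sample))
print("outputs identical:", match)
```

Both functions build the CDF with the same 256-iteration loop, so any difference in output would have to come from the histogram or pixel-mapping steps.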
---
 src/numpy_pandas/signal_processing.py | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/src/numpy_pandas/signal_processing.py b/src/numpy_pandas/signal_processing.py
index 0fe8e2c..52223e0 100644
--- a/src/numpy_pandas/signal_processing.py
+++ b/src/numpy_pandas/signal_processing.py
@@ -90,15 +90,10 @@ def histogram_equalization(image: np.ndarray) -> np.ndarray:
     height, width = image.shape
     total_pixels = height * width
     histogram = np.zeros(256, dtype=int)
-    for y in range(height):
-        for x in range(width):
-            histogram[image[y, x]] += 1
+    np.add.at(histogram, image.ravel(), 1)
     cdf = np.zeros(256, dtype=float)
     cdf[0] = histogram[0] / total_pixels
     for i in range(1, 256):
         cdf[i] = cdf[i - 1] + histogram[i] / total_pixels
-    equalized = np.zeros_like(image)
-    for y in range(height):
-        for x in range(width):
-            equalized[y, x] = np.round(cdf[image[y, x]] * 255)
-    return equalized
+    equalized = np.round(cdf[image] * 255)
+    return equalized.astype(image.dtype)