label_majority computed but not used in final output — intentional? #81

santhil-cyber · 2026-02-23T08:48:06Z

Hey @fwitmer, @rawann31 and @Ritika-K7, I was exploring UAA GSoC projects and this one immediately caught my attention. What drew me in wasn't just the engineering side — the combination of computer vision, geospatial analysis and a very specific real-world problem tied to Alaskan coastlines made it genuinely exciting to dig into.

Quick intro: I'm Santhil Kherwal, a 2nd year undergrad at JIIT. I've always been drawn to problems that sit at the intersection of systems engineering and applied research, and adaptive coastline extraction feels like exactly that kind of space.

I've already set up the codebase locally and spent the last few days genuinely exploring it — not just reading the README but actually tracing through the NDWI pipeline, understanding how the sliding window approach feeds into the Otsu thresholding, and going through ndwi_labels.py in detail. While doing this I noticed something I wanted to flag before assuming it's a bug.

The code computes label_majority at line 183 using a 0.55 majority threshold across overlapping windows — so a pixel only gets classified as water if more than 55% of the windows covering it agree. That feels like a solid way to reduce noise from any single window's Otsu threshold misfiring.

label_majority = np.where(water_count > (buffer_numbers * MAJORITY_THRESHOLD), 1, 0)
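For anyone following along, here is a minimal sketch of that majority rule on toy data. The array names mirror the line above, but the shapes and values are made up for illustration and are not taken from ndwi_labels.py; I'm assuming buffer_numbers holds the number of windows covering each pixel and water_count the number of those windows that voted water.

```python
import numpy as np

MAJORITY_THRESHOLD = 0.55  # value quoted in this post

# Hypothetical toy inputs (assumptions, not real pipeline data):
# buffer_numbers = how many windows cover each pixel
# water_count    = how many of those windows classified the pixel as water
buffer_numbers = np.array([[4, 4, 2],
                           [4, 3, 1]])
water_count = np.array([[3, 2, 2],
                        [1, 2, 0]])

# A pixel is water only if strictly more than 55% of its covering windows agree.
label_majority = np.where(water_count > (buffer_numbers * MAJORITY_THRESHOLD), 1, 0)

print(label_majority)
# [[1 0 1]
#  [0 1 0]]
```

Note that the pixel with 2 of 4 votes (exactly 50%) is rejected, which is the conservative behaviour described above.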

But the final ndwi_concatenated output at line 192 uses label instead, which is built with OR operations across windows — meaning a pixel becomes water if any window classifies it as such.

ndwi_concatenated = np.where(sliding_windows == 1, label, ndwi_classified)
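To make the contrast concrete, here is a rough sketch of how an OR-combined label flows into that np.where call. The window_masks stack and all values are hypothetical; I'm also assuming sliding_windows is a coverage mask (1 where at least one window touched the pixel) and ndwi_classified is a fallback classification for uncovered pixels, which may not match the actual code.

```python
import numpy as np

# Hypothetical: binary water masks from three overlapping windows on a 2x3 tile.
window_masks = np.array([
    [[1, 0, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 0]],
], dtype=np.uint8)

# OR across windows: a single "water" vote is enough.
label = np.logical_or.reduce(window_masks, axis=0).astype(np.uint8)

# Assumed meaning: 1 where sliding windows covered the pixel, 0 elsewhere.
sliding_windows = np.array([[1, 1, 1],
                            [1, 1, 0]])
# Assumed meaning: fallback classification used outside window coverage.
ndwi_classified = np.array([[0, 0, 0],
                            [1, 0, 1]])

# Same expression as in the post: window label where covered, fallback otherwise.
ndwi_concatenated = np.where(sliding_windows == 1, label, ndwi_classified)

print(ndwi_concatenated)
# [[1 0 1]
#  [0 1 1]]
```

Here the top-left pixel ends up as water on the strength of one window out of three, which a majority vote would have rejected.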

From what I can tell, label_majority gets computed and plotted in save_ndwi_plots() but never actually makes it into the final TIFF or shapefile output.

Is this intentional? I'm wondering if there's a reason the OR-based label is preferred over the majority vote for the final output — maybe the downstream U-Net training pipeline expects more inclusive labels, or there's a recall vs precision trade-off I'm not seeing here.
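One way to quantify that trade-off would be to count how many pixels the two rules disagree on. The sketch below does this on random votes, assuming (hypothetically) that five windows fully cover a 64x64 tile; label_or, only_in_or and only_in_majority are names introduced here for illustration, not identifiers from the repo.

```python
import numpy as np

MAJORITY_THRESHOLD = 0.55  # value quoted in this post

# Hypothetical per-window votes: 5 windows, each covering the whole 64x64 tile.
rng = np.random.default_rng(0)
votes = rng.integers(0, 2, size=(5, 64, 64))

water_count = votes.sum(axis=0)
buffer_numbers = np.full(water_count.shape, votes.shape[0])

# OR rule: any single water vote marks the pixel as water.
label_or = (water_count > 0).astype(np.uint8)
# Majority rule: more than 55% of covering windows must agree.
label_majority = np.where(water_count > buffer_numbers * MAJORITY_THRESHOLD, 1, 0)

# Pixels labelled water by one rule but not the other.
only_in_or = int(np.sum((label_or == 1) & (label_majority == 0)))
only_in_majority = int(np.sum((label_majority == 1) & (label_or == 0)))

print(f"water only under OR: {only_in_or}, water only under majority: {only_in_majority}")
```

The second count is always zero, because a pixel needs at least one water vote to pass the majority test. So the majority labels are a subset of the OR labels: switching to them can only remove water pixels, never add them.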

I may be missing context from earlier design decisions, so happy to be corrected. Just wanted to flag it before going further!

Thanks 🙏

Ritika-K7 · 2026-03-05T17:41:04Z

Hi Santhil,

Thanks for looking into the code so carefully. Yes, your understanding is correct. The code calculates label_majority using the majority rule to reduce noise from individual sliding windows. However, in the final step the pipeline currently uses label, which is created using OR operations across the windows.

Right now label_majority is mainly used for analysis and visualization in the plots, and we are also trying to improve it. It is not currently used for generating the final TIFF or shapefile output. Using label makes the water labels more inclusive, while the majority rule would be more conservative. So this is definitely a good point to review.

Thanks for bringing this up. Looking forward to your updates.

Regards,
Ritika

