label_majority computed but not used in final output — intentional? #81
santhil-cyber
started this conversation in
General
Replies: 1 comment
Hi Santhil,
Thanks for looking into the code so carefully.
Yes, your understanding about this is correct.
The code calculates label_majority using the majority rule to reduce noise
from individual sliding windows. However, in the final step the pipeline
currently uses label, which is created using OR operations across the
windows.
Right now, label_majority is mainly used for analysis and visualization in
the plots, and we are also trying to improve it. It is not currently used
for generating the final TIFF or shapefile output.
Using label makes the water labels more inclusive, while the majority rule
would be more conservative. So this is definitely a good point to review.
Thanks for bringing this up.
Looking forward to your updates.
Regards,
Ritika
On Mon, Feb 23, 2026 at 2:18 PM Santhil Kherwal ***@***.***> wrote:
Hey @fwitmer <https://github.com/fwitmer>, @rawann31
<https://github.com/rawann31> and @Ritika-K7
<https://github.com/Ritika-K7>, I was exploring UAA GSoC projects and
this one immediately caught my attention. What drew me in wasn't just the
engineering side — the combination of computer vision, geospatial analysis
and a very specific real-world problem tied to Alaskan coastlines made it
genuinely exciting to dig into.
Quick intro: I'm Santhil Kherwal, a 2nd year undergrad at JIIT. I've
always been drawn to problems that sit at the intersection of systems
engineering and applied research, and adaptive coastline extraction feels
like exactly that kind of space.
I've already set up the codebase locally and spent the last few days
genuinely exploring it — not just reading the README but actually tracing
through the NDWI pipeline, understanding how the sliding window approach
feeds into the Otsu thresholding, and going through ndwi_labels.py in
detail. While doing this I noticed something I wanted to flag before
assuming it's a bug.
The code computes label_majority at line 183 using a 0.55 majority
threshold across overlapping windows — so a pixel only gets classified as
water if more than 55% of the windows covering it agree. That feels like a
solid way to reduce noise from any single window's Otsu threshold misfiring.
label_majority = np.where(water_count > (buffer_numbers * MAJORITY_THRESHOLD), 1, 0)
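For anyone following along, here is a minimal sketch of how that majority rule behaves on a 1-D strip of pixels. This is not the code from ndwi_labels.py: the window width, stride, and per-window classifications are made up for illustration, and only the names water_count, buffer_numbers, and MAJORITY_THRESHOLD (with the 0.55 value) come from the line quoted above.

```python
import numpy as np

MAJORITY_THRESHOLD = 0.55  # value quoted in the post

# Hypothetical setup: 6 pixels, windows of width 4 sliding with stride 1.
n_pixels, win, stride = 6, 4, 1

# Made-up per-window water classifications (1 = water), one entry per window.
window_labels = [
    np.array([0, 0, 1, 1]),  # window covering pixels 0..3
    np.array([1, 1, 1, 1]),  # window covering pixels 1..4
    np.array([0, 0, 1, 1]),  # window covering pixels 2..5
]

water_count = np.zeros(n_pixels, dtype=int)     # water votes per pixel
buffer_numbers = np.zeros(n_pixels, dtype=int)  # windows covering each pixel

for i, lab in enumerate(window_labels):
    start = i * stride
    water_count[start:start + win] += lab
    buffer_numbers[start:start + win] += 1

# Same expression as the quoted line: water only if the vote share exceeds 55%.
label_majority = np.where(water_count > (buffer_numbers * MAJORITY_THRESHOLD), 1, 0)

print(water_count)     # [0 1 2 2 2 1]
print(buffer_numbers)  # [1 2 3 3 2 1]
print(label_majority)  # [0 0 1 1 1 1]
```

Pixel 1 is the interesting case here: one of its two windows votes water, which is 50% and therefore below the threshold, so the majority rule leaves it as non-water.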
But the final ndwi_concatenated output at line 192 uses label instead,
which is built with OR operations across windows — meaning a pixel becomes
water if *any* window classifies it as such.
ndwi_concatenated = np.where(sliding_windows == 1, label, ndwi_classified)
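A small sketch of what that merge step does, under some assumptions on my side: I am treating sliding_windows as a mask that is 1 wherever the sliding windows were applied, and ndwi_classified as a fallback classification for the remaining pixels. The arrays and the way label is built here are invented for illustration and are not taken from the repository.

```python
import numpy as np

# Assumed meaning: 1 where sliding windows were applied, 0 elsewhere.
sliding_windows = np.array([0, 1, 1, 1, 1, 0])

# Hypothetical per-window water masks, already aligned to the full strip.
window_masks = np.array([
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0],
])

# OR across windows: a pixel is water if any window classified it as water.
label = np.logical_or.reduce(window_masks, axis=0).astype(int)

# Made-up fallback classification for pixels outside the windows.
ndwi_classified = np.array([1, 1, 0, 1, 0, 0])

# Same expression as the quoted line: take label inside the windowed
# region and ndwi_classified everywhere else.
ndwi_concatenated = np.where(sliding_windows == 1, label, ndwi_classified)

print(label)              # [0 0 1 1 1 0]
print(ndwi_concatenated)  # [1 0 1 1 1 0]
```

Pixels 2 and 4 become water even though only one of the three windows flagged them, which is the inclusive behaviour described above.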
From what I can tell, label_majority gets computed and plotted in
save_ndwi_plots() but never actually makes it into the final TIFF or
shapefile output.
*Is this intentional?* I'm wondering if there's a reason the OR-based
label is preferred over the majority vote for the final output — maybe
the downstream U-Net training pipeline expects more inclusive labels, or
there's a recall vs precision trade-off I'm not seeing here.
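To illustrate the trade-off I have in mind, here is a toy comparison against a made-up reference mask. The helper precision_recall and all three arrays are hypothetical and only show the direction of the effect, not results from the actual pipeline.

```python
import numpy as np

def precision_recall(pred, truth):
    """Return (precision, recall) for binary masks where 1 = water."""
    tp = int(np.sum((pred == 1) & (truth == 1)))
    fp = int(np.sum((pred == 1) & (truth == 0)))
    fn = int(np.sum((pred == 0) & (truth == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up reference mask and two made-up predictions.
truth          = np.array([0, 0, 1, 1, 1, 1, 0, 0])
label          = np.array([0, 1, 1, 1, 1, 1, 1, 0])  # OR-style: inclusive
label_majority = np.array([0, 0, 0, 1, 1, 1, 0, 0])  # majority-style: conservative

p_or, r_or = precision_recall(label, truth)
p_maj, r_maj = precision_recall(label_majority, truth)

print(f"OR:       precision={p_or:.2f} recall={r_or:.2f}")    # 0.67, 1.00
print(f"majority: precision={p_maj:.2f} recall={r_maj:.2f}")  # 1.00, 0.75
```

In this toy case the OR-style mask catches every water pixel but adds false positives, while the majority-style mask has no false positives but misses one water pixel.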
I may be missing context from earlier design decisions, so happy to be
corrected. Just wanted to flag it before going further!
Thanks 🙏