
feat: Add segmentation GVI, refactor scoring images #55

Open · wants to merge 3 commits into main
Conversation

@dragonejt (Contributor) commented Jul 3, 2024

Notes

Changes

  • Add PyTorch and Hugging Face Transformers for the segmentation GVI method
  • Add output_file as an argument to the assign_images script
  • Rename assign_gvi_to_points to score_images
  • Add a ScoringMethod interface and a ScoringSelector enum
  • Add a PixelCounting GVI scoring method that implements ScoringMethod, using the Treepedia pixel-counting method
  • Add a Segmentation GVI scoring method that implements ScoringMethod, using the Mask2Former model from Hugging Face
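As a rough sketch of what the ScoringMethod interface and ScoringSelector enum could look like (hypothetical; the score_image signature and the toy PixelCounting body are illustrative, not the PR's actual code):

```python
from abc import ABC, abstractmethod
from enum import Enum


class ScoringMethod(ABC):
    """Common interface for GVI scoring implementations."""

    @abstractmethod
    def score_image(self, image) -> float:
        """Return the Green View Index of one image as a fraction in [0, 1]."""


class ScoringSelector(Enum):
    """CLI-selectable scoring method."""

    PIXELS = "PIXELS"
    SEGMENTATION = "SEGMENTATION"


class PixelCounting(ScoringMethod):
    """Toy stand-in: counts 'green' entries in a 2D grid of labels."""

    def score_image(self, image) -> float:
        flat = [px for row in image for px in row]
        return flat.count("green") / len(flat)
```

Each concrete method then only has to implement score_image, and the script can map a ScoringSelector value to the matching class.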

Testing

  • Running the script with both the segmentation and pixel counting methods produced scores in the GeoDataFrame

@dragonejt dragonejt added the enhancement New feature or request label Jul 3, 2024
@dragonejt dragonejt self-assigned this Jul 3, 2024
@dragonejt dragonejt linked an issue Jul 3, 2024 that may be closed by this pull request
@danbjoseph (Member) commented:
  • i was able to run a successful segmentation. thanks for this @dragonejt ! 🎉
  • there was a comment on the issue about "do we want to divide the image into 4 images with 90 degree field of vision instead of 1 image with 360 field of vision before the segmentation step?" - was that considered in this PR? or are we just ingesting the source image as is?
  • i think we can deal with the question about a default output_file in a different PR? i logged it as #58 (default output_file for every step)
  • can you do at least a partial update to the docs in the README as part of this PR? can you note which category(ies) from the output are being saved for the value? can you link to the mask2former page on huggingface?
    • related to the category(ies), there's this comment from #10 (use class segmentation for % vegetation/canopy estimation) (i didn't copy over the image being referenced): "see the sample segmented image above. there's a more pastel green (grassy looking area) and a more neon green (appears to be mostly trees). confirm the different possible labels related to vegetation (there are at least 2)."
  • can we have an option flag to save out the segmentation results? same folder as the images, same filename with maybe "_segmentation_YYYYMMDD" appended? however, StreetView-NatureVisibility does the images side-by-side in one file and shows all the categories. ours would just be the segmentation image by itself, and it would be just 2 colors for vegetation and not-vegetation (or even 1 color, green for vegetation and then transparent for the null?). this request can also be logged as an issue and done after merging this PR.
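On the multiple-vegetation-labels point above: one way to handle it is to compute GVI as the fraction of pixels whose predicted class id falls in a vegetation set. A minimal sketch (the ADE20K ids used here, 4 = tree, 9 = grass, 17 = plant, are commonly cited, but they must be verified against the loaded model's id2label mapping):

```python
# Hypothetical sketch of combining several vegetation classes into one GVI value.
# Verify these ids against the segmentation model's id2label before relying on them.
VEGETATION_IDS = {4, 9, 17}  # tree, grass, plant in ADE20K (assumed)


def vegetation_fraction(label_map: list[list[int]]) -> float:
    """Fraction of pixels whose predicted class id is a vegetation class."""
    total = sum(len(row) for row in label_map)
    green = sum(1 for row in label_map for px in row if px in VEGETATION_IDS)
    return green / total
```

The README could then document exactly which ids end up in VEGETATION_IDS for the chosen model.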

@dragonejt (Contributor, author) commented:
Hey Dan!

  • We are currently ingesting the image as-is from the JPEG format. No modification has been done, I believe.
  • Sure, output_file can be in a different PR. Do you want me to revert the current output_file changes then (the ones that make it more similar to create_points.py)?
  • I'll add README pages shortly.
  • I can also take a look at saving the segmentation images, though I agree that it would probably be best put in a separate PR.

@danbjoseph (Member) commented:

For...

> Do you want me to revert the current output_file changes then (that make it more similar to create_points.py)?

...I don't fully understand the 2 options being offered.

@dragonejt (Contributor, author) commented:

Previously the script determined the output file path on its own instead of accepting a CLI argument. Do we want to revert to having it determine its own output file path for now?

@danbjoseph (Member) commented:

> Previously the script determined the output file path on its own instead of accepting a CLI argument, do we want to revert back to determining its own output file path for now?

ah, no thanks. i like inputting the file path.

Comment on lines +35 to 37 (Contributor):

```python
"torch",
"transformers",
"tqdm",
```

Suggested change (alphabetical order):

```diff
-"torch",
-"transformers",
-"tqdm",
+"torch",
+"tqdm",
+"transformers",
```
Comment on lines +85 to +91 (Contributor):

```python
thread_map(
    score_image,
    gdf.index,
    total=len(gdf.index),
    desc="Calculating GVI of Images/Points",
    unit="images",
)
```
I'm not sure if thread_map makes sense for this. My guess is that scoring images is probably going to be compute bound rather than I/O bound—though I'm not sure about this. I guess maybe pixel counting would be I/O bound but the segmentation is probably compute-bound.

It's possible process_map would speed things up more. Most likely it would for the pixel counting method. It might not for the segmentation model, if the segmentation model is internally parallelized already.

I wonder then if we want a score_images that takes a collection of images to be the primary API (rather than a score_image for a single image, though it can still be useful to keep that around for utility) and then the individual scoring methods can determine the most effective parallelization approach for themselves.
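A batch-first score_images along those lines might be sketched as follows (illustrative only; the function name, the io_bound switch, and the executor choice are assumptions, not code from this PR):

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def score_images(scorer, images, io_bound: bool = False) -> list[float]:
    """Score a collection of images, letting the caller pick the parallelism.

    Processes suit compute-bound work (e.g. pixel counting); threads suit
    I/O-bound work. A model that already parallelizes internally may do best
    with no outer parallelism at all.
    """
    # Note: with ProcessPoolExecutor, scorer must be picklable (a top-level function).
    executor_cls = ThreadPoolExecutor if io_bound else ProcessPoolExecutor
    with executor_cls() as pool:
        return list(pool.map(scorer, images))
```

Each ScoringMethod implementation could also override this entry point and choose its own strategy.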

Comment on lines +116 to +129 (Contributor):

```python
w4 = int(width / 4)
h4 = int(height / 4)
hFor43 = int(w4 * 3 / 4)
images = []
pickles = []
# Crop the panoramic image based on road centers
for w in range(4):
    x_begin = w * w4
    x_end = (w + 1) * w4
    cropped_image = image.crop((x_begin, h4, x_end, h4 + hFor43))
    cropped_segmentation = segmentation[h4 : h4 + hFor43, x_begin:x_end]
    images.append(cropped_image)
    pickles.append(cropped_segmentation)
return images, pickles
```
I'm a little suspicious about this being hardcoded to these particular widths and heights. Do we even need them? If we're segmenting, wouldn't the model identify these pixels as not relevant if they're not relevant?

Also, why do we split up the image into 4 pieces? It seems like we just sum them back up again later in _get_GVI.

Anyway, I'm okay with investigating and fixing this later, though.
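For what it's worth, when the crops are equal-sized and no pixels are discarded, averaging the per-crop vegetation fractions gives exactly the full-image fraction, so the 4-way split only changes the result because of the h4/hFor43 trimming and any per-crop model behavior. A toy check (plain Python, illustrative):

```python
def green_fraction(grid):
    """Fraction of cells equal to 1 (vegetation) in a 2D label grid."""
    flat = [px for row in grid for px in row]
    return flat.count(1) / len(flat)


# 2x8 toy label map: 1 = vegetation, 0 = other
full = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 0],
]

# Split into 4 equal vertical strips, mirroring the 4-way crop (no trimming)
strips = [[row[i * 2:(i + 1) * 2] for row in full] for i in range(4)]

avg = sum(green_fraction(s) for s in strips) / len(strips)
assert avg == green_fraction(full)  # 0.4375 either way
```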

@danbjoseph (Member) commented Jul 25, 2024:

if it's 1 piece with a 360 field of view it's distorted? and if it's re-projected into 4 images with 90 degree field of view then the shapes of things aren't distorted?

i missed that this code is doing something based on road centers. the NatureVisibility project describes 3 scenarios for "3. Clean and process data":

  1. Panoramic Image Cropping using Road Centers
  2. Panoramic Image without Road Center Cropping
  3. Non-Panoramic Image

I thought we should do # 2 ?

@dragonejt (Contributor, author) commented:

I think this might be #2 based on the feature flag? The comments say that it cuts by road center, but crop_panoramic_images is only called when cut_by_road_centers is false. Also, do we need to cut the images into 4 pieces? We don't do this for the pixel counting method, so I'm wondering whether we want to keep the image modification process as simple as possible or continue with these modifications. Pixel counting doesn't crop off the bottom 20% of the image for the car either, I think, so the segmentation GVI and pixel counting GVI are currently not very comparable or on the same scale.

Here's the feature flag for reference: https://github.com/Spatial-Data-Science-and-GEO-AI-Lab/StreetView-NatureVisibility/blob/f4e6b5f53890db13bc32154682591937ba2271d0/modules/process_data.py#L276

@danbjoseph (Member) commented Aug 4, 2024:
  • can we have an optional argument that is --crop-vehicle 20.0, with 20 being the default (20% of the image bottom cropped) if the argument isn't provided - adding this to the config file created in #48 (Initial working implementation for config files for create_points). this option could apply to all score_images workflows?
    • the Indonesian Red Cross is going to be doing an imagery collection test project in October, and I think they will be using cameras mounted on helmets of people riding motorbikes, so cropping may not be as beneficial compared to if the camera is in the middle of a car roof.
  • regarding the cutting into 4: if it's 1 piece with a 360 field of view it's distorted? see the example below. and if it's re-projected into 4 images with 90 degree field of view then the shapes of things aren't distorted?
    • (example image omitted)
    • this paper mentions: "distortion-aware modules to address extreme object deformations and panorama distortions that result from equirectangular representation" and "While extensive research has been conducted on pinhole based learning methods, approaches tailored for processing ultra-wide panoramic images and inherently accounting for spherical deformations remain ongoing research."
  • is there anything from the NatureVisibility project that we should strip out?
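A sketch of how the proposed --crop-vehicle option might be wired up (the flag name and default follow the comment above; build_parser and crop_box are hypothetical helper names, not code from this PR):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical wiring for the proposed flag; 20.0 is the suggested default.
    parser = argparse.ArgumentParser(description="Score images for GVI")
    parser.add_argument(
        "--crop-vehicle",
        type=float,
        default=20.0,
        help="Percent of the image bottom to crop out (e.g. the camera vehicle)",
    )
    return parser


def crop_box(width: int, height: int, crop_vehicle_pct: float):
    """PIL-style (left, upper, right, lower) box with the bottom strip removed."""
    keep = int(round(height * (1 - crop_vehicle_pct / 100)))
    return (0, 0, width, keep)
```

Passing --crop-vehicle 0.0 would disable cropping, e.g. for helmet-mounted cameras, and image.crop(crop_box(w, h, args.crop_vehicle)) would apply it.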

@danbjoseph (Member) commented Sep 3, 2024:
```python
# Cut panoramic image in 4 equal parts
# Crop the image and its segmentation based on
# the previously found road centers
images, pickles = self._crop_panoramic_images(image, segmentation)
```
Contributor:
Why are the segmentations called "pickles"?

@dragonejt (Contributor, author) commented Jul 25, 2024:

```python
pickles.append(cropped_segmentation)
return images, pickles

def _get_GVI(self, segmentations: list[torch.Tensor]):
```
Contributor:

Suggested change (code style preference):

```diff
-def _get_GVI(self, segmentations: list[torch.Tensor]):
+def _get_gvi(self, segmentations: list[torch.Tensor]):
```

@dragonejt (Contributor, author) commented:

Will do. That said, code style issues should be caught by linter or autoformatter rules, not by manual review. For the future, let's review the Ruff rules and see which extra ones we want to add, such as variable capitalization.
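For reference, Ruff's pep8-naming rules (the "N" group, e.g. N802 "function name should be lowercase") could be enabled with a pyproject.toml fragment like this (a sketch; the project's actual Ruff configuration may differ):

```toml
[tool.ruff.lint]
# Extend the existing rule selection with pep8-naming ("N") rules,
# which would flag names like _get_GVI (N802) automatically.
extend-select = ["N"]
```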

Comment on lines +67 to +68 (Contributor):

```python
GVI, segment_scores = self._get_GVI(pickles)
return GVI
```

Suggested change (code style):

```diff
-GVI, segment_scores = self._get_GVI(pickles)
-return GVI
+gvi, segment_scores = self._get_gvi(pickles)
+return gvi
```

@ioalexei (Contributor) commented Oct 2, 2024:

I'm taking a look at this to work on #59 - I can get it to work with the PIXELS argument, but running with SEGMENTATION causes my session to crash. Any suggestions for how I can identify the cause?

Labels: enhancement (New feature or request)
Projects: none yet
Development: successfully merging this pull request may close "use class segmentation for % vegetation/canopy estimation"
4 participants