feat: Add segmentation GVI, refactor scoring images #55
Conversation
Hey Dan!
...I don't fully understand the 2 options being offered.
Previously the script determined the output file path on its own instead of accepting a CLI argument. Do we want to revert to determining its own output file path for now?
ah, no thanks. i like inputting the file path.
Suggested change (sort dependencies alphabetically):

```diff
-    "torch",
-    "transformers",
-    "tqdm",
+    "torch",
+    "tqdm",
+    "transformers",
```
```python
thread_map(
    score_image,
    gdf.index,
    total=len(gdf.index),
    desc="Calculating GVI of Images/Points",
    unit="images",
)
```
I'm not sure if `thread_map` makes sense for this. My guess is that scoring images is probably going to be compute-bound rather than I/O-bound, though I'm not sure about this. I guess maybe pixel counting would be I/O-bound, but the segmentation is probably compute-bound.

It's possible `process_map` would speed things up more. Most likely it would for the pixel counting method. It might not for the segmentation model, if the segmentation model is internally parallelized already.

I wonder then if we want a `score_images` that takes a collection of images to be the primary API (rather than a `score_image` for a single image, though it can still be useful to keep that around for utility), and then the individual scoring methods can determine the most effective parallelization approach for themselves.
```python
w4 = int(width / 4)
h4 = int(height / 4)
hFor43 = int(w4 * 3 / 4)
images = []
pickles = []
# Crop the panoramic image based on road centers
for w in range(4):
    x_begin = w * w4
    x_end = (w + 1) * w4
    cropped_image = image.crop((x_begin, h4, x_end, h4 + hFor43))
    cropped_segmentation = segmentation[h4 : h4 + hFor43, x_begin:x_end]
    images.append(cropped_image)
    pickles.append(cropped_segmentation)
return images, pickles
```
I'm a little suspicious about this being hardcoded to these particular widths and heights. Do we even need them? If we're segmenting, wouldn't the model identify these pixels as not relevant if they're not relevant?

Also, why do we split the image into 4 pieces? It seems like we just sum them back up again later in `_get_GVI`.

Anyway, I'm also okay with investigating and fixing this later.
if it's 1 piece with a 360° field of view, it's distorted? and if it's re-projected into 4 images with a 90° field of view, then the shapes of things aren't distorted?
i missed that this code is doing something based on road centers. the NatureVisibility project describes 3 scenarios under "3. Clean and process data":

- Panoramic Image Cropping using Road Centers
- Panoramic Image without Road Center Cropping
- Non-Panoramic Image

I thought we should do #2?
I think this might be #2 based on the feature flag? The comments say that it cuts by road center, but `crop_panoramic_images` is only called when `cut_by_road_centers` is false.

Also, do we need to cut the images into 4 pieces? We don't do this for the pixel counting method, so I'm wondering if we want to keep the image modification process as simple as possible or continue to do these modifications. Pixel counting doesn't crop off the bottom 20% of the image for the car either, I think, so I don't think the segmentation GVI and pixel counting GVI are very comparable or on the same scale currently.
Here's the feature flag for reference: https://github.com/Spatial-Data-Science-and-GEO-AI-Lab/StreetView-NatureVisibility/blob/f4e6b5f53890db13bc32154682591937ba2271d0/modules/process_data.py#L276
- can we have an optional argument `--crop-vehicle 20.0`, with 20 being the default (20% of the image bottom cropped) if the argument isn't provided? and can we add this to the config file created in "Initial working implementation for config files for create_points" #48? this option could apply to all `score_images` workflows.
- the Indonesian Red Cross is going to be doing an imagery collection test project in October, and I think they will be using cameras mounted on the helmets of people riding motorbikes, so cropping may not be as beneficial compared to if the camera is in the middle of a car roof.
- regarding the cutting into 4: if it's 1 piece with a 360° field of view, it's distorted? see the example below. and if it's re-projected into 4 images with a 90° field of view, then the shapes of things aren't distorted?
- this paper mentions: "distortion-aware modules to address extreme object deformations and panorama distortions that result from equirectangular representation" and "While extensive research has been conducted on pinhole based learning methods, approaches tailored for processing ultra-wide panoramic images and inherently accounting for spherical deformations remain ongoing research."
- is there anything from the NatureVisibility project that we should strip out?
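a rough sketch of how the option could look; the flag name, the default, and the `vehicle_crop_box` helper are hypothetical suggestions, not existing code in this PR:

```python
import argparse

def vehicle_crop_box(width: int, height: int, crop_vehicle_pct: float = 20.0):
    """Return a (left, upper, right, lower) box, in the style of PIL's
    Image.crop, that removes the bottom crop_vehicle_pct percent of the
    image where the vehicle (or helmet mount) would appear.
    Hypothetical helper, not part of the PR."""
    keep = round(height * (1.0 - crop_vehicle_pct / 100.0))
    return (0, 0, width, keep)

parser = argparse.ArgumentParser()
parser.add_argument(
    "--crop-vehicle",
    type=float,
    default=20.0,
    metavar="PCT",
    help="percent of the image bottom to crop off (default: 20.0)",
)
```

a helmet-mounted collection could then pass `--crop-vehicle 0` to disable cropping entirely.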
- i've updated issue "reproject 360 image for segmentation (and other analysis options)" #13 with a link to a good Stack Overflow post that has an interesting deep dive into re-projecting equirectangular panoramas. it would be great to include the reprojection as part of this PR, but we can also leave it for a future PR
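for reference, the core of equirectangular-to-perspective reprojection is a per-pixel coordinate mapping. this is a hedged, stdlib-only sketch of the math; the function name and axis conventions are mine, not from the linked post or this PR:

```python
import math

def perspective_to_equirect(u, v, fov_deg=90.0, yaw_deg=0.0,
                            pano_w=4096, pano_h=2048):
    """Map a normalized perspective-image coordinate (u, v in [-1, 1])
    to the corresponding pixel in an equirectangular panorama.

    Sketch assumes a pinhole camera looking along yaw_deg with no
    pitch/roll; a full implementation would vectorize this over the
    output grid and bilinearly sample the panorama.
    """
    f = 1.0 / math.tan(math.radians(fov_deg) / 2.0)  # focal length for FOV
    # Ray through the pixel in camera space (x right, y down, z forward)
    x, y, z = u, v, f
    # Rotate the ray by yaw around the vertical axis
    yaw = math.radians(yaw_deg)
    xr = x * math.cos(yaw) + z * math.sin(yaw)
    zr = -x * math.sin(yaw) + z * math.cos(yaw)
    # Spherical angles of the rotated ray
    lon = math.atan2(xr, zr)                 # longitude in [-pi, pi]
    lat = math.atan2(y, math.hypot(xr, zr))  # latitude in [-pi/2, pi/2]
    # Equirectangular pixel coordinates
    px = (lon / math.pi + 1.0) / 2.0 * pano_w
    py = (lat / (math.pi / 2) + 1.0) / 2.0 * pano_h
    return px, py
```

running this for every pixel of four views at yaw 0°, 90°, 180°, 270° would give the 4 undistorted 90°-FOV images discussed above.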
```python
# Cut panoramic image in 4 equal parts
# Crop the image and its segmentation based on
# the previously found road centers
images, pickles = self._crop_panoramic_images(image, segmentation)
```
Why are the segmentations called "pickles"?
```python
        pickles.append(cropped_segmentation)
        return images, pickles

    def _get_GVI(self, segmentations: list[torch.Tensor]):
```
Suggested change:

```diff
-    def _get_GVI(self, segmentations: list[torch.Tensor]):
+    def _get_gvi(self, segmentations: list[torch.Tensor]):
```
Code style preference.
Will do, but also, code style issues should be caught by linter or autoformatter rules, not by manual review. For the future, let's review the Ruff rules and see which extra ones we want to add, such as variable capitalization.
```python
        GVI, segment_scores = self._get_GVI(pickles)
        return GVI
```
Suggested change:

```diff
-        GVI, segment_scores = self._get_GVI(pickles)
-        return GVI
+        gvi, segment_scores = self._get_gvi(pickles)
+        return gvi
```
Code style
I'm taking a look at this to work on #59 - I can get it to work with the
Notes

- should the `output_file` Path be a required argument or an option with a sensible default?

Changes

- `ScoringMethod` interface and `ScoringSelector` enum
- `PixelCounting` GVI scoring method which implements `ScoringMethod`, uses Treepedia pixel counting method
- `Segmentation` GVI scoring method which implements `ScoringMethod`, uses mask2former model from huggingface

Testing
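The interface and enum from the change list could be sketched roughly like this; the class names come from the PR description, but the method signature and enum values are my guesses, not the PR's actual code:

```python
from abc import ABC, abstractmethod
from enum import Enum

class ScoringMethod(ABC):
    """Interface each GVI scoring method implements (sketch; the
    actual signature in the PR may differ)."""

    @abstractmethod
    def score_image(self, image) -> float:
        """Return the Green View Index for a single image."""

class ScoringSelector(Enum):
    """CLI-selectable scoring methods (values are assumptions)."""
    PIXEL_COUNTING = "pixel_counting"  # Treepedia-style pixel counting
    SEGMENTATION = "segmentation"      # mask2former segmentation model
```

A CLI flag could then map a `ScoringSelector` value to the matching `ScoringMethod` subclass.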