Reduce processing overhead for large amount of imagery data #524
Heya, any luck with this one?
@richlv maybe this isn't an issue if you process your images once, and then keep uploading. Instead of processing and uploading every time:

```sh
mapillary_tools process_and_upload path/to/images
```

use this workflow:

```sh
# process once (no need to process again unless you have new images added)
mapillary_tools process path/to/images

# then run upload (can be run multiple times due to interruptions)
mapillary_tools upload path/to/images
```
Thank you for the workaround - that might work, but it adds extra complexity to the workflow. Same as with the upload status tracking, could the processing status be tracked?
It's a tradeoff here. It could be added for process in a similar way, but the tool would need somewhere to store the processing status, and it would also have to track each image's mtime to know when reprocessing is needed. For now, simply replace process_and_upload with upload after the first processing run.
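As a rough illustration of the mtime point above (this is only a sketch for a wrapper script, not something mapillary_tools does itself; the *.jpg pattern and the mapillary_image_description.json name follow the examples later in this thread):

```sh
# Sketch: reprocess a directory only when the description file is missing,
# or when some image is newer than it (an approximation of mtime tracking).
image_dir="$1"
desc_path="${image_dir}/mapillary_image_description.json"

if [ ! -f "$desc_path" ] || \
   [ -n "$(find "$image_dir" -name '*.jpg' -newer "$desc_path" | head -n 1)" ]; then
    mapillary_tools process "$image_dir" --desc_path="$desc_path"
fi
```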
Thank you so much for looking into this. As far as I understand, mapillary_tools already uses TMPDIR to store the archive files for upload. Not quite sure why mtime would need to be tracked - just storing the fact that a particular file has been processed seems sufficient. If I recall correctly, that is how the previous processing status tracking worked.

Replacing process_and_upload with upload wouldn't work - that basically requires me to track the processing status manually anyway, or script that logic myself. And it's so much better for such an improvement to be available to all users :)

To clarify the need: a frequent use case, especially when travelling, involves multiple image directories that all have to be processed and uploaded. To handle that, I have a simple script that loops over these directories and runs process_and_upload. Thus the only thing the script really needs to do is loop over the directories and process & upload.
Thanks for the context. Would you mind sharing the relevant part of your script? Here is a simple idea that avoids reprocessing by checking for the existence of each image directory's mapillary_image_description.json:

```sh
#!/bin/sh
set -e

# loop through all image directories
for image_dir in "$@"; do
    if [ -d "$image_dir" ]; then
        desc_path="${image_dir}/mapillary_image_description.json"
        # check if the desc file exists
        if [ -f "$desc_path" ]; then
            echo processed already "$image_dir"
        else
            mapillary_tools process "$image_dir" --desc_path="$desc_path"
        fi
        mapillary_tools upload "$image_dir" --desc_path="$desc_path"
    fi
done
```

Usage:
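For example, assuming the script above is saved as process_upload.sh (the file name and directory names are only an illustration), it could be invoked with one or more image directories:

```sh
sh process_upload.sh path/to/images_day1 path/to/images_day2
```

Because the script checks for mapillary_image_description.json before running process, it can be rerun after interruptions without redoing the processing step.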
The script, if we exclude the part that does some GPX trickery for edge cases, is trivial.
Thank you for the script idea, greatly appreciated. If checking for "mapillary_image_description.json" is a good approach, what would be the reason not to do so in mapillary_tools right away, and skip processing if found?
Skipping processing automatically just because the description file already exists could be confusing in some scenarios. I will keep the issue open until there is a good solution. For now, I would suggest scripting it:

```sh
image_dir="$1"
desc_path="${image_dir}/mapillary_image_description.json"

# check if the desc file exists
if ! [ -f "$desc_path" ]; then
    mapillary_tools process "$image_dir" $gt --desc_path="$desc_path" --interpolate_directions --skip_process_errors
fi
mapillary_tools upload "$image_dir" --desc_path="$desc_path" --user_name richlv
```

to replace the single process_and_upload call in your script.
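A note on the snippet above: $gt is presumably a variable from richlv's own wrapper script holding extra geotag-related options; it is not defined in the snippet itself. A rough sketch of how the fragment could be wrapped into a standalone script (the loop, the empty gt default, and the file layout are assumptions for illustration):

```sh
#!/bin/sh
set -e

# extra geotag-related options; left empty here, fill in from your own setup if needed
gt=""

# apply the process-then-upload logic to every directory given on the command line
for image_dir in "$@"; do
    desc_path="${image_dir}/mapillary_image_description.json"
    if ! [ -f "$desc_path" ]; then
        mapillary_tools process "$image_dir" $gt --desc_path="$desc_path" --interpolate_directions --skip_process_errors
    fi
    mapillary_tools upload "$image_dir" --desc_path="$desc_path" --user_name richlv
done
```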
Have we observed specific scenarios where it would be confusing?
@richlv could you try out https://github.com/mapillary/mapillary_tools/releases/tag/v0.10.2? The image processing performance is improved significantly in this version (700 images per second on my M1 MacBook Pro laptop).
Thanks, that indeed is faster - greatly appreciated :) Is the potential for confusion easy to avoid, if it is still a concern?
Although processing images (reading geotags) is fast, it is still slow for large amounts of data. We should avoid processing the same images again once they have already been processed or uploaded. See https://forum.mapillary.com/t/mapillary-tools-0-8-no-status-kept/6197/16