This repository is the official implementation of Sat2Cap [CVPRW, EarthVision 2024, Best Paper Award]. The Sat2Cap model addresses the mapping problem in a zero-shot manner: instead of predicting pre-defined attributes for a satellite image, Sat2Cap learns the textual descriptions associated with a given location.
You can use the `run_geo.sh` script to train the Sat2Cap model. All necessary hyperparameters can be set in the bash script.
Once you have a trained model, use the `generate_map_embedding.py` file under `evaluations` to generate Sat2Cap embeddings for all images of interest. Use `merge_embeddings.py` to add location and temporal information to the generated embeddings. Finally, the `get_similarity.py` file computes similarity scores for a given text prompt. These similarity scores can then be used to create zero-shot maps.
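To illustrate the final step, the sketch below shows how zero-shot similarity scores can be computed from precomputed embeddings with a CLIP-style cosine similarity. This is a minimal, self-contained example: the function name, array shapes, and the random toy data are assumptions for illustration, not the repository's actual API.

```python
import numpy as np

def cosine_similarity_scores(image_embeddings, prompt_embedding):
    """Score each image embedding against a text prompt embedding.

    image_embeddings: (N, D) array of Sat2Cap embeddings, one per image.
    prompt_embedding: (D,) array, e.g. a text embedding of the prompt.
    Returns an (N,) array of cosine similarities in [-1, 1].
    """
    img = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    txt = prompt_embedding / np.linalg.norm(prompt_embedding)
    return img @ txt

# Toy example with random embeddings (D=4); in practice these would be
# the merged embeddings produced by the pipeline above.
rng = np.random.default_rng(0)
embeds = rng.normal(size=(5, 4))
prompt = rng.normal(size=4)
scores = cosine_similarity_scores(embeds, prompt)
print(scores.shape)  # one similarity score per image
```

Mapping these per-image scores back to each image's geolocation yields a zero-shot map for the prompt.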
If you find our work useful, please cite:

```bibtex
@inproceedings{dhakal2024sat2cap,
  title={Sat2cap: Mapping fine-grained textual descriptions from satellite images},
  author={Dhakal, Aayush and Ahmad, Adeel and Khanal, Subash and Sastry, Srikumar and Kerner, Hannah and Jacobs, Nathan},
  booktitle={IEEE/ISPRS Workshop: Large Scale Computer Vision for Remote Sensing (EARTHVISION)},
  pages={533--542},
  year={2024}
}
```
Check out our lab website for other interesting works on geospatial understanding and mapping: