Code repository for paper: "G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models"

Applied-Machine-Learning-Lab/G3

This is the code repository for the paper "G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models".

MP16-Pro

You can download the images and metadata of MP16-Pro from Hugging Face: Jia-py/MP16-Pro
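As a convenience, the download can also be scripted with huggingface_hub, which is already part of the environment described in this README. The sketch below is not part of the repository: the helper name download_mp16_pro and the target directory are hypothetical, and it assumes that Jia-py/MP16-Pro is hosted as a dataset repository (repo_type="dataset"). Adjust the repo type if that assumption does not hold.

```python
from pathlib import Path

REPO_ID = "Jia-py/MP16-Pro"   # repository id given in this README
REPO_TYPE = "dataset"         # assumption: hosted as a dataset repo


def download_mp16_pro(local_dir: str = "data/mp16-pro") -> str:
    """Download the MP16-Pro files into local_dir and return the local path.

    The function name and the default directory are illustrative only.
    """
    # Imported lazily so the module can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    return snapshot_download(
        repo_id=REPO_ID,
        repo_type=REPO_TYPE,
        local_dir=str(target),
    )

# Usage (starts a large download):
#   path = download_mp16_pro("data/mp16-pro")
```

The download is large, so it may be worth pointing local_dir at a disk with enough free space.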

Data

IM2GPS3K: images | metadata

YFCC4K: images | metadata

Environment Setting

# tested on CUDA 12.0
conda create -n g3 python=3.9
conda activate g3
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu121
pip install transformers accelerate huggingface_hub pandas

Running samples

  1. Geo-alignment

You can run python run_G3.py to train the model.

  2. Geo-diversification

First, you need to build the index file using python IndexSearch.py.

Parameters in IndexSearch.py

  • index name --> which model you want to use for embedding
  • dataset --> im2gps3k or yfcc4k
  • database --> default mp16

Then, build a second index for negative samples by replacing images_embeds with -1 * images_embeds before the index is constructed.
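To see why negating the embeddings yields negative samples, consider the toy sketch below. It assumes the index ranks database entries by inner product with the query (the README does not state the metric), and it uses plain Python lists in place of the real index. The helpers inner_product, top_k, and negate are hypothetical names for illustration and are not taken from IndexSearch.py.

```python
from typing import List, Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(query: Sequence[float], database: List[List[float]], k: int) -> List[int]:
    """Return the indices of the k entries with the highest inner product."""
    scores = [inner_product(query, vec) for vec in database]
    ranked = sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def negate(database: List[List[float]]) -> List[List[float]]:
    """Equivalent of -1 * images_embeds for plain lists."""
    return [[-x for x in vec] for vec in database]


# Toy database of four 2-D embeddings and one query.
images_embeds = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.0]

positives = top_k(query, images_embeds, k=2)          # most similar entries
negatives = top_k(query, negate(images_embeds), k=2)  # least similar entries

print(positives)  # [0, 1]
print(negatives)  # [3, 2]
```

Searching the negated index with the same query flips the ranking, so the top results are the entries least similar to the query. This lets the same search code serve both the positive and the negative retrieval.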

Then, you can run llm_predict_hf.py or llm_predict.py to generate LLM predictions.

After that, run aggregate_llm_predictions.py to aggregate the predictions.

  3. Geo-verification

Run python IndexSearch.py --index=g3 --dataset=im2gps3k (or --dataset=yfcc4k) to verify the predictions and evaluate them.
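The README does not spell out the evaluation metric. A common choice for benchmarks such as IM2GPS3K and YFCC4K is the fraction of predictions that fall within fixed great-circle distance thresholds of the ground truth. The sketch below illustrates that idea under stated assumptions: the thresholds (1, 25, 200, 750, 2500 km), the function names, and the (latitude, longitude) tuple format are illustrative and may differ from what IndexSearch.py actually computes.

```python
import math
from typing import Dict, Sequence, Tuple

EARTH_RADIUS_KM = 6371.0
# Assumption: commonly used geolocalization thresholds, not read from the repo.
THRESHOLDS_KM = (1, 25, 200, 750, 2500)

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against rounding slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def accuracy_at_thresholds(
    preds: Sequence[Coord],
    truths: Sequence[Coord],
    thresholds: Sequence[float] = THRESHOLDS_KM,
) -> Dict[float, float]:
    """Fraction of predictions within each distance threshold."""
    dists = [haversine_km(p[0], p[1], t[0], t[1]) for p, t in zip(preds, truths)]
    return {t: sum(d <= t for d in dists) / len(dists) for t in thresholds}


# Toy example: one prediction a few km off, one on the wrong coast.
preds = [(48.8566, 2.3522), (40.0, -74.0)]
truths = [(48.8584, 2.2945), (34.05, -118.24)]
acc = accuracy_at_thresholds(preds, truths)
print(acc)
```

In this toy example the first prediction is a few kilometres from its target and the second is several thousand kilometres away, so half of the predictions count as correct at every threshold from 25 km upward and none at 1 km.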

Citation

@article{jia2024g3,
  title={G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models},
  author={Jia, Pengyue and Liu, Yiding and Li, Xiaopeng and Zhao, Xiangyu and Wang, Yuhao and Du, Yantong and Han, Xiao and Wei, Xuetao and Wang, Shuaiqiang and Yin, Dawei},
  journal={arXiv preprint arXiv:2405.14702},
  year={2024}
}
