Official code repository for the CVPR2025 paper Any-Resolution AI-Generated Image Detection by Spectral Learning.
Dimitrios Karageorgiou1,2, Symeon Papadopoulos1, Ioannis Kompatsiaris1, Efstratios Gavves2,3
1 Information Technologies Institute, CERTH, Greece
2 University of Amsterdam, The Netherlands
3 Archimedes/Athena RC, Greece
SPAI employs spectral learning to learn the spectral distribution of real images under a self-supervised setup. Then, using the spectral reconstruction similarity it detects AI-generated images as out-of-distribution samples of this learned model.
- 28/03/25: Code released.
- 27/02/25: Paper accepted on CVPR2025.
The code originally targeted Nvidia L40S 48GB GPU, however many recent cuda-enabled GPUs should be supported. Inference should be effortless performed with less than 8GB of GPU RAM. As training originally targeted a 48GB GPU, a suitable GPU should be presented to reproduce the paper's setup without further modifications of the code.
To train and evaluate SPAI an anaconda environment can be used for installing all the required dependencies as following:
conda create -n spai python=3.11
conda activate spai
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
pip install -r requirements.txt
Furthermore, the installation of Nvidia APEX is required for training.
The trained weights checkpoint can be downloaded here
and should be placed under the weights
directory, located under the project's root directory.
To compute the predicted scores for a set of images, place them under a directory and use the following command.
python -m spai --input <input_dir> --output <output_dir>
where:
input_dir
: is a directory where the input images are located,output_dir
: is a directory where a csv file with the predictions will be written.
The --input
option also accepts CSV files containing the paths of the images. The CSV
files of the evaluation set, included under the data
directory, can be used as examples.
For downloading the images of these evaluation CSVs, check the instruction here.
We learn a model of the spectral distribution of real images under a self-supervised setup using masked spectral learning. Then, we use the spectral reconstruction similarity to measure the divergence from this learned distribution and detect AI-generated images as out-of-distribution samples of this model. Spectral context vector captures the spectral context under which the spectral reconstruction similarity values are computed, while spectral context attention enables the processing of any-resolution images for capturing subtle spectral inconsistencies.
Download the pre-trained ViT-B/16 MFM model from its public repo
and place it under the weights
directory:
weights
|_ mfm_pretrain_vit_base.pth
Latent diffusion training and validation data can be downloaded from their corresponding repo.
Furthermore, the corresponding instructions for downloading COCO and LSUN should be followed.
They should be placed under the datasets
directory as following:
datasets
|_latent_diffusion_trainingset
|_train
...
|_val
...
|_COCO
...
|_LSUN
...
Then, a csv file describing these data should be created as following:
python spai/create_dmid_ldm_train_val_csv.py \
--train_dir "./datasets/latent_diffusion_trainingset/train" \
--val_dir "./datasets/latent_diffusion_trainingset/val" \
--coco_dir "./datasets/COCO" \
--lsun_dir "./datasets/LSUN" \
-o "./datasets/ldm_train_val.csv"
The validation split can be augmented as following:
python spai/tools/augment_dataset.py \
--cfg ./configs/vit_base/vit_base__multipatch__100ep__intermediate__restore__patch_proj_per_feature__last_proj_layer_no_activ__fre_orig_branch__all_layers__bce_loss__light_augmentations.yaml \
-c ./datasets/ldm_val.csv \
-o ./datasets/ldm_val_augm.csv \
-d ./datasets/latent_diffusion_trainingset_augm
Then, training can be performed as following:
python -m spai train \
--cfg "./configs/spai.yaml" \
--batch-size 72 \
--pretrained "./weights/mfm_pretrain_vit_base.pth" \
--output "./output/train" \
--data-path "./datasets/ldm_train_val.csv" \
--tag "spai" \
--amp-opt-level "O2" \
--data-workers 8 \
--save-all \
--opt "DATA.VAL_BATCH_SIZE" "256" \
--opt "MODEL.FEATURE_EXTRACTION_BATCH" "400" \
--opt "DATA.TEST_PREFETCH_FACTOR" "1"
When a model has been trained using the previous script, it can be evaluated as following:
python -m spai test \
--cfg "./configs/spai.yaml" \
--batch-size 8 \
--model "./output/train/finetune/spai/<epoch_name>.pth" \
--output "./output/spai/test" \
--tag "spai" \
--opt "MODEL.PATCH_VIT.MINIMUM_PATCHES" "4" \
--opt "DATA.NUM_WORKERS" "8" \
--opt "MODEL.FEATURE_EXTRACTION_BATCH" "400" \
--opt "DATA.TEST_PREFETCH_FACTOR" "1" \
--test-csv "<test_csv_path>"
where:
test_csv_path
: Path to a csv file including the paths of the testing data.epoch_name
: Filename of the epoch selected during validation.
This work was partly supported by the Horizon Europe projects ELIAS (grant no. 101120237) and vera.ai (grant no. 101070093). The computational resources were granted with the support of GRNET.
Pieces of code from the MFM project have been used as a basis for developing this project. We thank its authors for their contribution.
This project will download and install additional third-party open source software projects. Also, all the employed third-party data retain their original license. Review their license terms before use.
The source code and model weights of this project are released under the Apache 2 License.
For any question you can contact [email protected].
If you found this work useful for your research, you can cite the following paper:
@article{karageorgiou2025any,
title={Any-Resolution AI-Generated Image Detection by Spectral Learning},
author={Karageorgiou, Dimitrios and Papadopoulos, Symeon and Kompatsiaris, Ioannis and Gavves, Efstratios},
journal={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2025}
}