RUMpy is a PyTorch-based toolbox for blind image super-resolution (SR), with a variety of SR deep learning architectures and degradation predictors available for use. In particular, RUMpy provides:
- An easy-to-use CLI for:
  - Generating low-resolution (LR) images from a high-resolution (HR) dataset, with various types of blurring, noise addition and compression available.
  - Training or fine-tuning of SR models and degradation predictors.
  - Qualitative and quantitative evaluation of SR results.
  - Tools for analyzing, moving and curating models.
- Various SR and degradation prediction architectures, with customizable settings.
- Straightforward pipeline for developing and integrating new models in the framework.
- A GUI for quick evaluation of models, including the cropping and direct SR of video frames.
- Integration with Aim (Mac & Linux only) for training monitoring.
RUMpy has been used to test and combine a variety of blind degradation prediction systems with high-performing SR architectures. Our results are available in our 2022 Sensors journal paper here. The basic concept of the 'Best of Both Worlds' framework is illustrated below:
A snapshot of our blind SR results on a real-world image is available below (more details in the full paper):
This also acts as the main code repository for the Deep-FIR project (University of Malta).
Developers/Researchers:
Project Management:
If installing from scratch, we recommend first setting up a new Python virtual environment. With Conda, this can be achieved as follows:
conda create -n *environment_name* python=3.7
(Python 3.7-3.8 recommended but not essential).
conda activate *environment_name*
Code testing was conducted in Python 3.7 & Python 3.8, but the code should work well with Python 3.6+.
Run the following commands from the repo base directory to fully install the package and all requirements:
- If using CPU only, install the main requirements via:
conda install --file requirements.txt --channel pytorch --channel conda-forge
If using CPU + GPU, first install PyTorch and Cudatoolkit for your specific configuration using instructions here. Then, install requirements as above.
- Install pip packages via:
pip install -r pip_requirements.txt
- If using Aim for metrics logging, install via:
pip install aim
The Aim GUI does not work on Windows, but metrics should still be logged in the .aim folder.
- Install the toolbox itself via:
pip install -e .
This installs the toolbox in editable mode, so any changes made to the code take effect without reinstalling (this is ideal for those seeking to make their own custom changes to the code).
All functionality has been tested on Linux (CPU & GPU), Mac OS (CPU) and Windows (CPU & GPU).
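Once the steps above are complete, a quick sanity check can confirm that the interpreter and key packages are in place. The following is a minimal sketch and not part of RUMpy itself: the helper name `check_environment` is hypothetical, and the only assumption is that PyTorch is importable under its usual name `torch`.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 6), packages=("torch",)):
    """Return a dict describing whether the environment looks usable.

    Hypothetical helper for illustration; not provided by RUMpy.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_version)}
    for name in packages:
        # find_spec returns None when a top-level package cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    report = check_environment()
    print(report)
    if report.get("torch"):
        import torch

        # torch.cuda.is_available() is False on CPU-only installs.
        print("CUDA available:", torch.cuda.is_available())
```

A `False` entry for `torch` suggests the requirements were installed into a different environment from the one currently activated.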
Requirements installation is only meant as a guide and all requirements can be installed using alternative means (e.g. using pip for all packages).
Details provided in GUI/README.md.
All details on generating LR data are provided in Documentation/data_prep.md.
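For readers new to the idea, LR generation can be pictured as a chain of degradations applied to each HR image. The sketch below is a generic, pure-Python illustration and is not RUMpy's implementation: all function names are hypothetical, a simple box blur stands in for whichever blur kernels the toolbox offers, and compression is left out entirely.

```python
import random


def box_blur(image, radius=1):
    """Mean-filter a 2D list of floats, averaging only in-bounds neighbours."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


def downsample(image, scale=4):
    """Keep every `scale`-th pixel along both axes."""
    return [row[::scale] for row in image[::scale]]


def add_gaussian_noise(image, sigma=0.01, seed=0):
    """Add zero-mean Gaussian noise and clip the result to [0, 1]."""
    rng = random.Random(seed)
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def degrade(hr, scale=4, blur_radius=1, noise_sigma=0.01, seed=0):
    """Blur, then downsample, then add noise (illustrative ordering only)."""
    blurred = box_blur(hr, blur_radius)
    small = downsample(blurred, scale)
    return add_gaussian_noise(small, noise_sigma, seed)


if __name__ == "__main__":
    hr = [[(x + y) / 14.0 for x in range(8)] for y in range(8)]
    lr = degrade(hr, scale=4)
    print(len(lr), "x", len(lr[0]))
```

The actual degradation options and their parameters are those documented in Documentation/data_prep.md.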
DIV2K training/validation downloadable from here.
Flickr2K dataset downloadable from here.
All SR testing datasets are available for download from the LapSRN main page here. Generate LR versions of each image using the same commands as used for the DIV2K/Flickr2K datasets.
Please refer to details here.
To train models, prepare a configuration file (details in Documentation/model_training.md) and run:
train_sisr --parameters *path_to_config_file*
Similarly, for evaluation, prepare an eval config file (details in Documentation/sisr_model_eval.md) and run:
eval_sisr --config *path_to_config_file*
Additional functionality for evaluating contrastive models is discussed in Documentation/contrastive_model_eval.md.
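When launching several runs from a script, it can help to build the two commands programmatically. The following is a minimal sketch under stated assumptions: the helper names `build_command` and `run` are hypothetical, and only the executables and flags shown above (`train_sisr --parameters` and `eval_sisr --config`) are taken from this README.

```python
import shlex
import subprocess


def build_command(mode, config_path):
    """Build the argument list for a training or evaluation run."""
    # Flags as documented above: train_sisr takes --parameters,
    # eval_sisr takes --config.
    flags = {
        "train": ("train_sisr", "--parameters"),
        "eval": ("eval_sisr", "--config"),
    }
    if mode not in flags:
        raise ValueError("mode must be one of {}".format(sorted(flags)))
    executable, flag = flags[mode]
    return [executable, flag, str(config_path)]


def run(mode, config_path, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    command = build_command(mode, config_path)
    print(" ".join(shlex.quote(part) for part in command))
    if not dry_run:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # "path_to_config_file" is a placeholder, as in the commands above.
    run("train", "path_to_config_file")
    run("eval", "path_to_config_file")
```

Keeping `dry_run=True` by default means the script only prints what it would execute, which is convenient for checking paths before starting a long training job.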
Standard SISR models available (code for each adapted from their official repository - linked within source code):
- SRCNN
- VDSR
- EDSR
- RCAN
- ESRGAN - 4x only
- Real-ESRGAN - 4x only
- Wavelet-SRNet
- Wavelet-SRGAN (WIP)
- SPARNet
- DICNET (not fully validated) - 4x only
- SFTMD
- SRMD
- SAN
- HAN
- ELAN
- RCAN (DAN & Contrastive Encoders)
- HAN (DAN & Contrastive Encoders)
- ELAN (DAN & Contrastive Encoders)
- Real-ESRGAN (DAN & Contrastive Encoders)
- SAN (Contrastive Encoders)
- EDSR (Contrastive Encoders)
- IEEE Signal Processing Letters models (baseline models + meta-attention models): link
- Sensors 2022 models (all Best of Both Worlds models): link
Once downloaded, models from the above links can be used directly with the eval command (`eval_sisr`) or with the GUI.
Install the required packages using pip install -r special_requirements.txt. Model weights are automatically installed to ~/.keras when first used.
Download pre-trained weights for the VGGFace model from here (scroll to VGGFace). Place the weights file in the directory ./external_packages/VGGFace/. The weights file should be called vgg_face_dag.pth.
Download pre-trained weights for the LightCNN model from here (LightCNN-29 v1). Place the weights file in the directory ./external_packages/LightCNN/. The weights file should be called LightCNN_29Layers_checkpoint.pth.tar.
Download pre-trained weights for the YOLO model from here. Place the weights file in the directory ./Code/utils/yolo_detection. The weights file should be called yolov3-wider_16000.weights.
Download pre-trained weights for the Bisenet model trained on Celeba-HQ from here. Place the weights file in the directory ./Code/utils/face_segmentation. The weights file should be called weights.pth.
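Since each of these weights files must sit at a specific path under a specific name, a small check can catch misplaced downloads before a run fails. This is a minimal sketch and not part of RUMpy: the helper name `missing_weights` is hypothetical, and the paths are copied from the instructions above, assuming they are relative to the repo base directory.

```python
from pathlib import Path

# Paths and file names as given in the instructions above,
# assumed to be relative to the repo base directory.
EXPECTED_WEIGHTS = {
    "VGGFace": "external_packages/VGGFace/vgg_face_dag.pth",
    "LightCNN": "external_packages/LightCNN/LightCNN_29Layers_checkpoint.pth.tar",
    "YOLO": "Code/utils/yolo_detection/yolov3-wider_16000.weights",
    "Bisenet": "Code/utils/face_segmentation/weights.pth",
}


def missing_weights(repo_root="."):
    """Return {model name: expected path} for every weights file not found."""
    root = Path(repo_root)
    return {
        name: str(root / relative)
        for name, relative in EXPECTED_WEIGHTS.items()
        if not (root / relative).is_file()
    }


if __name__ == "__main__":
    missing = missing_weights(".")
    if not missing:
        print("All optional weights files are in place.")
    for name, path in missing.items():
        print("Missing {} weights: expected at {}".format(name, path))
```

Only the weights for the features actually in use need to be present, so a missing entry is not necessarily an error.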
Download the reference software from here. Place the software in the directory ./JM. cd into this directory and compile the software using the commands . unixprep.sh and make. Some changes might be required for different OS versions.
Install the face alignment package using conda install -c 1adrianb face_alignment.
ℹ️ NOTE: Currently, this package doesn't get installed properly if using Python 3.8/CUDA 10.0.
ℹ️ NOTE: Landmarks generated by this method vary slightly if using older versions of the package.
Information on how to develop and train your own models is available in Documentation/framework_development.md.
The entire list of commands available with this repository is:
- train_sisr - main model training function.
- eval_sisr - main model evaluation function.
- image_manipulate - main bulk image converter.
- find_faces - Helper function for using YOLO face detector to detect faces in an input image directory.
- face_segment - Helper function to segment face images and save output map for downstream use.
- images_to_video - Helper function to convert a folder of images into a video.
- extract_best_model - Helper function to extract model config and best model checkpoint from a folder to a target location.
- clean_models - Helper function to remove unnecessary model checkpoints.
- model_report - Helper function to report on models available in specified directory.
Each command can be run with the --help parameter, which will print out the available options and docstrings.
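To confirm that the commands were registered by the installation, and to collect their help text in one pass, something like the following could be used. It is a minimal sketch with hypothetical helper names (`available_commands`, `show_help`); the command names are those listed above, and the only flag used is the documented `--help`.

```python
import shutil
import subprocess

# Command names as listed above.
COMMANDS = [
    "train_sisr", "eval_sisr", "image_manipulate", "find_faces",
    "face_segment", "images_to_video", "extract_best_model",
    "clean_models", "model_report",
]


def available_commands(commands=COMMANDS):
    """Split commands into those found on PATH and those that are not."""
    found = [c for c in commands if shutil.which(c) is not None]
    missing = [c for c in commands if c not in found]
    return found, missing


def show_help(command):
    """Run `<command> --help` and return its combined output as text."""
    result = subprocess.run(
        [command, "--help"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    found, missing = available_commands()
    for command in found:
        print("=== {} ===".format(command))
        print(show_help(command))
    if missing:
        print("Not found on PATH:", ", ".join(missing))
```

If every command is reported as missing, the most likely cause is that the virtual environment used for installation is not activated.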
Simply run:
pip uninstall rumpy
from any directory, with the relevant virtual environment activated.
The main paper to cite for this repository is our 2023 paper:
@Article{RUMpy,
AUTHOR = {Aquilina, Matthew and Ciantar, Keith George and Galea, Christian and Camilleri, Kenneth P. and Farrugia, Reuben A. and Abela, John},
TITLE = {The Best of Both Worlds: A Framework for Combining Degradation Prediction with High Performance Super-Resolution Networks},
JOURNAL = {Sensors},
VOLUME = {23},
YEAR = {2023},
NUMBER = {1},
ARTICLE-NUMBER = {419},
URL = {https://www.mdpi.com/1424-8220/23/1/419},
ISSN = {1424-8220},
ABSTRACT = {To date, the best-performing blind super-resolution (SR) techniques follow one of two paradigms: (A) train standard SR networks on synthetic low-resolution–high-resolution (LR–HR) pairs or (B) predict the degradations of an LR image and then use these to inform a customised SR network. Despite significant progress, subscribers to the former miss out on useful degradation information and followers of the latter rely on weaker SR networks, which are significantly outperformed by the latest architectural advancements. In this work, we present a framework for combining any blind SR prediction mechanism with any deep SR network. We show that a single lightweight metadata insertion block together with a degradation prediction mechanism can allow non-blind SR architectures to rival or outperform state-of-the-art dedicated blind SR networks. We implement various contrastive and iterative degradation prediction schemes and show they are readily compatible with high-performance SR networks such as RCAN and HAN within our framework. Furthermore, we demonstrate our framework’s robustness by successfully performing blind SR on images degraded with blurring, noise and compression. This represents the first explicit combined blind prediction and SR of images degraded with such a complex pipeline, acting as a baseline for further advancements.},
DOI = {10.3390/s23010419}
}
An earlier version of this framework has also been used for our 2021 Signal Processing Letters paper introducing meta-attention. A checkpoint containing this earlier version is available here, with the associated paper available to cite as follows:
@ARTICLE{Meta-Attention,
author={Aquilina, Matthew and Galea, Christian and Abela, John and Camilleri, Kenneth P. and Farrugia, Reuben A.},
journal={IEEE Signal Processing Letters},
title={Improving Super-Resolution Performance Using Meta-Attention Layers},
year={2021},
volume={28},
number={},
pages={2082-2086},
doi={10.1109/LSP.2021.3116518}}
This code has been released via the GNU GPLv3 open-source license. However, this code can also be made available via an alternative closed, permissive license. Third parties interested in this form of licensing should contact us separately.
Code taken from other repositories is referenced within the source code itself, and the licenses of these repositories are available under Documentation/external_licenses.