⚡️ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation [ICML 2025]

📌 Citation

If you find this work useful for your research, please consider citing it. 😊

@misc{yue2025reqflowrectifiedquaternionflow,
      title={ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation}, 
      author={Angxiao Yue and Zichong Wang and Hongteng Xu},
      year={2025},
      eprint={2502.14637},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2502.14637}, 
}

🔥 News

2025/05/02 💥 ReQFlow has been accepted to ICML 2025! 🎉🎉
2025/02/24 💥 Our model weights are hosted on Hugging Face and Google Drive now 😊.
2025/02/20 💥 We release our work ReQFlow for efficient and high-quality protein backbone generation!

🧩 Introduction

Our ReQFlow achieves state-of-the-art (SOTA) performance in protein backbone generation while requiring significantly fewer sampling steps and substantially reducing inference time. For example, it is 37× faster than RFDiffusion and 62× faster than Genie2 when generating a backbone of length 300, demonstrating both its effectiveness and efficiency.

⚒️ Installation

We recommend using mamba. If you do, replace conda with mamba in the commands below.

conda env create -f reqflow-env.yml

conda activate reqflow-env

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.0.0+cu117.html

pip install --upgrade deepspeed

# Install local package.
# Current directory should be ReQFlow/
pip install -e .
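After installation, it can be useful to confirm that the key packages are visible in the active environment before starting a long run. The following is a minimal sketch and not part of the repository; the import names torch, torch_scatter, and deepspeed are assumed from the packages installed above, and the helper find_missing is hypothetical.

```python
import importlib.util

# Import names assumed for the packages installed above (not taken from the repo).
REQUIRED_MODULES = ("torch", "torch_scatter", "deepspeed")


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that each module can be located; it does not verify that the installed CUDA build matches your GPU driver.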

🚀 Quick Inference

Our model weights are available for download on Hugging Face or Google Drive. You can also use your own weights. If using ours, please organize the directory as follows:

ReQFlow
├── ckpts
│   ├── qflow_pdb
│   │   ├── config.yaml
│   │   └── qflow_pdb.ckpt
│   ├── qflow_scope
│   │   ├── config.yaml
│   │   └── qflow_scope.ckpt
│   ├── reqflow_pdb_rectify
│   │   ├── config.yaml
│   │   └── reqflow_pdb_rectify.ckpt
│   └── reqflow_scope_rectify
│       ├── config.yaml
│       └── reqflow_scope_rectify.ckpt
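To verify the layout above before running inference, a short check along these lines may help. It is a sketch, not a script shipped with ReQFlow: the function name missing_checkpoint_files is hypothetical, and the expected file names are simply read off the directory tree shown above.

```python
from pathlib import Path

# Model names as listed in the directory tree above.
MODEL_NAMES = (
    "qflow_pdb",
    "qflow_scope",
    "reqflow_pdb_rectify",
    "reqflow_scope_rectify",
)


def missing_checkpoint_files(ckpt_root="ckpts", names=MODEL_NAMES):
    """List expected files (config.yaml and <name>.ckpt) that are absent."""
    root = Path(ckpt_root)
    missing = []
    for name in names:
        for filename in ("config.yaml", f"{name}.ckpt"):
            path = root / name / filename
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Run from the ReQFlow/ directory so that ./ckpts resolves correctly.
    for path in missing_checkpoint_files():
        print("missing:", path)
```

If you only downloaded one model, pass its name explicitly, for example missing_checkpoint_files(names=("reqflow_pdb_rectify",)).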

The inference configurations are available in configs/inference_unconditional.yaml, where you can conveniently specify the inference settings.

inference:
  task: unconditional
  ckpt_path: ./ckpts/reqflow_pdb_rectify/reqflow_pdb_rectify.ckpt # path to ckpts
  inference_subdir: ./inference_outputs/run_${now:%Y-%m-%d}_${now:%H-%M-%S} # path to inference outputs
  pmpnn_dir: ./ProteinMPNN
  pt_hub_dir: ./.cache/torch/ # path to ESMFold
  num_gpus: 4

  samples:
    min_length: 100
    max_length: 300 # We recommend < 500
    length_step: 50 # sampling on length (100,150,200,250,300)
    samples_per_length: 50
    seq_per_sample: 8 # num. of seq. generated by ProteinMPNN

  interpolant:
    sampling:
      num_timesteps: 500
      do_sde: False
    rots:
      sample_schedule: exp

Once you have specified the configurations, you can run inference using the following command:

python -W ignore experiments/inference_se3_flows.py -cn inference_unconditional

During inference, we evaluate the generated backbones with ProteinMPNN and ESMFold, following FrameDiff. The outputs are saved as follows:

inference_outputs
└── experiment_name                     # Default is date and time of inference
    ├── config.yaml                     # Config used during inference
    └── length_100                      # Sampled length 
        ├── sample_0                    # Sample ID for length
        │   ├── noise.pdb               # First sample, i.e., noise
        │   ├── sample.pdb              # Final sample
        │   ├── self_consistency        # Self consistency results        
        │   │   ├── esmf                # ESMFold predictions using ProteinMPNN sequences
        │   │   │   ├── sample_0.pdb
        │   │   │   ├── ...
        │   │   │   └── sample_8.pdb
        │   │   ├── parsed_pdbs.jsonl   # Parsed chains for ProteinMPNN
        │   │   ├── sample.pdb
        │   │   ├── sc_results.csv      # Summary metrics CSV 
        │   │   └── seqs                
        │   │       └── sample.fa       # ProteinMPNN sequences
        │   └── x0_traj_1.pdb           # x_0 model prediction trajectory
        └── sample_1                    # Next sample

Based on the contents of inference_outputs, we can compute Designability, Diversity, and Novelty. More evaluation details for reproducing the paper results are available here.
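As a rough illustration of how the per-sample sc_results.csv files can be summarized, the sketch below computes the fraction of samples whose best self-consistency RMSD falls under a threshold. It is not the repository's evaluation script: the column name rmsd, the 2.0 threshold, and the function name designable_fraction are all assumptions, so check the header of your own CSV files and the linked evaluation details before relying on it.

```python
import csv
from pathlib import Path


def designable_fraction(run_dir, rmsd_column="rmsd", threshold=2.0):
    """Fraction of samples whose lowest self-consistency RMSD is below threshold.

    rmsd_column and threshold are assumptions; adjust them to your CSV header
    and to the evaluation protocol you follow. Returns None if no CSV is found.
    """
    best_rmsds = []
    pattern = "length_*/sample_*/self_consistency/sc_results.csv"
    for csv_path in sorted(Path(run_dir).glob(pattern)):
        with open(csv_path, newline="") as handle:
            values = [float(row[rmsd_column]) for row in csv.DictReader(handle)]
        if values:
            best_rmsds.append(min(values))
    if not best_rmsds:
        return None
    return sum(value < threshold for value in best_rmsds) / len(best_rmsds)
```

Here run_dir is one run folder inside inference_outputs, that is, the directory that contains config.yaml and the length_* subfolders.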

📖 Train from Scratch

Data preparation

We train our models on the Protein Data Bank (PDB) and SCOPe datasets separately. For PDB, we reprocessed the data following the steps described in FrameDiff; the detailed procedure is also available here. We also provide a demo PDB dataset in the data folder to help you test or debug. For SCOPe, we downloaded the dataset directly using the link provided by FrameFlow. The dataset paths are set in configs/_datasets.yaml.

QFlow

As with inference, you can control the training settings through the YAML files in configs. Taking QFlow training on the PDB dataset as an example, we specify the configuration in configs/train_pdb_base.yaml:

data:
  dataset: pdb
  rectify: False 

  sampler:
    # Setting for 80GB GPUs
    max_batch_size: 128
    max_num_res_squared: 1000000
  
experiment:
  is_training: True
  debug: False
  num_devices: 4
  warm_start: null # keep it null on first stage
  warm_start_cfg_override: True
  training:
    aux_loss_t_pass: 0.50
  wandb:
    name: reqflow_train_pdb_base
    project: reqflow
  checkpointer: # where to save checkpoints
    dirpath: ./ckpts/${experiment.wandb.project}/${experiment.wandb.name}/${now:%Y-%m-%d}_${now:%H-%M-%S}
    save_last: True
    save_top_k: -1

Also make sure the settings in configs/_datasets.yaml follow the instructions here.

The corresponding training command is:

python -W ignore experiments/train_se3_flows.py -cn train_pdb_base
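When adapting the sampler settings to GPUs with less memory, it helps to see how max_batch_size and max_num_res_squared might interact. The sketch below assumes a length-based batching rule in which a batch of proteins with num_res residues is capped so that batch_size * num_res^2 stays within max_num_res_squared. This rule is an assumption for illustration, not a description of the repository's sampler, and max_batch_for_length is a hypothetical helper.

```python
def max_batch_for_length(num_res, max_batch_size=128, max_num_res_squared=1_000_000):
    """Largest batch with batch_size * num_res**2 <= max_num_res_squared.

    Assumed rule for illustration; at least one protein is always allowed.
    """
    return max(1, min(max_batch_size, max_num_res_squared // num_res**2))


if __name__ == "__main__":
    for num_res in (60, 100, 300, 500):
        print(num_res, max_batch_for_length(num_res))
```

Under this assumed rule, short proteins are limited by max_batch_size and long proteins by max_num_res_squared, so lowering max_num_res_squared is the more direct way to reduce peak memory for long chains.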

ReQFlow

One of our key contributions is rectifying the SE(3) generation trajectories in Euclidean/quaternion space to accelerate inference and enhance the designability of the generated protein backbones. We rectify the QFlow model using generated noise-sample pairs (see noise.pdb and sample.pdb in inference_outputs).

We construct the rectify dataset by converting the generated .pdb files into a compatible format. You can follow instructions here to do it.
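Before running the conversion, you may want to see which noise-sample pairs a given inference run actually produced. The sketch below only gathers the file paths, relying on the inference_outputs layout shown earlier; it does not perform the format conversion described in the linked instructions, and collect_noise_sample_pairs is a hypothetical helper rather than a function from the repository.

```python
from pathlib import Path


def collect_noise_sample_pairs(run_dir):
    """Return (noise.pdb, sample.pdb) path pairs found under one inference run."""
    pairs = []
    for sample_dir in sorted(Path(run_dir).glob("length_*/sample_*")):
        noise = sample_dir / "noise.pdb"
        sample = sample_dir / "sample.pdb"
        # Keep only complete pairs; skip folders from interrupted runs.
        if noise.is_file() and sample.is_file():
            pairs.append((noise, sample))
    return pairs
```

Each returned tuple holds the starting noise and the final generated backbone for one sample, which is the pairing the rectification stage trains on.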

Once the rectify dataset is obtained, the training pipeline is the same as for QFlow. The configuration is in configs/train_pdb_rectify.yaml; make sure experiment.warm_start is set to the checkpoint obtained from first-stage training. The command to run it is:

python -W ignore experiments/train_se3_flows.py -cn train_pdb_rectify

Training on the SCOPe dataset follows the same procedure as for the PDB dataset.

👍 Acknowledgments

Thanks to FrameFlow, FrameDiff, and FoldFlow for their great work and codebases, which served as the foundation for developing ReQFlow.

📧 Contact Us

If you have any questions, please feel free to contact us via [email protected] or [email protected].
