keiserlab/autofragdiff

AutoFragDiff

This repository is the official implementation of "Autoregressive fragment-based diffusion model for target-aware ligand design".

Dependencies

  • RDKit
  • openbabel
  • PyTorch
  • biopython
  • biopandas
  • networkx
  • py3dmol
  • scikit-learn
  • tensorboard
  • wandb
  • pytorch-lightning

Create conda environment

conda create -n autofragdiff
conda activate autofragdiff
pip install rdkit
conda install -c conda-forge openbabel
pip3 install torch torchvision torchaudio 
pip install biopython
pip install biopandas
pip install networkx
pip install py3dmol
pip install scikit-learn
pip install tensorboard
pip install wandb
pip install tqdm
pip install pytorch-lightning==1.6.0

The model has been tested with the following software versions:

Software            Version
rdkit               2023.3.1
openbabel           3.1.1
pytorch             2.0.1
biopython           1.81
biopandas           0.4.1
networkx            3.1
py3dmol             2.0.1
scikit-learn        1.2.2
tensorboard         2.13.0
wandb               0.15.2
pytorch-lightning   1.6.0
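
One way to check an existing environment against the tested versions is to compare installed package metadata. The sketch below uses only the standard library. The distribution names (for example `torch` for PyTorch, or `py3Dmol`) are assumptions about how the packages were installed, and `find_mismatches` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Tested versions from the list above. Keys are ASSUMED pip distribution names.
TESTED_VERSIONS = {
    "rdkit": "2023.3.1",
    "openbabel": "3.1.1",
    "torch": "2.0.1",
    "biopython": "1.81",
    "biopandas": "0.4.1",
    "networkx": "3.1",
    "py3Dmol": "2.0.1",
    "scikit-learn": "1.2.2",
    "tensorboard": "2.13.0",
    "wandb": "0.15.2",
    "pytorch-lightning": "1.6.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Accept an exact match, or a local build tag such as "2.0.1+cu118".
        if found is None or not (found == wanted or found.startswith(wanted + "+")):
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(TESTED_VERSIONS).items():
        print(f"{name}: tested with {wanted}, found {found or 'not installed'}")
```

A mismatch is not necessarily a problem; the list only records what the model was tested with.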

QuickVina2

For docking with qvina, install QuickVina2:

wget https://github.com/QVina/qvina/raw/master/bin/qvina2.1
chmod +x qvina2.1 

MGLTools is also needed to prepare the receptor for docking (pdb -> pdbqt). Because it can interfere with the main conda environment, install it in a separate one:

conda create -n mgltools -c bioconda mgltools

Data Preparation

CrossDock

Download and extract the dataset as described by the authors of Pocket2Mol: https://github.com/pengxingang/Pocket2Mol/tree/main/data

Process the molecules into fragments using the custom fragmentation script:

python process_crossdock.py --rootdir $CROSSDOCK_PATH --outdir $OUT_DIR \
      --dist_cutoff 7. --max-num-frags 8 --split test --max-atoms-single-fragment 22 \
      --add-Vina-score --add-QED-score --add-SA-score --n-cores 16
  • To add Vina scores, you also need to generate pdbqt files for each receptor and crystallographic ligand.
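
A distance cutoff such as `--dist_cutoff 7.` usually restricts the pocket to receptor atoms close to the ligand. The sketch below shows that kind of selection in plain Python. Reading the flag as a 7 Å atom-to-atom cutoff is an assumption, and `select_pocket_atoms` is a hypothetical helper, not the logic in `process_crossdock.py`.

```python
import math


def select_pocket_atoms(receptor_coords, ligand_coords, cutoff=7.0):
    """Indices of receptor atoms within `cutoff` (in Å) of any ligand atom."""
    selected = []
    for idx, atom in enumerate(receptor_coords):
        # Keep the atom as soon as one ligand atom is close enough.
        if any(math.dist(atom, lig) <= cutoff for lig in ligand_coords):
            selected.append(idx)
    return selected
```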

Training

Training AutoFragDiff:

python train_frag_diffuser.py --data $CROSSDOCK_DIR  --exp_name CROSSDOCK_model_1 \
        --lr 0.0001 --n_layers 6  --nf 128  --diffusion_steps 500 \
       --diffusion_loss_type l2 --n_epochs 1000 --batch_size 4
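
For orientation, the diffusion-steps and `--diffusion_loss_type l2` options correspond to the usual ingredients of a denoising diffusion objective: a number of noising steps and a squared-error loss on the predicted noise. The sketch below is the generic textbook formulation with a linear beta schedule, in plain Python. The schedule, its endpoints, and the function names are assumptions and do not describe the code in `train_frag_diffuser.py`.

```python
import math
import random


def alpha_bars(n_steps=500, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    bars, running = [], 1.0
    for t in range(n_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(n_steps - 1, 1)
        running *= 1.0 - beta
        bars.append(running)
    return bars


def noise_sample(x0, alpha_bar, rng=random):
    """Forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps


def l2_loss(eps_true, eps_pred):
    """Mean squared error between the true and the predicted noise."""
    return sum((a - b) ** 2 for a, b in zip(eps_true, eps_pred)) / len(eps_true)
```

In training, a network would predict `eps` from `xt` and the step index, and `l2_loss` would be minimized over random steps.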

Training anchor predictor

python train_anchor_predictor.py --data $CROSSDOCK_DIR --exp_name CROSSDOCK_anchor_model_1 \
        --n_layers 4 --inv_sublayers 2 --nf 128 --dataset-type CrossDock

Sampling:

First, download the trained models from Google Drive at the following link:

https://drive.google.com/drive/folders/1DQwIfibHIoFPGJP6aHBGiYRp87bCZFA0?usp=share_link

CrossDock pocket-based molecule generation:

To generate molecules with the trained pocket-based model, also use the anchor-predictor model. Fragment sizes are sampled from the data distribution.
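
Sampling fragment sizes from the data distribution amounts to drawing from the empirical histogram of sizes seen in the training data. A minimal sketch, assuming sizes are atom counts per fragment. The function names are hypothetical and this is not the repository's sampler.

```python
import random
from collections import Counter


def build_size_distribution(observed_sizes):
    """Return (sizes, weights) for the empirical distribution of fragment sizes."""
    counts = Counter(observed_sizes)
    sizes = sorted(counts)
    return sizes, [counts[s] for s in sizes]


def sample_fragment_sizes(sizes, weights, n_fragments, rng=random):
    """Draw one size per fragment, with probability proportional to its count."""
    return rng.choices(sizes, weights=weights, k=n_fragments)
```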

CrossDock pocket-based molecule generation (with guidance):

To generate molecules for the CrossDock test set:

python sample_crossdock_mols.py --results-path results/ --data-path $(path-to-crossdock-dataset) --use-anchor-model --anchor-model anchor-model.ckpt --n-samples 20 --exp-name test-crossdock --diff-model pocket-gvp.ckpt --device cuda:0 

To sample molecules from a PDB file, first run fpocket to detect candidate pockets:

fpocket -f $pdb.pdb

fpocket returns multiple pockets. Visualize them to identify the right one, then run sampling:

python sample_from_pocket.py --result-path results --pdb $pdbname --anchor-model anchor-model.ckpt --n-samples 10 --device cuda:0 --pocket-number 1 
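
fpocket writes its results to a `<name>_out` directory next to the input file, with one `pocketN_atm.pdb` file per detected pocket under `pockets/`. The sketch below lists those files by pocket number so they can be inspected before choosing `--pocket-number`. It assumes that default output layout and that the script's pocket number follows fpocket's numbering. `list_fpocket_pockets` is a hypothetical helper, not part of this repository.

```python
import re
from pathlib import Path


def list_fpocket_pockets(pdb_path):
    """Map pocket number -> path of its pocketN_atm.pdb file, sorted by number."""
    pdb_path = Path(pdb_path)
    # Assumed layout: 1abc.pdb -> 1abc_out/pockets/pocket1_atm.pdb, ...
    pockets_dir = pdb_path.with_name(pdb_path.stem + "_out") / "pockets"
    found = {}
    for path in pockets_dir.glob("pocket*_atm.pdb"):
        match = re.fullmatch(r"pocket(\d+)_atm\.pdb", path.name)
        if match:
            found[int(match.group(1))] = path
    return dict(sorted(found.items()))
```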

Scaffold-based molecule property optimization

For scaffold-based optimization, you need the PDB file of the pocket and the SDF file of the scaffold molecule (and of the original molecule).

Scaffold extension for the CrossDock test set

python extend_scaffold_crossdock.py --data-path $(path-to-crossdock) --results-path scaffold-gen --anchor-model anchor-model.ckpt --n-samples 20 --exp-name scaffold-gen --diff-model pocket-gvp.ckpt --device cuda:0 
  • To choose the anchors yourself, add the --custom-anchors argument and provide the ids of the custom anchors (0-based atom ids in the scaffold molecule).
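
Because custom anchors are given as 0-based atom ids, it helps to list the scaffold atoms in file order before choosing them. The sketch below reads the atom block of a V2000 mol/SDF record with plain string handling. It assumes V2000 formatting, uses only the first record, and assumes the script's atom ids follow the file's atom order. `list_scaffold_atoms` is a hypothetical helper, not part of this repository.

```python
def list_scaffold_atoms(molblock):
    """Return [(index, element), ...] for the first V2000 record, 0-based."""
    lines = molblock.splitlines()
    # The fourth line is the counts line; its first 3 characters hold the atom count.
    n_atoms = int(lines[3][0:3])
    atoms = []
    for idx, line in enumerate(lines[4:4 + n_atoms]):
        # V2000 atom lines are fixed width: x, y, z (10 chars each), a space,
        # then the element symbol in columns 32-34.
        atoms.append((idx, line[31:34].strip()))
    return atoms
```

For a file on disk, pass its contents, for example `Path("scaffold.sdf").read_text()`, where the file name is a placeholder.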
