This repository contains scripts and notebooks to analyze results from mhdx_pipeline and hdxrate_pipeline associated with the Ferrari 2025 paper.
- Compute Cooperativities
- AF2 Prediction
- Rosetta Relaxation
- Hydrogen Bond Extraction
- Dependencies
- How to Cite
The code to compute cooperativities (normalized cooperativity and family-normalized cooperativity) from scratch can be found in:
Path: notebooks/DeriveCooperativities.ipynb
- Processed Data: Output from hdxrate_pipeline ({hdxrate_output}/consolidated_results/deduplicated.json).
- Structural Data: Dataframe with protein names and the number of expected protected amides based on their 3D structure.
- Structures from Ferrari 2025 were generated using AlphaFold2 via ColabFold, followed by RosettaRelax applied to the top-scoring model.
- Hydrogen bond information was extracted using a custom PyRosetta script.
- Protein family information was added by populating a PF column in the final hb.json dataframe.
Scripts for all these steps are provided. Ensure ColabFold and Rosetta are installed. Scripts include the exact parameters used in our study.
The resource files mhdx_pipeline/scripts/cooperativity/resources/20241030_param_table.json and mhdx_pipeline/scripts/cooperativity/resources/240917_cooperativity_std_mean_dict.json are provided; pass them to the script as follows:
python mhdx_pipeline/scripts/cooperativity/compute_cooperativity.py --hx deduplicated.json --hb hb.json --param_table mhdx_pipeline/scripts/cooperativity/resources/20241030_param_table.json --cooperativity_dict mhdx_pipeline/scripts/cooperativity/resources/240917_cooperativity_std_mean_dict.json --output results/df_HX_cooperativities.json
To perform structure prediction using AlphaFold2:
bash mhdx_analysis/scripts/feature_extraction/run0_structure_prediction.sh {folder-to-fasta-or-a3m-files}
To relax the AlphaFold2 predicted structures using Rosetta:
bash mhdx_analysis/scripts/feature_extraction/run1_rosetta.sh {af_prediction/pdbs/}
Extract hydrogen bond information from the relaxed structures.
Run for each structure:
python mhdx_analysis/scripts/feature_extraction/pdb2hbond.py --input "$rosetta_relax" --output "$hbonds_output"
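Running the script once per structure can be automated. The following is a minimal sketch, assuming the relaxed structures are .pdb files in a single directory and that one JSON file per structure is wanted. The helper names, the directory arguments, and the `_hbonds.json` suffix are illustrative and not taken from the repository; only the `--input` and `--output` flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = "mhdx_analysis/scripts/feature_extraction/pdb2hbond.py"


def build_commands(relaxed_dir, out_dir):
    """Return one pdb2hbond.py command per .pdb file found in relaxed_dir."""
    commands = []
    for pdb in sorted(Path(relaxed_dir).glob("*.pdb")):
        # Hypothetical naming scheme: <structure name>_hbonds.json
        out_file = Path(out_dir) / f"{pdb.stem}_hbonds.json"
        commands.append(
            [sys.executable, SCRIPT, "--input", str(pdb), "--output", str(out_file)]
        )
    return commands


def run_all(relaxed_dir, out_dir):
    """Run the extraction for every structure, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(relaxed_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `run_all("relaxed", "hbonds")` from the repository root would then process every structure in a hypothetical `relaxed` directory. Building the commands separately from running them makes it easy to print and inspect them first.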
After extraction, concatenate all H-bond JSON files and populate a column PF with the corresponding protein family.
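The concatenation step could look like the sketch below. It rests on several assumptions that are not confirmed by the repository: that each H-bond file holds a JSON list of records, that each record identifies its protein under a key called `name`, and that the protein-to-family mapping is supplied by the user as a dictionary. If the files were written in a different layout (for example by pandas with a column orientation), the loading step would need to be adapted.

```python
import json
from pathlib import Path


def concat_hbond_files(hbond_dir, family_by_protein, name_key="name"):
    """Merge per-structure H-bond JSON files and add a PF entry to each record."""
    merged = []
    for path in sorted(Path(hbond_dir).glob("*.json")):
        with open(path) as handle:
            records = json.load(handle)
        # Assumption: each file holds a list of dict records
        for record in records:
            record = dict(record)  # copy so the loaded data is left untouched
            # Proteins missing from the mapping get PF = None (null in JSON)
            record["PF"] = family_by_protein.get(record[name_key])
            merged.append(record)
    return merged


def write_hb_json(records, out_path="hb.json"):
    """Write the merged records to a single JSON file."""
    with open(out_path, "w") as handle:
        json.dump(records, handle, indent=2)
```

Records whose protein is absent from the mapping are kept with an empty PF value rather than dropped, so they can be found and filled in afterwards.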
Ensure the following tools and libraries are installed:
- ColabFold (version 1.5.2): https://github.com/sokrypton/ColabFold
- Rosetta (version 2019.11): https://rosettacommons.org/software/
- PyRosetta (version 4 2022.33): https://www.pyrosetta.org/
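A quick way to confirm that the tools are visible from the current environment is sketched below. It assumes that ColabFold is exposed as the `colabfold_batch` command and that PyRosetta is importable as `pyrosetta`. Rosetta executable names depend on the build, so none is checked by default; pass the name used on your system through `executables`. The check only reports presence, not versions.

```python
import importlib.util
import shutil


def check_dependencies(executables=("colabfold_batch",), modules=("pyrosetta",)):
    """Return a dict mapping each dependency name to True if it was found."""
    found = {}
    for exe in executables:
        # shutil.which returns None when the command is not on PATH
        found[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level module is not installed
        found[mod] = importlib.util.find_spec(mod) is not None
    return found
```

Printing the result of `check_dependencies()` before starting a long run gives an early warning if something is missing.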
If you use this repository, please cite the following paper:
Ferrari, A. et al. (2025). Title of the Paper. Journal Name. DOI: XXXXXX
For any questions or issues, please contact [email protected].