Deep Reaction

Efficient Prediction of Chemical Reactions

This repository contains DeepReaction, a project for accurate prediction of chemical reaction properties using graph neural networks.

📋 Table of Contents

Installation
Data Format
Dataset
Dataset Preparation
Training
Evaluation
Advanced Usage
Citation
Acknowledgements
License

🔧 Installation

Method 1: Using pip (Recommended)

# Clone this repository:
git clone https://github.com/chimie-paristech-CTM/DeepReaction.git
cd DeepReaction

# Install in development mode
pip install -e .

# (Optional) For Jupyter notebook support
pip install jupyterlab

Method 2: Using conda environment

# Clone this repository:
git clone https://github.com/chimie-paristech-CTM/DeepReaction.git
cd DeepReaction

# Create the conda environment from the environment.yml file
conda env create -f environment.yml

# Activate the environment
conda activate reaction

# (Optional) For Jupyter notebook support
pip install jupyterlab

⚠️ Note: The version of PyTorch Geometric (PyG) and its related packages must be selected according to your hardware configuration (e.g., CUDA version). Visit the official PyG installation guide to find the correct command for your system.

⚠️ Note: Graph neural network architectures built with PyG (PyTorch Geometric) are computationally demanding, so running them on a GPU is recommended.
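
Before picking a PyG install command, it can help to check which PyTorch build is already present. The sketch below is one way to do that; the helper name `describe_torch_build` is made up for illustration and is not part of DeepReaction. It only reads attributes that PyTorch itself exposes.

```python
import importlib.util


def describe_torch_build():
    """Return the installed PyTorch version, its CUDA version, and GPU availability."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed in this environment
    import torch

    return {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,  # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

The reported PyTorch and CUDA versions are the two values the PyG installation guide asks for when selecting matching wheels.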

📊 Data Format

DeepReaction requires a specific data format for training and prediction. The key components are:

CSV Input Format

Your main dataset file should be a CSV with the following essential columns:

Column	Description
`ID`	Unique identifier for each reaction
`R_dir`	Directory name containing XYZ files (e.g., "reaction_R0")
`smiles`	SMILES representation of the reaction
`DG_act`	Target property: Gibbs free energy of activation (kcal/mol)
`DrG`	Target property: Gibbs free energy of reaction (kcal/mol)
`DG_act_xtb`	Input feature: XTB-computed approximation of DG_act
`DrG_xtb`	Input feature: XTB-computed approximation of DrG

Example CSV row:

ID63623,reaction_R0,[C:1](=[C:2]([C:3](=[C:4]([H:11])[H:12])[H:10])[H:9])([H:7])[H:8].[C:5](=[C:6]([H:15])[H:16])([H:13])[H:14]>>[C:1]1([H:7])([H:8])[C:2]([H:9])=[C:3]([H:10])[C:4]([H:11])([H:12])[C:5]([H:13])([H:14])[C:6]1([H:15])[H:16],35.16,-22.54,21.70,-44.40
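
A quick way to sanity-check a CSV before training is to confirm that the columns are present and that the energy columns parse as numbers. The following is a minimal sketch using only the standard library. The function `load_reactions` is hypothetical, and the header line is an assumption, since the example row above is shown without one.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = ["ID", "R_dir", "smiles", "DG_act", "DrG", "DG_act_xtb", "DrG_xtb"]
NUMERIC_COLUMNS = ["DG_act", "DrG", "DG_act_xtb", "DrG_xtb"]


def load_reactions(handle):
    """Read reaction rows and convert the energy columns to float."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        for name in NUMERIC_COLUMNS:
            row[name] = float(row[name])
        rows.append(row)
    return rows


# Same row as above, with a shortened SMILES (no atom maps) for readability.
sample = io.StringIO(
    "ID,R_dir,smiles,DG_act,DrG,DG_act_xtb,DrG_xtb\n"
    "ID63623,reaction_R0,C=CC=C.C=C>>C1C=CCCC1,35.16,-22.54,21.70,-44.40\n"
)
rows = load_reactions(sample)
print(rows[0]["ID"], rows[0]["DG_act"], rows[0]["DrG"])
```

For a real file, pass `open("path/to/your.csv", newline="")` instead of the in-memory sample.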

XYZ File Structure

For each reaction in your dataset, provide three XYZ files, one for each of the following:

Reactant(s)
Transition state (TS)
Product(s)

The XYZ files should be organized in directories named according to the R_dir column in your CSV:

dataset_root/
├── reaction_R0/
│   ├── R0_reactant.xyz
│   ├── R0_ts.xyz
│   └── R0_product.xyz
└── reaction_R1/
    ├── R1_reactant.xyz
    ├── R1_ts.xyz
    └── R1_product.xyz
...
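
The layout above can be checked programmatically before training. The sketch below reports which of the three file patterns has no match in a reaction directory. The helper `missing_xyz_files` is hypothetical. The patterns are the `file_keywords` defaults listed under Important Configuration Parameters.

```python
import fnmatch
import tempfile
from pathlib import Path

# Default patterns quoted from the configuration parameters in this README.
FILE_KEYWORDS = ["*_reactant.xyz", "*_ts.xyz", "*_product.xyz"]


def missing_xyz_files(dataset_root, r_dir, patterns=FILE_KEYWORDS):
    """Return the patterns that have no matching file in dataset_root/r_dir."""
    folder = Path(dataset_root) / r_dir
    names = [p.name for p in folder.iterdir()] if folder.is_dir() else []
    return [pat for pat in patterns if not fnmatch.filter(names, pat)]


# Demo on a throwaway directory that is deliberately missing the TS file.
with tempfile.TemporaryDirectory() as root:
    reaction = Path(root) / "reaction_R0"
    reaction.mkdir()
    (reaction / "R0_reactant.xyz").touch()
    (reaction / "R0_product.xyz").touch()
    missing = missing_xyz_files(root, "reaction_R0")
    print(missing)
```

Running this over every `R_dir` value in the CSV gives a list of incomplete reactions before any training time is spent.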

XYZ File Format:

[Number of atoms]
[Optional comment line]
[Element] [X coordinate] [Y coordinate] [Z coordinate]
[Element] [X coordinate] [Y coordinate] [Z coordinate]
...
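
To illustrate the format, here is a minimal parser sketch. It is not DeepReaction's own reader, and the name `parse_xyz` is made up for this example. It reads the atom count, the comment line, and one element plus three coordinates per atom.

```python
def parse_xyz(text):
    """Parse XYZ text into (comment, [(element, x, y, z), ...])."""
    lines = text.strip("\n").splitlines()
    n_atoms = int(lines[0].split()[0])
    comment = lines[1].strip() if len(lines) > 1 else ""
    atoms = []
    for line in lines[2:2 + n_atoms]:
        element, x, y, z = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")
    return comment, atoms


# Illustrative two-atom file; the coordinates are not taken from the dataset.
sample = (
    "2\n"
    "hydrogen molecule (illustrative coordinates)\n"
    "H 0.000 0.000 0.000\n"
    "H 0.000 0.000 0.740\n"
)
comment, atoms = parse_xyz(sample)
print(len(atoms), atoms[1])
```

The atom-count check catches truncated files, which are a common cause of errors later in the pipeline.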

Important Configuration Parameters

When setting up your configuration, make sure to specify:

file_keywords: Patterns to identify XYZ files (default: ['*_reactant.xyz', '*_ts.xyz', '*_product.xyz'])
target_fields: Target properties to predict (default: ['DG_act', 'DrG'])
input_features: Features used as input (default: ['DG_act_xtb', 'DrG_xtb'])
id_field: Column name for reaction IDs (default: 'ID')
dir_field: Column name for directory names (default: 'R_dir')
reaction_field: Column name for reaction SMILES (default: 'reaction'; set it to 'smiles' if your CSV uses the column name shown in the CSV Input Format table)
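
The parameters above can be pictured as a small settings object. The dataclass below is a hypothetical container written for illustration, not the actual DeepReaction configuration class. Only the field names and default values come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class ReactionDataConfig:
    """Hypothetical container; not the actual DeepReaction config class."""

    file_keywords: list = field(
        default_factory=lambda: ["*_reactant.xyz", "*_ts.xyz", "*_product.xyz"]
    )
    target_fields: list = field(default_factory=lambda: ["DG_act", "DrG"])
    input_features: list = field(default_factory=lambda: ["DG_act_xtb", "DrG_xtb"])
    id_field: str = "ID"
    dir_field: str = "R_dir"
    reaction_field: str = "reaction"

    def required_columns(self):
        """List every CSV column these settings expect to find."""
        return [
            self.id_field,
            self.dir_field,
            self.reaction_field,
            *self.target_fields,
            *self.input_features,
        ]


# The CSV table in this README names the SMILES column `smiles`,
# so the default 'reaction' is overridden here.
config = ReactionDataConfig(reaction_field="smiles")
print(config.required_columns())
```

Comparing `required_columns()` against the CSV header is a cheap way to catch a mismatch between the configured field names and the file.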

🔍 Dataset

Diels-Alder Reaction Dataset

The models in DeepReaction were developed and tested using a comprehensive Diels-Alder reaction dataset:

Dataset link: Diels-Alder Reaction Space for Self-Healing Polymer

This dataset contains:

1,580 Diels-Alder reactions with complete 3D structures
Quantum chemical calculations (DFT and XTB) for transition states and energetics
Reaction energies, activation energies, and structural information
Computed properties including DG_act and DrG values

Download and Use

Download the dataset archive from the Figshare link above
Extract the contents to your desired location (recommended: ./dataset/DATASET_DA_F/)
Ensure the dataset has the correct structure as described in the XYZ File Structure section
Update the dataset paths in your configuration if needed

📁 Dataset Preparation

Place your reaction dataset in the appropriate location:

./dataset

Alternatively, modify the paths in the configuration file or command-line arguments.

💻 Training

Using Command Line Interface

To train the model on the dataset with the provided training script:

# Basic training with default parameters
python example/train.py

Available command line options

--readout: Readout function type (set_transformer, sum, mean, max, attention)
--batch: Batch size for training
--epochs: Maximum number of training epochs
--lr: Learning rate
--node-dim: Dimension of node latent representations
--output: Output directory for results
--reaction-root: Custom path to reaction dataset root, i.e., the location of the xyz files of reactants, products and TSs
--reaction-csv: Custom path to reaction dataset CSV
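
When running several training configurations, it can be convenient to assemble the command from Python. The sketch below builds an argument list from the options listed above. The helper `build_train_command` is hypothetical, all values are illustrative, and the CSV file name `dataset.csv` is an assumption rather than a documented path.

```python
import shlex
import sys


def build_train_command(script="example/train.py", **options):
    """Turn keyword options into an argv list, e.g. node_dim=128 -> --node-dim 128."""
    argv = [sys.executable, script]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


command = build_train_command(
    readout="mean",
    batch=32,
    epochs=100,
    lr=0.0005,
    node_dim=128,
    output="./results",
    reaction_root="./dataset/DATASET_DA_F",
    reaction_csv="./dataset/DATASET_DA_F/dataset.csv",  # assumed file name
)
print(shlex.join(command[1:]))
# To launch the run: subprocess.run(command, check=True)
```

Using `sys.executable` keeps the run inside the same environment (for example the `reaction` conda environment) as the calling script.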

📈 Evaluation

To evaluate a trained model (checkpoint link):

python example/predict.py

The prediction notebook allows you to:

Load a trained model checkpoint
Make predictions on new data
Visualize prediction results
Compare predictions with actual values (if available)
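
When reference values are available, the comparison is usually summarized with error metrics. The sketch below computes the mean absolute error and root-mean-square error in plain Python. The function names and the numbers are illustrative and are not results from DeepReaction.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and reference values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted, actual):
    """Square root of the average squared difference."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Made-up DG_act values in kcal/mol, for illustration only.
predicted = [34.0, 21.5, 28.0]
actual = [35.0, 20.5, 30.0]

print(f"MAE:  {mean_absolute_error(predicted, actual):.3f} kcal/mol")
print(f"RMSE: {root_mean_square_error(predicted, actual):.3f} kcal/mol")
```

RMSE weights large errors more heavily than MAE, so reporting both gives a clearer picture of outliers.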

🔧 Advanced Usage

Hyperparameter Optimization

# Run hyperparameter optimization
python example/hyper.py

📝 Citation

If you use DeepReaction or the Diels-Alder dataset in your research, please cite:

[Placeholder]

🙏 Acknowledgements

This implementation is built upon several open-source projects:

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
