SynLlama is a fine-tuned version of Meta's Llama 3 large language models that generates synthesizable analogs of small molecules by constructing full synthetic pathways from commonly accessible building blocks and robust organic reaction templates. It offers a valuable tool for drug discovery, with strong performance in bottom-up synthesis, synthesizable analog generation, and hit expansion.
Ensure you have conda installed on your system. All additional dependencies will be managed via the environment.yml file.
To get started with SynLlama, follow these steps:
git clone https://github.com/THGLab/SynLlama
cd SynLlama
conda env create -f environment.yml
conda activate synllama
pip install -e .
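After the steps above, a quick sanity check can confirm the environment was created and the package is importable. This is a minimal sketch, assuming the editable install exposes a Python package named synllama (inferred from the repository name, not confirmed by the source):

```shell
# Activate the environment created from environment.yml
conda activate synllama

# Assumed package name "synllama"; adjust if the installed package differs
python -c "import synllama; print('SynLlama environment OK')"
```

If the import fails, re-run `pip install -e .` from the repository root inside the activated environment.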
To perform inference with the already trained SynLlama, download the trained models and relevant files from here, then follow the instructions in the Inference Guide.
If you are interested in retraining the model, please refer to the Retraining Guide for detailed instructions.
This project is licensed under the MIT License; see the LICENSE file for details.
This project is built on top of the ChemProjector Repo. We thank the authors for building such a user-friendly GitHub repository!
If you use this code in your research, please cite:
@misc{sun_synllama_2025,
  title = {SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models},
  url = {http://arxiv.org/abs/2503.12602},
  doi = {10.48550/arXiv.2503.12602},
  publisher = {arXiv},
  author = {Sun, Kunyang and Bagni, Dorian and Cavanagh, Joseph M. and Wang, Yingze and Sawyer, Jacob M. and Gritsevskiy, Andrew and Head-Gordon, Teresa},
  month = mar,
  year = {2025}
}