Release single-step evaluation framework and wrappers for several model types (#14)

This (fairly large) PR releases the majority of the single-step
evaluation code, including the eval script itself, inference wrappers
for seven model types, and glue code to plug these models into search.

A few notable things which are still missing, to be added in future PRs
to avoid making this one even larger:
- Detailed setup instructions for each model type (e.g. environment
definitions).
- Model checkpoints for each model type.
- More tests covering `syntheseus/reaction_prediction` (currently the
coverage is quite low, partly because the wrappers themselves cannot be
tested easily without setting up dedicated environments for each, but
there are also places which are just not adequately covered by tests
yet).
kmaziarz committed Jul 14, 2023
1 parent 6465917 commit 9de0f47
Showing 40 changed files with 2,559 additions and 21 deletions.
6 changes: 6 additions & 0 deletions .pre-commit-config.yaml
@@ -1,3 +1,8 @@
# This should not be necessary, except that `conda<4.11` has a bug dealing with `python>=3.10`
# (see https://github.com/conda/conda/issues/10969), and the below makes that go away.
default_language_version:
  python: python3

repos:
# Generally useful pre-commit hooks
- repo: https://github.com/pre-commit/pre-commit-hooks
@@ -37,6 +42,7 @@ repos:
  - id: mypy
    name: "mypy"
    files: "syntheseus/"
    args: ["--install-types", "--non-interactive"]

# Latest ruff (does linting + more)
- repo: https://github.com/charliermarsh/ruff-pre-commit
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.

### Added

- Release single-step evaluation framework and wrappers for several model types ([#14](https://github.com/microsoft/syntheseus/pull/14)) ([@kmaziarz])
- Add option to terminate search when the first solution is found ([#13](https://github.com/microsoft/syntheseus/pull/13)) ([@austint])
- Add code to extract routes in order found instead of by minimum cost ([#9](https://github.com/microsoft/syntheseus/pull/9)) ([@austint])
- Declare support for type checking ([#4](https://github.com/microsoft/syntheseus/pull/4)) ([@kmaziarz])
34 changes: 18 additions & 16 deletions README.md
@@ -14,33 +14,35 @@ of retrosynthesis algorithms.

## Installation

Syntheseus is designed to have very few dependencies to allow it to
be run in a wide range of environments.
At the moment the only hard dependencies are `numpy`, `rdkit`, and `networkx`.
It should be easy to install syntheseus into any environment which has these packages.

Currently `syntheseus` is not hosted on `pypi`
Currently `syntheseus` is not hosted on PyPI
(although this will likely change in the future).
To install, please run:

```bash
# Clone and cd into repo
# Clone and cd into the repository.
git clone https://github.com/microsoft/syntheseus.git
cd syntheseus

# Option 1: minimal install into current environment.
# Assumes dependencies are already present in your environment.
pip install . --no-dependencies
# Create and activate a new conda environment (or use your own).
conda env create -f environment.yml
conda activate syntheseus

# Option 2: pip install with dependencies into current environment.
pip install .
# Install into the current environment.
pip install -e .
```

# Option 3: create new conda environment and then install.
conda env create -f environment.yml # creates env named syntheseus
conda activate syntheseus
pip install .
Syntheseus contains two subpackages: `reaction_prediction`, which deals with benchmarking single-step reaction models, and `search`, which can use any single-step model to perform multi-step search.
Each is designed to have minimal dependencies, allowing it to run in a wide range of environments.
While specific components (single-step models, policies, or value functions) can make use of Deep Learning libraries, the core of `syntheseus` does not depend on any.

If you only want to use one of the two subpackages, you can limit the dependencies further by installing the dependencies separately and then running

```bash
pip install -e . --no-dependencies
```

See `pyproject.toml` for a list of dependencies tied to each subpackage.
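
Before a `--no-dependencies` install, it can help to check which packages a given subpackage still needs. The following is a minimal sketch, not part of `syntheseus`: the `DEPENDENCIES` table is transcribed by hand from the comments in this PR's `pyproject.toml`, and `missing_dependencies` is a hypothetical helper name.

```python
import importlib.util
from typing import Callable, Dict, List, Tuple

# Import names per subpackage, copied by hand from the comments in pyproject.toml.
# This table is an assumption of the sketch and has to be kept in sync manually.
DEPENDENCIES: Dict[str, Tuple[str, ...]] = {
    "reaction_prediction": (
        "more_itertools",
        "numpy",
        "omegaconf",
        "pydantic",
        "rdkit",
        "tqdm",
    ),
    "search": ("networkx", "numpy", "rdkit"),
}


def _is_installed(module_name: str) -> bool:
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


def missing_dependencies(
    subpackage: str, is_installed: Callable[[str], bool] = _is_installed
) -> List[str]:
    """Return the dependencies of `subpackage` that are not importable."""
    return [name for name in DEPENDENCIES[subpackage] if not is_installed(name)]


if __name__ == "__main__":
    for name in DEPENDENCIES:
        print(name, "is missing:", missing_dependencies(name))
```

The `is_installed` parameter exists only so the check can be exercised without touching the real environment; the version bounds on `pydantic` are not verified here.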

## Development

Syntheseus is currently under active development and does not have a fixed API
5 changes: 4 additions & 1 deletion environment.yml
@@ -20,5 +20,8 @@ dependencies:
  # Temporary pin to avoid the 3.16.0 release, which drops support for Python 3.7
  - zipp<3.16
  - pip:
    # Optional dependency for the single-step model interface
    # Additional dependencies of `syntheseus/reaction_prediction`
    - more_itertools
    - omegaconf
    - pydantic>=1.10.5,<2 # earlier versions had a bug involving `default_factory` (see https://github.com/pydantic/pydantic/issues/5065), later are backward incompatible
    - tqdm
13 changes: 9 additions & 4 deletions pyproject.toml
@@ -16,9 +16,13 @@ requires-python = ">=3.7"
license = {file = "LICENSE"}
dynamic = ["version"]
dependencies = [
"numpy",
"rdkit",
"networkx",
"more_itertools", # reaction_prediction
"networkx", # search
"numpy", # reaction_prediction, search
"omegaconf", # reaction_prediction
"pydantic>=1.10.5,<2", # reaction_prediction
"rdkit", # reaction_prediction, search
"tqdm", # reaction_prediction
]

[project.optional-dependencies]
@@ -48,8 +52,9 @@ namespaces = false
line-length = 100
include = '\.pyi?$'

[tool.mypy.overrides]
[tool.mypy]
python_version = 3.9 # pin modern python version
ignore_missing_imports = true

[tool.ruff]
line-length = 100
Empty file.
Empty file.
45 changes: 45 additions & 0 deletions syntheseus/reaction_prediction/chem/utils.py
@@ -0,0 +1,45 @@
from typing import Optional

from rdkit import Chem

from syntheseus.interface.bag import Bag
from syntheseus.interface.molecule import SMILES_SEPARATOR, Molecule

ATOM_MAPPING_PROP_NAME = "molAtomMapNumber"


def remove_atom_mapping_from_mol(mol: Chem.Mol) -> None:
    """Removes the atom mapping from an rdkit molecule, modifying it in place."""
    for atom in mol.GetAtoms():
        atom.ClearProp(ATOM_MAPPING_PROP_NAME)


def remove_atom_mapping(smiles: str) -> str:
    """Removes the atom mapping from a SMILES string.

    Args:
        smiles: Molecule SMILES to be modified.

    Returns:
        str: Input SMILES with atom map numbers stripped away.
    """
    mol = Chem.MolFromSmiles(smiles)
    remove_atom_mapping_from_mol(mol)

    return Chem.MolToSmiles(mol)


def molecule_bag_from_smiles_strict(smiles: str) -> Bag[Molecule]:
    return Bag([Molecule(component) for component in smiles.split(SMILES_SEPARATOR)])


def molecule_bag_from_smiles(smiles: str) -> Optional[Bag[Molecule]]:
    try:
        return molecule_bag_from_smiles_strict(smiles)
    except ValueError:
        # If any of the components ends up invalid we return `None` instead.
        return None


def molecule_bag_to_smiles(mols: Bag[Molecule]) -> str:
    return SMILES_SEPARATOR.join(mol.smiles for mol in mols)
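
The helpers in `chem/utils.py` follow two patterns: stripping atom-map numbers from SMILES, and pairing a strict parser that raises with a lenient one that returns `None`. The sketch below illustrates both without RDKit or `syntheseus`, so everything in it is a stand-in: a regular expression replaces the RDKit round trip (and therefore does not canonicalize), `collections.Counter` plays the role of `Bag`, `parse_component` is a hypothetical validator in place of `Molecule(...)`, and `SMILES_SEPARATOR = "."` is an assumed value.

```python
import re
from collections import Counter
from typing import Optional

# Assumed value; the real constant is imported from syntheseus.interface.molecule.
SMILES_SEPARATOR = "."

# Matches the ":<digits>" atom-map suffix at the end of a bracket atom, e.g. "[CH3:1]".
_ATOM_MAP_PATTERN = re.compile(r":\d+(?=\])")


def strip_atom_map_numbers(smiles: str) -> str:
    """Text-only stand-in for `remove_atom_mapping`; the output is not canonicalized."""
    return _ATOM_MAP_PATTERN.sub("", smiles)


def parse_component(component: str) -> str:
    """Hypothetical validator standing in for `Molecule(...)`: rejects empty components."""
    if not component:
        raise ValueError("empty SMILES component")
    return component


def bag_from_smiles_strict(smiles: str) -> Counter:
    # A Counter stands in for `Bag`: an unordered collection that keeps duplicates.
    return Counter(parse_component(c) for c in smiles.split(SMILES_SEPARATOR))


def bag_from_smiles(smiles: str) -> Optional[Counter]:
    # Lenient variant: a single invalid component makes the whole result `None`.
    try:
        return bag_from_smiles_strict(smiles)
    except ValueError:
        return None


if __name__ == "__main__":
    print(strip_atom_map_numbers("[CH3:1][OH:2]"))
    print(bag_from_smiles("CC.O.O"))
    print(bag_from_smiles("CC..O"))
```

Returning `None` from the lenient variant lets callers that score model predictions treat an unparsable output as a miss rather than handling exceptions at every call site.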
Empty file.