Merge pull request #32 from aced-differentiate/develop
Add learning modules and documentation
Showing 70 changed files with 8,039 additions and 247 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
name: deploy-pages | ||
on: | ||
push: | ||
branches: | ||
- master | ||
|
||
jobs: | ||
deploy: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.x | ||
- run: pip install mkdocs-material | ||
- run: pip install mkdocstrings | ||
- run: mkdocs gh-deploy --force |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
include src/autocat/data/**/*.json | ||
include src/autocat/VERSION.txt | ||
include bin/autocat | ||
include bin/autocat | ||
include CONTRIBUTING.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
::: autocat.learning.featurizers |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
::: autocat.learning.predictors |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
::: autocat.learning.sequential |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
::: autocat.adsorption |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
::: autocat.bulk |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Single Atom Alloys | ||
|
||
::: autocat.saa |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
::: autocat.surface |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
# AutoCat Documentation | ||
|
||
![AutoCat Logo](img/autocat_logo.png){ align=right } | ||
|
||
AutoCat is a suite of Python tools for **sequential learning for materials applications** | ||
and **automating structure generation for DFT catalysis studies.** | ||
|
||
Development of this package stems from [ACED](https://www.cmu.edu/aced/), as part of the | ||
ARPA-E DIFFERENTIATE program. | ||
|
||
Below we provide an overview of the key functionalities of AutoCat. | ||
For additional details please see the User Guide, Tutorials, and API sections. | ||
|
||
## Sequential Learning | ||
|
||
One of the core philosophies of AutoCat is to provide modular and extensible tooling to | ||
facilitate closed-loop computational materials discovery workflows. Within this submodule | ||
are classes for defining a design space, featurization, | ||
regression, and defining a closed-loop sequential learning iterator. The | ||
key classes intended for each of these purposes are: | ||
|
||
- [**`DesignSpace`**](User_Guide/Learning/sequential#designspace): define a design space to explore | ||
|
||
- [**`Featurizer`**](User_Guide/Learning/featurizers): featurize the systems for regression | ||
|
||
- [**`Predictor`**](User_Guide/Learning/predictors): a regressor for predicting materials properties | ||
|
||
- [**`SequentialLearner`**](User_Guide/Learning/sequential#sequentiallearner): define a closed-loop iterator | ||
|
||
|
||
## Structure Generation | ||
|
||
![Adsorption Figure](img/struct_gen_figs/adsorption.png){ align=right } | ||
|
||
This submodule contains functions for automating atomic structure generation | ||
within the context of a catalysis study using density functional theory. | ||
Specifically, this includes generating bulk structures, surfaces, and | ||
placing adsorbates. In addition, functions for generating the single-atom alloys | ||
material class are also included. These functions are organized within AutoCat as follows: | ||
|
||
- [**`autocat.bulk`**](User_Guide/Structure_Generation/bulk): generation of periodic | ||
mono-elemental bulk structures | ||
|
||
- [**`autocat.surface`**](User_Guide/Structure_Generation/surface): mono-elemental surface slab generation | ||
|
||
- [**`autocat.adsorption`**](User_Guide/Structure_Generation/adsorption): placement of adsorbates onto surfaces | ||
|
||
- [**`autocat.saa`**](User_Guide/Structure_Generation/saa): generation of single-atom alloy surfaces | ||
|
||
Structures generated or read with this package are typically in the form of | ||
[`ase.Atoms`](https://wiki.fysik.dtu.dk/ase/ase/atoms.html#module-ase.atoms) | ||
objects. | ||
|
||
When you opt to write structures to | ||
disk using these functions, they are automatically organized into a clean, scalable directory structure. | ||
All structures are written in the | ||
[`ase.io.Trajectory`](https://wiki.fysik.dtu.dk/ase/ase/io/trajectory.html#trajectory) | ||
file format. | ||
For further details on the directory structure, see the User Guide. | ||
|
||
## Installation | ||
|
||
There are two options for installation, either via `pip` or from the repo directly. | ||
|
||
### `pip` (recommended) | ||
|
||
If you are planning on strictly using AutoCat rather than contributing to development, | ||
we recommend using `pip` within a virtual environment (e.g. | ||
[`conda`](https://docs.conda.io/en/latest/) | ||
). This can be done | ||
as follows: | ||
|
||
``` | ||
pip install autocat | ||
``` | ||
|
||
### GitHub (for developers) | ||
|
||
Alternatively, if you would like to contribute to the development of this software, | ||
AutoCat can be installed via a clone from GitHub. First, you'll need to clone the | ||
GitHub repo to your local machine (or wherever you'd like to use AutoCat) using | ||
`git clone`. Once the repo has been cloned, you can install AutoCat as an editable | ||
package by changing into the created directory (the one with `setup.py`) and installing | ||
via: | ||
``` | ||
pip install -e . | ||
``` | ||
|
||
## Contributing | ||
Contributions through issues, feature requests, and pull requests are welcome. | ||
Guidelines are provided here. |
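Review note: the closed-loop workflow described in the Sequential Learning section above (design space, featurization, regression, iteration) can be sketched roughly as follows. This is a pure-Python toy under stated assumptions, not AutoCat's implementation: `toy_featurize`, `NearestNeighborModel`, `run_closed_loop`, the `oracle` callable, and the "pick the highest prediction" acquisition rule are all hypothetical and chosen only for illustration. Unknown labels are marked with NaN.

```python
import math


def toy_featurize(x):
    # Hypothetical featurizer: maps a candidate (a plain float here) to a feature vector.
    return (x, x * x)


class NearestNeighborModel:
    """Hypothetical regressor with sklearn-like fit/predict methods."""

    def fit(self, X, y):
        self._X, self._y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Return the label of the closest training point.
            nearest = min(range(len(self._X)), key=lambda i: math.dist(x, self._X[i]))
            preds.append(self._y[nearest])
        return preds


def run_closed_loop(candidates, labels, oracle, n_iterations):
    """Toy loop: fit on known labels, predict the unknown ones, evaluate the
    candidate with the highest prediction (an assumed acquisition rule)."""
    labels = list(labels)
    for _ in range(n_iterations):
        known = [i for i, y in enumerate(labels) if not math.isnan(y)]
        unknown = [i for i, y in enumerate(labels) if math.isnan(y)]
        if not unknown:
            break
        model = NearestNeighborModel().fit(
            [toy_featurize(candidates[i]) for i in known],
            [labels[i] for i in known],
        )
        preds = model.predict([toy_featurize(candidates[i]) for i in unknown])
        best = unknown[max(range(len(unknown)), key=lambda j: preds[j])]
        # "Run the expensive calculation" for the chosen candidate.
        labels[best] = oracle(candidates[best])
    return labels


final_labels = run_closed_loop(
    candidates=[0.0, 1.0, 2.0, 3.0],
    labels=[0.0, math.nan, math.nan, 9.0],
    oracle=lambda x: x * x,  # stand-in for a DFT calculation
    n_iterations=2,
)
```

In a real workflow the oracle would be a DFT calculation and the acquisition rule would typically balance exploration against exploitation; both are simplified here.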
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,131 @@ | ||
In this tutorial we are going to show how to use the learning tools within | ||
AutoCat to train a regressor that can predict adsorption energies of hydrogen | ||
on a set of single-atom alloys. | ||
|
||
## Creating a `DesignSpace` | ||
|
||
Let's start by creating a `DesignSpace`. Normally each of these | ||
structures would be optimized via DFT, but for demo purposes | ||
we'll use the generated structures directly. First we need to generate the single-atom | ||
alloys. Here, we can use AutoCat's | ||
[`generate_saa_structures`](../API/Structure_Generation/saa.md#autocat.saa.generate_saa_structures) | ||
function. | ||
|
||
```py | ||
>>> # Generate the clean single-atom alloy structures | ||
>>> from autocat.saa import generate_saa_structures | ||
>>> from autocat.utils import extract_structures | ||
>>> saa_struct_dict = generate_saa_structures( | ||
... ["Fe", "Cu", "Au"], | ||
... ["Pt", "Pd", "Ni"], | ||
... facets={"Fe":["110"], "Cu":["111"], "Au":["111"]}, | ||
... n_fixed_layers=2, | ||
... ) | ||
>>> saa_structs = extract_structures(saa_struct_dict) | ||
``` | ||
|
||
Now that we have the clean structures, let's adsorb hydrogen on the surface. | ||
For convenience let's place H at the origin instead of considering all symmetry sites. | ||
To accomplish this we can make use of AutoCat's | ||
[`place_adsorbate`](../API/Structure_Generation/adsorption.md#autocat.adsorption.place_adsorbate) | ||
function. | ||
|
||
```py | ||
>>> # Adsorb hydrogen onto each of the generated SAA surfaces | ||
>>> from autocat.adsorption import place_adsorbate | ||
>>> ads_structs = [] | ||
>>> for clean_struct in saa_structs: | ||
... ads_dict = place_adsorbate( | ||
... clean_struct, | ||
... "H", | ||
... (0.,0.) | ||
... ) | ||
... ads_struct = extract_structures(ads_dict)[0] | ||
... ads_structs.append(ads_struct) | ||
``` | ||
|
||
This has collected all of the single-atom alloys with hydrogen adsorbed into | ||
a single list of `ase.Atoms` objects, `ads_structs`. Ideally at this stage we'd have | ||
adsorption energies for each of the generated structures after relaxation. As a proxy | ||
in this demo we'll create random labels, but these should be adsorption energies if you | ||
want to train a meaningful `Predictor`! | ||
|
||
```py | ||
>>> # Generate the labels for each structure | ||
>>> import numpy as np | ||
>>> labels = np.random.uniform(-1.5,1.5,size=len(ads_structs)) | ||
``` | ||
|
||
Finally, using both our structures and labels we can define a `DesignSpace`. In practice, | ||
if the label for a structure is unknown, it can be included as `numpy.nan`. | ||
|
||
```py | ||
>>> from autocat.learning.sequential import DesignSpace | ||
>>> design_space = DesignSpace(ads_structs, labels) | ||
``` | ||
|
||
## Setting up a `Predictor` | ||
|
||
When setting up our `Predictor` we now have two choices to make: | ||
|
||
1. The technique to be used for featurizing the systems | ||
2. The regression model to be used for training and predictions | ||
|
||
Internally, the `Predictor` will contain a `Featurizer` object which contains all of | ||
our choices for how to featurize the systems. Our choice of featurizer class and | ||
the associated kwargs are specified via the `featurizer_class` and | ||
`featurization_kwargs` arguments, respectively. By providing the design space structures | ||
some of the kwargs related to the featurization (e.g. maximum structure size) can be | ||
automatically obtained. | ||
|
||
Similarly, we can specify the regressor to be used within the `model_class` and | ||
`model_kwargs` arguments. The class should be "`sklearn`-like" with `fit` and | ||
`predict` methods. | ||
|
||
Let's featurize the hydrogen environment via `dscribe`'s `SOAP` class with | ||
`sklearn`'s `GaussianProcessRegressor` for regression. | ||
|
||
```py | ||
>>> from sklearn.gaussian_process import GaussianProcessRegressor | ||
>>> from sklearn.gaussian_process.kernels import RBF | ||
>>> from dscribe.descriptors import SOAP | ||
>>> from autocat.learning.predictors import Predictor | ||
>>> kernel = RBF(1.5) | ||
>>> model_kwargs={"kernel": kernel} | ||
>>> featurization_kwargs={ | ||
... "design_space_structures": design_space.design_space_structures, | ||
... "kwargs": {"rcut": 7.0, "nmax": 8, "lmax": 8} | ||
... } | ||
>>> predictor = Predictor( | ||
... model_class=GaussianProcessRegressor, | ||
... model_kwargs=model_kwargs, | ||
... featurizer_class=SOAP, | ||
... featurization_kwargs=featurization_kwargs, | ||
... ) | ||
``` | ||
|
||
## Training and making predictions | ||
|
||
With our newly defined `Predictor` we can train it using data from our | ||
`DesignSpace` and the `fit` method. | ||
|
||
```py | ||
>>> train_structures = design_space.design_space_structures[:5] | ||
>>> train_labels = design_space.design_space_labels[:5] | ||
>>> predictor.fit(train_structures, train_labels) | ||
``` | ||
|
||
Making predictions is a similar process except using the `predict` method. | ||
|
||
```py | ||
>>> test_structures = design_space.design_space_structures[5:] | ||
>>> predicted_labels = predictor.predict(test_structures) | ||
``` | ||
|
||
In this example, since we already have the labels for the test structures, we can | ||
also use the `score` method to calculate a prediction score. | ||
|
||
```py | ||
>>> test_labels = design_space.design_space_labels[5:] | ||
>>> mae = predictor.score(test_structures, test_labels) | ||
``` |
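Review note: the tutorial above says that `model_class` only needs to be "`sklearn`-like" (exposing `fit` and `predict`) and that unknown labels can be stored as NaN. A minimal pure-Python sketch of both ideas follows. `MeanRegressor`, `split_known_unknown`, and `toy_mae` are hypothetical names used only for illustration; they are not part of AutoCat or scikit-learn. The tutorial stores the result of `score` in a variable named `mae`, but this diff does not confirm which metric `score` computes, so the mean absolute error below is an assumption.

```python
import math


class MeanRegressor:
    """Hypothetical 'sklearn-like' model: always predicts the mean training label."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def split_known_unknown(labels):
    # Indices of labelled entries versus entries marked as NaN (unknown).
    known = [i for i, y in enumerate(labels) if not math.isnan(y)]
    unknown = [i for i, y in enumerate(labels) if math.isnan(y)]
    return known, unknown


def toy_mae(y_true, y_pred):
    # Mean absolute error between two equal-length sequences.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


features = [[0.0], [1.0], [2.0], [3.0]]
labels = [1.0, 3.0, float("nan"), 2.0]  # third label is unknown

known, unknown = split_known_unknown(labels)
model = MeanRegressor().fit([features[i] for i in known], [labels[i] for i in known])
predictions = model.predict([features[i] for i in unknown])
train_error = toy_mae(
    [labels[i] for i in known],
    model.predict([features[i] for i in known]),
)
```

Any class with this `fit`/`predict` shape should be usable wherever the tutorial passes `GaussianProcessRegressor`, provided it accepts the featurized inputs.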