Expanding conformers in a past Dataset #85

Open
jthorton opened this issue Jan 15, 2021 · 1 comment

@jthorton
Contributor

Users may want to add extra conformers to an existing dataset without re-running the entire dataset through the factory, since the dataset might be large and the order of the conformers may change. One way this can currently be done is the following:

from qcsubmit.datasets import load_dataset

dataset = load_dataset("dataset.json")
# loop through the dataset and generate new conformers for each entry
for entry in dataset.dataset.values():
    # rebuild the molecule for this entry and generate conformers on it
    mol = entry.get_off_molecule()
    mol.generate_conformers()
    # n_conformers is an int, so iterate over its range to get each index
    for i in range(mol.n_conformers):
        entry.initial_molecules.append(mol.to_qcschema(conformer=i))

Here users just need to check that the same conformer is not added to the entry twice. Maybe we can add some functions to the datasets that do this check automatically?
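As a rough illustration of what such a duplicate check could look like, here is a minimal sketch in plain Python. The helper names `rmsd` and `add_unique_conformers` and the `rms_cutoff` default are hypothetical and not part of qcsubmit. It assumes each conformer is a flat `[x0, y0, z0, x1, ...]` coordinate list with the same atom ordering, and it compares coordinates without any alignment, so a rotated or translated copy would still count as a new conformer.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two flat [x0, y0, z0, x1, ...] coordinate lists (no alignment)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("conformers must have the same number of coordinates")
    n_atoms = len(coords_a) // 3
    squared = sum((a - b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / n_atoms)


def add_unique_conformers(existing, candidates, rms_cutoff=0.1):
    """Return existing conformers plus candidates that differ from all kept ones.

    Hypothetical helper: existing conformers keep their order, and a candidate
    is appended only if its RMSD to every kept conformer exceeds rms_cutoff
    (same length units as the coordinates).
    """
    kept = list(existing)
    for cand in candidates:
        if all(rmsd(cand, ref) > rms_cutoff for ref in kept):
            kept.append(cand)
    return kept


# Toy two-atom example: the first candidate repeats the existing conformer,
# the second moves the second atom and so is kept.
existing = [[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]]
candidates = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
]
result = add_unique_conformers(existing, candidates)
```

Because new conformers are only ever appended, the indices of conformers already in the dataset stay the same, which is the ordering concern raised above. A real implementation would probably want an aligned, symmetry-aware RMSD rather than this raw coordinate comparison.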

@trevorgokey
Contributor

Thanks so much for detailing this! My initial reaction is that it "would be nice" if we could run this through the conformer component so we get the provenance and all the goodness from the exposed options (e.g. rms_cutoff). I'll take a look and see what comes of it.
