Address open questions on SMIRNOFF format spec revamp #48

davidlmobley · 2018-10-16T22:14:12Z

https://open-forcefield-toolkit.readthedocs.io/en/topology/smirnoff.html lists a number of open questions raised by @jchodera (and relevant to @j-wags) on the SMIRNOFF format spec. I'll give draft answers to some of them here:

"What should we name the parameter sections?" I'd call them "parameter sections", since the other suggested terms ("force field terms", "force classes", "interaction types") all seem either already loaded with meaning or overly narrow.
"Should we have a separate metadata or provenance section that users can add whatever they want to?" Seems like a good idea to me. Probably "metadata".
"Do we want to allow users to specify atomic or particle masses (e.g. heavy hydrogens)?" I am OK with this but also ambivalent.
"Should individual force classes [parameter sections] be versioned, rather than having a global SMIRNOFF version?" I am not certain. Initially I'd go for whatever is simplest to implement, in order to finish sooner. I can see the use case for versioning specific parameter sections: if a force field uses only sections whose versions are unchanged, it doesn't have to be upgraded to work with the new spec. But that does add more work to implementing, more work checking for updates, etc.
"Should we expand fractional bond orders beyond <Bonds>?" Yes! That is supposed to be implemented! We must have lost the issue for finishing that implementation when we migrated repos. @j-wags take note; we need to apply the same infrastructure that we use for Bonds to angles and torsions.
XML schema: I don't know, but since we're heading towards various serialization options that probably means no?
"Handling extra XML tag attributes": Agree that having tags there which the parser does not understand should raise an error or warning (perhaps an option as to which). Things like id are very useful though. Maybe we should have some optional tags (like id) which won't raise warnings/exceptions if present, but if anything other than required/optional tags is present it will raise an exception?
"Should we integrate ParmEd and InterMol functionality by adding create() methods for other simulation packages?" Not at present, as this makes us responsible for them NOW. Better not to be responsible for them yet.
"Add link to complete open specification of OEAroModel_MDL aromaticity model" Caitlin can pull this from the SMIRNOFF paper/her notes and provide when someone is ready for it.
"Are there ways we can simplify the integration of legacy biopolymer force fields?" Let's just provide some good examples to start with.
" How will we ensure the SMIRNOFF force field is correctly implemented by molecular simulation packages where nonbonded treatments are encoded by auxiliary input files?" I don't know, let's make a separate issue for that and deal with it separately.
"How should we use multiple conformations in charging?" We should be doing ELF at first I think; this can warrant separate science. Create an issue?
"Should we support RESP charging?" For now, via user charges, otherwise this makes us responsible for RESP at present and Lee-Ping's code is not ready for this.
"How should we fragment larger small molecules and polymers for charging?" Separate issue, deal with later. We and OpenEye have some leads on this but it will take a person working on it for a while.

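The required/optional/unknown attribute policy floated above under "Handling extra XML tag attributes" could look something like the following minimal sketch. The attribute sets, the `check_attributes` helper, and the `strict` switch are all hypothetical and are not part of the toolkit; only Python's standard `xml.etree.ElementTree` and `warnings` modules are used.

```python
import warnings
import xml.etree.ElementTree as ET

# Hypothetical attribute sets for a <Bond> parameter; not taken from the spec.
REQUIRED = {"smirks", "length", "k"}
OPTIONAL = {"id", "parent_id"}


def check_attributes(element, required=REQUIRED, optional=OPTIONAL, strict=True):
    """Return unknown attribute names; raise (strict) or warn (non-strict) if any exist."""
    present = set(element.attrib)
    missing = required - present
    if missing:
        raise ValueError(f"<{element.tag}> is missing required attributes: {sorted(missing)}")
    unknown = present - required - optional
    if unknown:
        message = f"<{element.tag}> has unrecognized attributes: {sorted(unknown)}"
        if strict:
            raise ValueError(message)
        warnings.warn(message)
    return sorted(unknown)


ok = ET.fromstring('<Bond smirks="[#6:1]-[#1:2]" length="1.09" k="680.0" id="b1"/>')
bad = ET.fromstring('<Bond smirks="[#6:1]-[#1:2]" length="1.09" k="680.0" colour="red"/>')

check_attributes(ok)                 # passes: id is an allowed optional attribute
check_attributes(bad, strict=False)  # warns about colour instead of raising
```

Whether the default should be to raise or to warn is exactly the open choice in the answer above; the `strict` flag just makes that choice explicit.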

j-wags · 2018-10-18T01:54:32Z

It looks like, using OE tools, I can calculate Wiberg bond orders using https://docs.eyesopen.com/toolkits/python/quacpactk/bondordertheory.html . Is there an equivalent method we have in mind using the RDKit/AmberTools toolkits?

hjuinj · 2018-10-18T13:28:49Z

@j-wags from what I looked into in the past, I don't think there is the equivalent in RDKit.

davidlmobley · 2018-10-18T18:05:56Z

@j-wags Right, there is nothing in AmberTools for this yet. You may have noticed this was also something which came up in the Torsions call and we reminded @dgasmith about the need for it.

For now the RDKit implementation would presumably have to raise an exception if it is asked to do something with bond orders.

(We don't yet have a force field which requires bond orders, so these are an optional -- but important for the future -- feature.)

j-wags · 2019-04-05T22:58:10Z

Commit 246101b moves remaining spec questions from The-SMIRNOFF-force-field-format.md to this issue.

Should we have a separate `<Metadata>` or `<Provenance>` section that users can add whatever they want to?

This would minimize the potential for accidentally colliding with other tags we add in the future.

Do we want to allow users to specify atomic or particle masses? This could allow, for example, heavy hydrogens to be specified easily.

<!-- Use average atomic masses (averaged over isotopes) except for heavy hydrogens -->
<Masses default="average-atomic-mass" mass_unit="amu">
   <!-- Make hydrogens heavy -->
   <Mass smirks="[#1:1]" mass="3"/>
</Masses>

While this won't affect thermodynamic properties, it could affect kinetic properties, which may be something our force fields optimize for in the future.

Should individual force classes be versioned, rather than having a global SMIRNOFF version?

Should we expand fractional bond orders beyond `<Bonds>`?

Should we have an XML Schema?

An XML Schema would make it easier to validate XML representations of SMIRNOFF to make sure they are compliant and detect errors.

Should we integrate ParmEd and InterMol functionality by adding `create()` methods for other simulation packages?

We could integrate ParmEd and InterMol as dependencies in our toolkit and add methods like ForceField.create_amber_system() or ForceField.create_charmm_system() that generate input files for other packages without the need to convert.
While perhaps not immediately useful for combining biopolymers parameterized with traditional force fields with SMIRNOFF-parameterized small molecules, once these legacy force fields are available in SMIRNOFF format or we have new biopolymer SMIRNOFF force fields, this drastically simplifies workflows.

Add link to complete open specification of `OEAroModel_MDL` aromaticity model

Are there ways we can simplify the integration of legacy biopolymer force fields?

Are there ways we can make it easy to integrate pre-parameterized systems describing part of the topology (e.g. protein)?

How will we ensure the SMIRNOFF force field is correctly implemented by molecular simulation packages where nonbonded treatments are encoded by auxiliary input files?

For GROMACS, AMBER, and CHARMM, the nonbonded treatments (which are integrally specified by a SMIRNOFF force field) are instead specified by an auxiliary input file.
Should we generate this auxiliary input file too, or part of it?
How can we insist that the desired settings be used?

We should include some missing references

Supported aromaticity models
Supported fractional bond order models
Quantum chemical methods (e.g. AM1) and charge partitioning schemes (e.g. CM2)
AM1-BCC

Should we support RESP charging?

How should we use multiple conformations in charging?

Should we follow the RESP approach, where the charges are averaged?
Or the ELF approach, where we use some kind of energy function to select which single conformer is used for the quantum chemical calculations?
What scheme should we use to generate the conformers in a deterministic manner?
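To make the two options concrete, here is a minimal sketch contrasting averaging charges over all conformers with picking a single conformer by a scoring function. The charges and scores are made-up numbers, the function names are hypothetical, and the score is only a placeholder for whatever energy function would actually be chosen.

```python
from typing import List, Sequence


def average_charges(per_conformer: Sequence[Sequence[float]]) -> List[float]:
    """Average each atom's charge over all conformers (the averaging option)."""
    n_conf = len(per_conformer)
    return [sum(atom) / n_conf for atom in zip(*per_conformer)]


def select_conformer(scores: Sequence[float]) -> int:
    """Return the index of the lowest-scoring conformer (the selection option)."""
    # min() returns the first minimum, so ties are broken deterministically.
    return min(range(len(scores)), key=lambda i: scores[i])


# Made-up partial charges for a 3-atom molecule in 2 conformers.
charges = [[-0.5, 0.25, 0.25], [-0.75, 0.5, 0.25]]
# Made-up scores from a placeholder energy function, one per conformer.
scores = [4.0, 1.5]

averaged = average_charges(charges)
chosen = charges[select_conformer(scores)]
```

Either way, the result is only reproducible if the conformer list itself is generated deterministically, which is the third question above.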

How should we fragment larger small molecules and polymers for charging?

Larger small molecules may not benefit from quantum chemical calculations on the whole molecule.
Instead, it might be more robust (and faster) to break the molecule into smaller capped fragments, compute charges separately for these fragments, and combine the charges according to some algorithm.
This approach would also work for polymers, allowing us to parameterize arbitrary polymeric residues and covalent modifications of them.
What is the best way to do this?

j-wags · 2019-04-06T21:08:48Z

Per openforcefield/openff-toolkit#233 (review)

Should the SMIRNOFF spec live in its own repo, instead of being bundled with the toolkit documentation?

davidlmobley · 2019-04-07T03:14:18Z

Hmm, the spec in its own repo? That's an interesting idea. Pros and cons?

jchodera · 2019-04-07T05:27:18Z

Examples:

The QC JSON spec
autoprotocol

j-wags · 2019-06-07T14:47:36Z

Copying relevant text from openforcefield/openff-toolkit#341 (comment)

Here are some unresolved questions we may need to solve

What is the relationship between a parameter section and the SMIRNOFF tag?

I actually don't know, so I'm going to "think out loud" here.

The SMIRNOFF tag currently encodes the aromaticity model. Could we achieve the same behavior by having each parameter section have an aromaticity model? Yes, we could.

So maybe, the idea of an enclosing tag in the SMIRNOFF spec indicates "a value that would otherwise need to be set repeatedly in the enclosed tags". In that sense, the top-level aromaticity_model attribute is implicitly a value that affects, for example, the BondHandler's behavior.

As a thought experiment, I could make a new ParameterHandler that applies parameters based on a different aromaticity model, and have it override the higher-level aromaticity model. This is a weird example, but I couldn't come up with a better one -- The point is that it would seem to be arbitrary to tell a developer they can't do that, just because we say so.
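A minimal sketch of that "enclosing tag supplies a default, enclosed tag may override it" reading, using only the standard library. The per-section `aromaticity_model` attribute and the `SomeOtherModel` value are hypothetical; the current spec only sets the aromaticity model on the top-level tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <Angles> overrides the top-level aromaticity model.
XML = """
<SMIRNOFF aromaticity_model="OEAroModel_MDL">
  <Bonds/>
  <Angles aromaticity_model="SomeOtherModel"/>
</SMIRNOFF>
"""


def effective_attribute(root, section, name):
    """Use the section's own value if it sets one, else fall back to the enclosing tag."""
    return section.attrib.get(name, root.attrib.get(name))


root = ET.fromstring(XML)
models = {s.tag: effective_attribute(root, s, "aromaticity_model") for s in root}
```

Under this reading, `Bonds` inherits the top-level model and `Angles` uses its own.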

Note that the above "sections should be enclosed in other sections if the enclosing section affects the behavior of the enclosee" logic would have a consequence for @andrrizzi's question in openforcefield/openff-toolkit#310 (comment)

First thing that comes to mind is that we might want to have the charge model tags being children tags of Electrostatics, which will essentially have 3 levels of indentation instead of 2.

It implies that the sections that describe atomic charges should be enclosed by the Electrostatics tag (since attributes like the cutoff influence the functional form). It also opens the question of "what if a person wanted both the Electrostatics and GBSA tags? What should enclose the charge-determining sections?". I don't know what we'd do in that situation.

And, in a similar vein, this will make things like VirtualSites more difficult, since they currently are in their own section, but will experience electrostatics and vdW forces. Should they be doubly-nested in Electrostatics and vdW tags? Or should the VirtualSite definitions be duplicated in the enclosing Electrostatics and vdW sections?

So maybe this isn't the right view of section enclosure.

Maybe the SMIRNOFF version encodes something about the underlying section structure

For example, in the 0.2 -> 0.3 transition, we switch from units-in-the-header to units-in-the-quantity. Well, I'd argue that even that change could have been handled at the parameter section level. So one could argue that changing to the "SMIRNOFF 0.3 spec" doesn't actually mean anything -- it's the parameter sections that need to change their behavior! But, I guess the SMIRNOFF spec change here records a "rule change", because the SMIRNOFF 0.3 spec allows ParameterHandlers with their own versions.
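As an illustration of why the units change could be handled at the section level, here is a minimal sketch that rewrites one section from units-in-the-header to units-in-the-quantity. The `<name>_unit` attribute convention, the `value * unit` string form, and the `units_to_quantities` helper are assumptions made for this example, not the toolkit's actual migration code.

```python
import xml.etree.ElementTree as ET


def units_to_quantities(section):
    """Move `<name>_unit` header attributes onto matching child attributes, in place."""
    suffix = "_unit"
    units = {
        key[: -len(suffix)]: value
        for key, value in section.attrib.items()
        if key.endswith(suffix)
    }
    # Drop the header-level unit attributes.
    for name in units:
        del section.attrib[name + suffix]
    # Attach the unit to every child attribute with the same base name.
    for child in section:
        for name, unit in units.items():
            if name in child.attrib:
                child.set(name, f"{child.get(name)} * {unit}")
    return section


old = ET.fromstring(
    '<Bonds length_unit="angstrom">'
    '<Bond smirks="[#6:1]-[#1:2]" length="1.09"/>'
    "</Bonds>"
)
new = units_to_quantities(old)
```

Nothing in this function needs to know about any other section, which is the sense in which the change is local to a parameter section.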

Is a parameter section's version truly just the `version=Y` attribute in its defining tag? Or is it really a tuple of `(SMIRNOFF_version=X, section_version=Y)`?

I think it has to be the tuple, because if a parameter section only works with SMIRKS1, but the SMIRNOFF tag sets SMIRKS2... then the parameter section needs to know about this higher-level setting that didn't even exist when it was written. So I guess each parameter section will need to have a MAX_SMIRNOFF_VERSION tag. And probably a MIN_SMIRNOFF_VERSION too.
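A minimal sketch of the tuple idea, assuming two-part `major.minor` version strings. The `BondSection` class, its `SECTION_VERSIONS` set, and the `is_supported` helper are hypothetical names; only `MIN_SMIRNOFF_VERSION` and `MAX_SMIRNOFF_VERSION` come from the paragraph above.

```python
from typing import Tuple

Version = Tuple[int, int]


def parse_version(text: str) -> Version:
    """Parse 'major.minor' into integers so that 0.10 sorts after 0.9."""
    major, minor = text.split(".")
    return int(major), int(minor)


class BondSection:
    """Hypothetical parameter section declaring the SMIRNOFF versions it supports."""

    MIN_SMIRNOFF_VERSION = parse_version("0.3")
    MAX_SMIRNOFF_VERSION = parse_version("0.3")
    SECTION_VERSIONS = {parse_version("0.3")}


def is_supported(section_cls, smirnoff_version: str, section_version: str) -> bool:
    """Check the (SMIRNOFF_version, section_version) tuple against the declared bounds."""
    top = parse_version(smirnoff_version)
    own = parse_version(section_version)
    in_range = section_cls.MIN_SMIRNOFF_VERSION <= top <= section_cls.MAX_SMIRNOFF_VERSION
    return in_range and own in section_cls.SECTION_VERSIONS
```

For example, `is_supported(BondSection, "0.4", "0.3")` is false here even though the section version matches, because the enclosing SMIRNOFF version is newer than anything the section was written against.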

mattwthompson · 2023-01-27T15:34:20Z

This is no longer strictly in the domain of the toolkit so I'm transferring it to the standards tracker. I don't know what is currently actionable, but if anything is actionable I recommend triaging this into smaller discrete action items in their own issues or tickets.

davidlmobley assigned j-wags Oct 16, 2018

j-wags referenced this issue in openforcefield/openff-toolkit Apr 5, 2019

Moved "Open Questions" from the SMIRNOFF spec page and put them in #120

246101b

j-wags referenced this issue in openforcefield/openff-toolkit Apr 6, 2019

Moved "Open Questions" from the SMIRNOFF spec page and put them in #120

f998e2c

j-wags referenced this issue in openforcefield/openff-toolkit Apr 6, 2019

Moved "Open Questions" from the SMIRNOFF spec page and put them in #120

4d26d0c

j-wags mentioned this issue Apr 6, 2019

0.2 docs update openforcefield/openff-toolkit#233

Merged

j-wags referenced this issue in openforcefield/openff-toolkit Apr 6, 2019

Moved "Open Questions" from the SMIRNOFF spec page and put them in #120

b50935b

j-wags referenced this issue in openforcefield/openff-toolkit Apr 6, 2019

Moved "Open Questions" from the SMIRNOFF spec page and put them in #120

05bc287

j-wags mentioned this issue Jun 7, 2019

0.4.0 releasenotes openforcefield/openff-toolkit#341

Merged

1 task

mattwthompson transferred this issue from openforcefield/openff-toolkit Jan 27, 2023
