Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document charge assignment hierarchy #68

Open
mattwthompson opened this issue Sep 13, 2024 · 5 comments
Open

Document charge assignment hierarchy #68

mattwthompson opened this issue Sep 13, 2024 · 5 comments
Labels

Comments

@mattwthompson
Copy link
Member

mattwthompson commented Sep 13, 2024

The OpenFF Toolkit/Interchange has/had long-standing implementations of hierarchical charge assignment, in particular that charges are assigned from methods in the following order

  1. Look for molecule matches in the charge_from_molecules kwarg
  2. Look for chemical environment matches in library charges
  3. Look for chemical environment matches in charge increments
  4. Try to run some variant of AM1-BCC (presumably this is were graph charges go)

This is in the spec ... but also not presented in a helpful way. I found (detailed below the fold) hints at this hierarchy in each section, but it'd be really convenient to users to have this hierarchy spelled out. I happen to know this behavior because of my familiarity with the implementation, but I'd be really confused if I was a new user trying to figure out which of the many possible charge assignment methods was actually happening. (Or see another failure case below.)

I think, maybe under this section header, a short mention of the above list would be helpful to users. Most of this information is buried in the other sections but not presented in a way that makes it explicit.

As a somewhat problematic example, imagine a user has a polymer of some sort in which AM1 is not practical, there aren't pre-specific charges for charge_from_molecules, there also isn't a charge increment model, and they made a typo in a large SMIRKS pattern describing the library charges of a residue. A scientist is probably only expecting pre-specific or library charges to be used on macromolecules, so our software hanging (the result of firing up AM1) is certainly useless and arguably surprising. This could be worked around, and the toolkit tries (see below) to warn users if AM1-BCC is called on large molecules, but the result in this scenario is confusion as to why AM1-BCC is even in the picture.

>>> from openff.toolkit import Molecule
>>> Molecule.from_smiles(200 * "C").assign_partial_charges(partial_charge_method="am1bcc")
<stdin>:1: UserWarning: Warning! Partial charge method 'am1bcc' is not designed for use on large (i.e. > 150 atoms) molecules and may crash or take hours to run on this molecule (found 602 atoms). For more, see https://docs.openforcefield.org/projects/toolkit/en/stable/faq.html#parameterizing-my-system-which-contains-a-large-molecule-is-taking-forever-whats-wrong

Library charge section

Library charges are applied first, and atoms for which library charges are applied will be excluded from alternative charging schemes listed below.

Note that this isn't true, since charge_from_molecules takes precedence over library charges.

Charge increments section

Note that atoms for which library charges have already been applied are excluded from charging via <ChargeIncrementModel>.

AM1-BCC section

Note that atoms for which prespecified or library charges have already been applied are excluded from charging via <ToolkitAM1BCC>.

“Prespecified charges” section doesn’t even communicate “if you set charges this way, all other methods are skipped” and I really think it should


@mattwthompson
Copy link
Member Author

Working on this has surfaced #69

@j-wags
Copy link
Member

j-wags commented Oct 1, 2024

I agree with everything @mattwthompson said originally and would approve a PR to clarify this in the spec early in the Partial charge and electrostatics models section. I don't think any section versions would need to be updated for this change as it is a clarification of existing behavior.

@davidlmobley
Copy link

Fully agree with this as well, @mattwthompson . Nice catch, agree it needs to be much clearer. Same perspective as Jeff.

@lilyminium
Copy link
Contributor

Definitely agree as well -- this would be a huge improvement to the standard! Would it be helpful to document the isomorphism issue here too?

@mattwthompson
Copy link
Member Author

I took some bits of my write-up and turned them into Interchange docs which I think covers in information content what we seem to agree is missing here. I probably won't have time to do this for a few weeks, but this could be used as a starting point for what we want added into the spec here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants