Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial charge assignment using charge_from_molecules on virtual sites / charge increments can be unexpected. #1050

Open
IAlibay opened this issue Sep 16, 2024 · 9 comments

Comments

@IAlibay
Copy link

IAlibay commented Sep 16, 2024

Description

Related to #1048

When building a system with OPC water, if a user passed along a pre-charged water, they might end up with an unexpected set of partial charges in the resultant 4 point molecule due to charge increments.

The behaviour isn't incorrect according to the spec, but you then also have to expect users to a) know the charge assignment order in the spec, b) remember that there are charge increments affecting the molecule they passed charges for, c) do the charge increment math ahead of time to make sure they passed along the right charges.

I'm not 100% sure, but I think the ask would be something along the lines of:

  1. Some kind of tooling to inspect the partial charges you are likely to get after all the increments, etc.. are applied for each molecule (rather than having to manually inspect the OpenMM system to see what happened). Is there a better way to assign partial charges for virtual site containing molecules that I'm missing?
  2. Some kind of warning that further processing of the partial charges will happen.
  3. Documentation that a) yes charge_from_molecules will ignore library charges, but b) charge increments will be applied (this is not clear from the existing docs).

Reproduction

from openff.toolkit import Molecule, Topology, ForceField
from openff.interchange.components._packmol import solvate_topology
from openff.interchange import Interchange
from openmm import NonbondedForce

water = Molecule.from_smiles('O')
water.assign_partial_charges(partial_charge_method='gasteiger')
print(water.partial_charges)

ligand = Molecule.from_smiles('CCCCC')
ligand.generate_conformers()

off_top = Topology.from_molecules(ligand)
solvated_off_top = solvate_topology(topology=off_top)

ff = ForceField('openff-2.2.0.offxml', 'opc.offxml')
inter = Interchange.from_smirnoff(topology=solvated_off_top, force_field=ff, charge_from_molecules=[water])
omm_system = inter.to_openmm_system()

nonbond = [f for f in system.getForces() if isinstance(f, NonbondedForce)][0]

# Get the parameters for the first water oxygen
nonbond.getParticleParameters(17)

Output

partial charges

[-0.4115095219374226 0.2057547609687113 0.2057547609687113] elementary_charge

particle parameters for oxygen (in OPC this is a zero charged atom since all the charge is on the virtual site)

[Quantity(value=-0.4115095219374226, unit=elementary charge),
 Quantity(value=0.31506999999999996, unit=nanometer),
 Quantity(value=0.6363864000000001, unit=kilojoule/mole)]
@mattwthompson
Copy link
Member

Pre-charged waters sounds like a strange use case; I don't have grounds to forbid it, but it'll really stress what's documented and easy to work with. Just documenting the quirks is a last resort, I'd like things to work more smoothly and in line with SMIRNOFF and user expectations if possible.

Before getting to that, though, is your reproduction missing a charge_from_molecules somewhere? The charges on the Molecule object shouldn't be processed by default. Without a ligand I'm getting

from openff.toolkit import Molecule, ForceField
from openmm import NonbondedForce

water = Molecule.from_smiles("O")
water.assign_partial_charges(partial_charge_method="gasteiger")
print(
    f"Partial charges on molecule: {[round(charge, 3) for charge in water.partial_charges]}"
)
# Partial charges on molecule: [<Quantity(-0.41, 'elementary_charge')>, <Quantity(0.205, 'elementary_charge')>, <Quantity(0.205, 'elementary_charge')>]

ff = ForceField("openff-2.2.0.offxml", "opc.offxml")
system = ff.create_interchange(water.to_topology()).to_openmm_system()

nonbond = [f for f in system.getForces() if isinstance(f, NonbondedForce)][0]

# Get the parameters for the first water oxygen
print(
    f"Partial charge in system: {round(nonbond.getParticleParameters(0)[0]._value, 3)}"
)
# Partial charge in system: 0.0

@IAlibay
Copy link
Author

IAlibay commented Sep 16, 2024

Before getting to that, though, is your reproduction missing a charge_from_molecules somewhere? The charges on the Molecule object shouldn't be processed by default. Without a ligand I'm getting

Ah yeah sorry, I was playing around with and without and copied the wrong thing 😅 - I've edited the originall comment!

@IAlibay
Copy link
Author

IAlibay commented Sep 16, 2024

Pre-charged waters sounds like a strange use case

The use case I have here is that I'm writing this thing where users can optionally define their solvent molecules.. and well in some cases the solvent can be water 😅 (see: https://github.com/IAlibay/pontibus/blob/develop/src/pontibus/utils/system_creation.py). You could say this is "intentionally" pushing things, because the intended users are.. well you folks.

A question - is this really much different than how ligands with vsites will behave? I.e. would you expect users to more easily know how the offsite charge is being incremented?

@mattwthompson
Copy link
Member

Upon some reflection ... this is quite a mess and I'm not totally sure how to proceed. At best, Interchange handles this poorly. At worst, SMIRNOFF doesn't handle it well. I'm looking to get #1048 done before this, which may help!

Two things are referred to as "charge increments":

  • A proper <ChargeIncrementModelHandler>, which so far has mostly been used to apply custom BCCs on top of AM1, effectively tuning a new AM1-BCC variant (as if they're not all variants. This has a section in .offxml files, is a ParameterHandler in the toolkit, has analogous machinery here ... you get the idea, it's a concrete thing. This concrete thing also has a clear hierarchy in the implementation - after preset charges, after library charges, before (the fallback use of) AM1-BCC.
  • The transfer of charges from atoms to virtual sites as described in individual parameters in a VirtualSites section. There isn't an associated section in the force field file or our Python models for this. So far, these have either been on water (which uses library charges) or ligands (which get AM1-BCC) after which the "virtual site charge increments" should be applied.

Looping back to the original case (preset water charges passed along with a 4-site water model), I'm not totally sure what should happen here. The current implementation sees a match in charge_from_molecules and short-circuits, I guess? Bits of the spec are

(preset charges)

In our reference implementation of SMIRNOFF in the OpenFF Toolkit, we also provide a method for specifying user-defined partial charges during system creation. This functionality is accessed by using the charge_from_molecules optional argument during system creation, such as in ForceField.create_openmm_system(topology, charge_from_molecules=molecule_list). When this optional keyword is provided, all matching molecules will have their charges set by the entries in molecule_list

(virtual sites)

Each virtual site receives charge which is transferred from the desired atoms specified in the SMIRKS pattern via a charge_increment# parameter, e.g., if charge_increment1=0.1*elementary_charge then the virtual site will receive a charge of -0.1 and the atom labeled 1 will have its charge adjusted upwards by 0.1. N may index any indexed atom. Additionally, each virtual site can bear Lennard-Jones parameters, specified by sigma and epsilon or rmin_half and epsilon. If unspecified these default to zero.

Snipping out just the relevant bits of opc.offxml:

    <LibraryCharges version="0.3">
        <LibraryCharge smirks="[#1]-[#8X2H2+0:1]-[#1]" charge1="0.0 * elementary_charge ** 1" id="q-opc-O"></LibraryCharge>
        <LibraryCharge smirks="[#1:1]-[#8X2H2+0]-[#1]" charge1="0.0 * elementary_charge ** 1" id="q-opc-H"></LibraryCharge>
    </LibraryCharges>
    <VirtualSites version="0.3" exclusion_policy="parents">
        <VirtualSite smirks="[#1:2]-[#8X2H2+0:1]-[#1:3]" epsilon="0.0 * kilocalorie_per_mole ** 1" type="DivalentLonePair" match="once" distance="-0.15939833 * angstrom ** 1" outOfPlaneAngle="0.0 * degree ** 1" inPlaneAngle="None" charge_increment1="0.0 * elementary_charge ** 1" charge_increment2="0.679142 * elementary_charge ** 1" charge_increment3="0.679142 * elementary_charge ** 1" rmin_half="1.0 * angstrom ** 1" name="EP"></VirtualSite>
    </VirtualSites>

I can think of only two possible expectations here:

  1. charge_from_molecules wins and stop looking. So the charges should be (O H H VS) with poor rounding -0.4, +0.2, +0.2, and 0.0.
  2. charge_from_molecules only specifies the atomic partial charges, and virtual site charge increments still apply. So the resulting charges would be (O H H VS) again with poor rounding -0.4, +0.9, +0.9, -1.3

(1) seems bad but is also confusing since the interaction between preset charges and virtual sites seems ambiguous/under-defined. I think you're expecting (2) here?

@IAlibay
Copy link
Author

IAlibay commented Sep 18, 2024

I think you're expecting (2) here?

From my likely poor understanding of the spec, yes. At least as a user I would expect the same behaviour you would expect with ligands.

As far as I'm aware this is the behaviour that you get in practice, it just isn't immediately clear as a user that this is happening (or even that you have virtual sites for that unique molecule).

@mattwthompson
Copy link
Member

Unfortunately, I'm in a position where I think this is underspecified and I need updates to the SMIRNOFF spec to proceed: openforcefield/standards#69

I originally hoped this was limited to your (IMO) esoteric case of passing water through to charge_from_molecules, but I now realize this applies the same to ligands' preset charges interacting with virtual sites. (It's just that, right now, you're probably only adding virtual sites to the water.)

cc: @lilyminium @j-wags

@lilyminium
Copy link
Contributor

I commented on the SMIRNOFF spec with my opinion that I would also expect 2) in this scenario and that I think that's the best option.

Some kind of tooling to inspect the partial charges you are likely to get after all the increments, etc.. are applied for each molecule (rather than having to manually inspect the OpenMM system to see what happened).

Does the .charges property on the SMIRNOFFElectrostaticsCollection work for you? (possibly complicated by #1052 at the moment).

Something like:

>>> from openff.toolkit import ForceField, Molecule, Topology
>>> water = Molecule.from_smiles('O')
>>> water.assign_partial_charges(partial_charge_method='gasteiger')
>>>
>>> ligand = Molecule.from_smiles('CCCCC')
>>> ligand.generate_conformers()
>>>
>>> off_top = Topology.from_molecules(ligand, water)
>>> ff = ForceField('openff-2.2.0.offxml', 'opc.offxml')
>>> inter = ff.create_interchange(off_top, charge_from_molecules=[water])
>>> inter.collections["Electrostatics"].charges

@IAlibay
Copy link
Author

IAlibay commented Oct 1, 2024

Does the .charges property on the SMIRNOFFElectrostaticsCollection work for you? (possibly complicated by #1052 at the moment).

Sure, it's sorta what I've been doing-ish, but from the docs it's really unclear that this is how you're meant to do things. I would suggest doing some kind of cookbook or userguide example maybe? (unless I'm being oblivious and missing one).

@mattwthompson
Copy link
Member

Modifying charges after the fact is something that should be straightforward but probably only I am familiar with the sharp edges and we haven't really documented it well. I think it's a fair use case, but it's partially in tension with the design that a force field includes instructions on how to assign partial charges (and, by deduction, not whatever else is brought to the table).

Tracked so I don't forget: #1071

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants