Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packmol interface: can't pack mnsol benchmark inputs with assumed evaluator defaults #1062

Open
IAlibay opened this issue Sep 25, 2024 · 13 comments

Comments

@IAlibay
Copy link

IAlibay commented Sep 25, 2024

Description

I'm trying to test if we can pack mnsol (from the 2.0 sage benchmark) with Interchange.
My rough understanding (from the paper and some limited looking at the evaluator v3.11 code) is that it was possible to pack all the systems with 2k molecules, a cubic box and a 0.95 * unit.gram/unit.milliliter target density.

However, I can't seem to be able to make this work using Interchange.

The main questions here are:

  1. Is this a correct assumption for how the sage MNSOL benchmarks were run, or is there more to it that I'm missing?
  2. Should we be able to reproduce this with Interchange?
  3. If so, how?

Reproduction

Here's a good example case:

from openff.units import unit
from openff.units.openmm import from_openmm
from openmm import NonbondedForce
from openff.toolkit import Molecule, Topology, ForceField
from openff.interchange.components._packmol import pack_box, UNIT_CUBE
from openff.interchange import Interchange
import pytest

solvent = Molecule.from_smiles('CCCCO')
solvent.generate_conformers()
solvent.assign_partial_charges(partial_charge_method='am1bcc')

ligand = Molecule.from_smiles('CCCCCCCC')
ligand.generate_conformers()
ligand.assign_partial_charges(partial_charge_method='am1bcc')

off_top = Topology.from_molecules(ligand)
solvated_off_top = pack_box(
    molecules=[solvent],
    number_of_copies=[2000],
    solute=off_top,
    tolerance=2.0*unit.angstrom,
    mass_density=0.95 * unit.gram/unit.milliliter,
    box_shape=UNIT_CUBE,
    center_solute=True,
    working_directory='.',
    retain_working_files=False,
)

Output

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
File ~/software/mambaforge/install/envs/openfe/lib/python3.12/site-packages/openff/interchange/components/_packmol.py:718, in pack_box(molecules, number_of_copies, solute, tolerance, box_vectors, mass_density, box_shape, center_solute, working_directory, retain_working_files)
    717 try:
--> 718     result = subprocess.check_output(
    719         packmol_path,
    720         stdin=file_handle,
    721         stderr=subprocess.STDOUT,
    722     ).decode("utf-8")
    723 except subprocess.CalledProcessError as error:

File ~/software/mambaforge/install/envs/openfe/lib/python3.12/subprocess.py:466, in check_output(timeout, *popenargs, **kwargs)
    464     kwargs['input'] = empty
--> 466 return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
    467            **kwargs).stdout

File ~/software/mambaforge/install/envs/openfe/lib/python3.12/subprocess.py:571, in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    570     if check and retcode:
--> 571         raise CalledProcessError(retcode, process.args,
    572                                  output=stdout, stderr=stderr)
    573 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '/home/ialibay/software/mambaforge/install/envs/openfe/bin/packmol' returned non-zero exit status 173.

The above exception was the direct cause of the following exception:

PACKMOLRuntimeError                       Traceback (most recent call last)
Cell In[8], line 18
     15 ligand.assign_partial_charges(partial_charge_method='am1bcc')
     17 off_top = Topology.from_molecules(ligand)
---> 18 solvated_off_top = pack_box(
     19     molecules=[solvent],
     20     number_of_copies=[2000],
     21     solute=off_top,
     22     tolerance=2.0*unit.angstrom,
     23     mass_density=0.95 * unit.gram/unit.milliliter,
     24     box_shape=UNIT_CUBE,
     25     center_solute=True,
     26     working_directory='.',
     27     retain_working_files=False,
     28 )

File ~/software/mambaforge/install/envs/openfe/lib/python3.12/site-packages/openff/utilities/utilities.py:80, in requires_package.<locals>.inner_decorator.<locals>.wrapper(*args, **kwargs)
     77 except Exception as e:
     78     raise e
---> 80 return function(*args, **kwargs)

File ~/software/mambaforge/install/envs/openfe/lib/python3.12/site-packages/openff/interchange/components/_packmol.py:725, in pack_box(molecules, number_of_copies, solute, tolerance, box_vectors, mass_density, box_shape, center_solute, working_directory, retain_working_files)
    723     except subprocess.CalledProcessError as error:
    724         if "173" in str(error):
--> 725             raise PACKMOLRuntimeError from error
    727     packmol_succeeded = result.find("Success!") > 0
    729 if not packmol_succeeded:

PACKMOLRuntimeError: 

Software versions

Interchange v0.3.29

@IAlibay
Copy link
Author

IAlibay commented Sep 25, 2024

I'll have a full list of solvent/solute combos a bit later, my tests are running super super slow, so erm.. maybe in a day or so.

@IAlibay
Copy link
Author

IAlibay commented Sep 26, 2024

Part of the issue here might just be that the target density is too high - n-butanol has a density of 0.8 g/ml!

Reducing the density does fix at least this case. Were people manually setting the density for each solvent type with evaluator?

@mattwthompson
Copy link
Member

Unfortunately I don't know past what's in the paper. My experience with Packmol in other projects had me often packing at ~20% less density and running equilibration for longer. Packing has been such a headache for me at ~5% under the target density that I'd avoid packing above the target density whenever possible

@IAlibay
Copy link
Author

IAlibay commented Sep 26, 2024

Unfortunately I don't know past what's in the paper. My experience with Packmol in other projects had me often packing at ~20% less density and running equilibration for longer. Packing has been such a headache for me at ~5% under the target density that I'd avoid packing above the target density whenever possible

Thanks, it would be great to hear from the folks that do use evaluator for non-water mixtures how they do things (maybe they just set the value to 0.8 too).

Do you think there's scope for changing the default value @mattwthompson ? I.e. setting solvate_topology_nonwater to 0.8 or similar and just letting folks know they need to do an NPT equilibration? I think that's an approach used by other folks - e.g. tleap in AMBER.

@IAlibay IAlibay changed the title Packmol interface: can't pack mnsol benchmark inputs with assumed evaluator defauluts Packmol interface: can't pack mnsol benchmark inputs with assumed evaluator defaults Sep 26, 2024
@mattwthompson
Copy link
Member

Short answer: yeah, no thought went into that default value and something more like 0.8 g/cc or lower is easy to defend.

Longer answer: will sync up with scientists for more feedback here - stay tuned & thanks for your patience!

@mattwthompson
Copy link
Member

I don't have a diagnosis of why the behavior is different, but I have consensus on lowering the default value. For a study you might benefit from using a different density for each system, but I'll probably just set it to a default value around 0.7-0.8 g/cc. Do any of your systems fail to pack at that density, or possibly for funkier reasons @IAlibay?

@IAlibay
Copy link
Author

IAlibay commented Sep 30, 2024

I don't have a diagnosis of why the behavior is different, but I have consensus on lowering the default value. For a study you might benefit from using a different density for each system, but I'll probably just set it to a default value around 0.7-0.8 g/cc. Do any of your systems fail to pack at that density, or possibly for funkier reasons @IAlibay?

Thanks, I'll have to check a bit later this week (for this current set). That being said, I am still waiting on a set of necessary solvents from the OpenFF science team, until I have those my development efforts are stalled. Will get back as soon as I have the full details.

@mattwthompson mattwthompson mentioned this issue Oct 2, 2024
4 tasks
@mattwthompson
Copy link
Member

I'm going to be updating the default value of this argument to somewhere between 0.6 and 0.8 g/cc, which I expect to handle most issues you've been running into. When you get back to this (on your own timeline - I'm not a stakeholder here) could you update your scripts to use a lower target mass density and see if that fixes your issues? This behavior won't land until 0.4.0 or 0.4.1, so I have to ask you to manually update things until then.

@IAlibay
Copy link
Author

IAlibay commented Oct 2, 2024

which I expect to handle most issues you've been running into.

Is there any chance solvent packing tests can be implemented at the Interchange level? It feels like doing this downstream is incorrect.

@mattwthompson
Copy link
Member

Probably - I think the unit tests are relatively strong but regression/application tests for packing something at the scale of a realistic simulation. I think I can distill your issues down to packing not converging at 0.95 g/cc?

@IAlibay
Copy link
Author

IAlibay commented Oct 2, 2024

Probably - I think the unit tests are relatively strong but regression/application tests for packing something at the scale of a realistic simulation. I think I can distill your issues down to packing not converging at 0.95 g/cc?

i think it's more "it doesn't match evaluator" (which may not be a bug!)

@IAlibay
Copy link
Author

IAlibay commented Oct 2, 2024

Maybe to add to this - it looks like somewhere we're going to have to add a test with a list of solvents that we expect to be able to pack properly in a way that doesn't crash (for the use of the OpenFF science folks needing to run those systems for benchmarks). I can do that downstream but, especially given the context, it feels like this should be an interchange test not a downstream one.

@mattwthompson
Copy link
Member

I think that's fair - perhaps we can spin up some extra resources to run a larger set (tens?) of solvents at realistic sizes (~5 nm cubes) every once in a while, like a weekly cron job or every time something is committed to the main branch.

For now I have only partial solutions (#1067 and #1068 should get us most of the way) with hopes of unblocking that work whenever you end up getting back to it. (No time pressure from me personally on this!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants