Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visualize fixes #1751

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Visualize fixes #1751

wants to merge 14 commits into from

Conversation

Yoshanuikabundi
Copy link
Collaborator

@Yoshanuikabundi Yoshanuikabundi commented Oct 19, 2023

This finishes some stuff left over from the visualization updates - the overloads for Molecule.visualize() have been corrected, and the whole ensure_correct_connectivity thing in Topology.visualize() has been removed.

ensure_correct_connectivity was supposed to do two things:

  1. Ensure the bonds in the topology are the bonds in the visualization
  2. Get bond orders into the visualization

Over in Molecule.visualize() land, we currently attempt to talk to NGLView with MOL2 and then failover into PDB when RDKit spits the dummy. But RDKit's PDB writer supports writing multiple bonds in CONECT records... so double bonds show up. So I wrote a Topology PDB writer with RDKit... and it works. So that's (2) handled! Not sure if we like this, hate this, or really like it and want to use it for general topology PDB exports. This means visualizing multiple bonds works only with RDKit, and we fall back to OpenMM if RDKit is missing.

NGLView supplements CONECT records with bonds inferred from positions, so (1) isn't currently possible for Molecules or Topologies. I've updated the docstrings to tell users about this. This may be fixable soon: nglviewer/ngl#977. If that stalls, we might want to drop the PDB failover and switch to SDF. The catch of SDF is that protein secondary structure is absent, and so is atom metadata like atom names, residue names, etc. These all show up in PDB and MOL2.

So at present there's no way to ensure the correct connectivity, and in the future the most likely path to making it possible won't have a downside. Thus, I removed the option.

  • Update docstrings/documentation, if applicable
  • Lint codebase
  • Update changelog
  • Make sure # doctest: +SKIP doesn't render in API docs

@codecov
Copy link

codecov bot commented Oct 19, 2023

Codecov Report

Merging #1751 (a57a88d) into main (524a0e6) will decrease coverage by 0.35%.
The diff coverage is 69.44%.

Additional details and impacted files

@mattwthompson
Copy link
Member

In #1771 somebody reported that mucking around with metadata causes some issues - I haven't looked into how this PR interacts with it, but it'd be nice to know. (Note also that in that issue, the behavior was only observed WITHOUT OpenEye.)

@Yoshanuikabundi
Copy link
Collaborator Author

nglviewer/ngl#977 has been merged, so we should be able to get the infer bonds issues fixed soon (eg #1771). I've raised nglviewer/nglview#1092 to see what needs to be done for upstream support, and hopefully with the feedback from that I can put together some sort of interim private method hackery!

@Yoshanuikabundi
Copy link
Collaborator Author

#1771 seems to be fixed with this PR BTW, because we don't use PDB for Molecule.visualize any more. But the same bug still occurs with offmol.to_topology().visualize() because that uses PDB.

@Yoshanuikabundi
Copy link
Collaborator Author

This is ready for review! I think this is a major improvement on the status quo as is, and guaranteeing correct connectivity in visualizations should come in a later PR after I've had a chance to do the supporting work on NGLView upstream.

@mattwthompson
Copy link
Member

Sorry for dragging my feet on reviewing this. Unfortunately I think I need to be babied here; since it's hard for us to write unit tests for something that shows up in the browser, I need to manually tinker with the different combinations of inputs and states, verify that the results are expected, and do a rain dance to ensure that it all works the same on future users' machines.

If I understand the change correctly, the important umbrella test would be a protein-ligand complex in which the ligand has a double or triple bond(s)?

@j-wags
Copy link
Member

j-wags commented Dec 20, 2023

Ah, I'd planned to take this one, and would prefer to keep it on my plate unless you're currently mid-review.

@mattwthompson
Copy link
Member

Go for it

@mattwthompson mattwthompson removed their request for review December 20, 2023 18:43
@j-wags j-wags self-requested a review December 20, 2023 18:56
Copy link
Member

@j-wags j-wags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good. I haven't checked behavior yet but two (blocking) things that stand out to me are:

  1. The rdkit import in _viz.py (see comment)
  2. The failing CI seems to involve code added by this PR

and this one isn't blocking, but is there a way to query the widget object to see whether bond orders were set? It's fine if not, I'd just love for this to have an automated test :-)

self.topology.to_file(f, file_format=self.ext)
structure_string = f.getvalue()
if RDKIT_AVAILABLE:
from rdkit.Chem.rdmolfiles import PDBWriter # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(blocking) Direct cheminformatics toolkit imports outside of toolkit wrappers have caused all sorts of havoc before. Even with the RDKIT_AVAILABLE guard I'm worried about this. Can this be refactored to put the RDKit import and associated code in RDKitToolkitWrapper?

Comment on lines +2413 to +2414
conformations, include bonds not present in the topology. Rendering
multiple bonds additionally requires RDKit.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not blocking) Is this the correct interpretation, or am I misunderstanding the intent here?

Suggested change
conformations, include bonds not present in the topology. Rendering
multiple bonds additionally requires RDKit.
conformations, include bonds not present in the topology. Rendering
bond orders correctly additionally requires RDKit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants