
Allow appending extra values to embedding vector #289

Open · wants to merge 5 commits into main

Conversation

peastman
Collaborator

This feature lets you inject extra per-sample or per-atom values into the model. They are appended to the embedding vector for each atom, then a linear layer mixes them in and reduces the vector back down to its original size. In the config file, you specify one or more extra fields from the dataset that should be added to the embedding:

extra_embedding: partial_charges

or

extra_embedding:
  - partial_charges
  - total_charge

Each one can have a single value per sample (for a global scalar like total charge) or a value for each atom.
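As a sketch of how a per-sample value could be broadcast to per-atom, here is a minimal example using the batch index tensor common in PyG-style models. The variable names (`total_charge`, `batch`, `per_atom`) are illustrative and not taken from the PR:

```python
import torch

# Hypothetical broadcast of a per-sample scalar (e.g. total_charge) to per-atom.
total_charge = torch.tensor([0.0, -1.0])   # one value per molecule in the batch
batch = torch.tensor([0, 0, 0, 1, 1])      # atom -> molecule index mapping
per_atom = total_charge[batch]             # one value per atom, shape [num_atoms]
print(per_atom)  # tensor([ 0.,  0.,  0., -1., -1.])
```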

When calling the model for inference, you specify them with extra_args:

model(z, pos, extra_args={'partial_charges': partial_charges})

See #26 (comment) for discussion.

@RaulPPelaez
Collaborator

The design and implementation are great and backward compatible.
@guillemsimeon, could you comment on the architecture changes?
Peter made it so that the extra arguments are mixed with the embedding using a Linear layer:

if extra_embedding is not None:
    self.reshape_embedding = nn.Linear(hidden_channels + len(extra_embedding), hidden_channels, dtype=dtype)
else:
    self.reshape_embedding = None

if self.reshape_embedding is not None:
    Z = torch.cat((Z,) + tuple(t.unsqueeze(1) for t in extra_embedding_args), dim=1)
    Z = self.reshape_embedding(Z)
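A self-contained sketch of that mixing step, for reference. The dimensions and variable names here are illustrative, not taken from the PR:

```python
import torch
import torch.nn as nn

hidden_channels = 8
num_atoms = 5
num_extra = 2  # e.g. partial_charges and total_charge, both expanded per atom

# Embedding for each atom, plus one extra scalar per atom per extra field
Z = torch.randn(num_atoms, hidden_channels)
extra_embedding_args = [torch.randn(num_atoms) for _ in range(num_extra)]

# Append the extra values as columns, then mix back down to hidden_channels
reshape_embedding = nn.Linear(hidden_channels + num_extra, hidden_channels)
Z_cat = torch.cat((Z,) + tuple(t.unsqueeze(1) for t in extra_embedding_args), dim=1)
Z_mixed = reshape_embedding(Z_cat)

print(Z_cat.shape)    # torch.Size([5, 10])
print(Z_mixed.shape)  # torch.Size([5, 8])
```

If none of the extra fields are configured, the layer is skipped entirely, which is why the change is a no-op for existing models.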

@guillemsimeon
Collaborator

guillemsimeon commented Feb 22, 2024 via email

@peastman
Collaborator Author

This is finished and ready to merge as far as I'm concerned. I'm getting good results with it. Can someone review it?

@giadefa
Contributor

giadefa commented Feb 27, 2024 via email

@peastman
Collaborator Author

Why can't we merge it? What are you doing that conflicts with it?

If this isn't going to be run as an open source project, I can't use it. I need a stable, reliable base to build my research on. That means development is done in public, and I can add features that are essential to my research. If you're going to reject changes without explanation because they conflict with things you're doing in secret and can't tell me about, then I need to end my involvement in this project.

@giadefa
Contributor

giadefa commented Feb 27, 2024 via email

@peastman
Collaborator Author

As I said before, it is not a change to the existing model. It's a completely new feature. If you don't use the new feature, nothing in your model changes the slightest bit. It breaks nothing. It conflicts with nothing.

I've already done extensive testing of it and found that it substantially improves accuracy. I can post numbers if you want to see them. What additional validation do you need?

And we really need to do development in public. Being told you are working on "something" that "might" supersede this in the future, but it's secret and you can't tell me about it, is very frustrating and not useful. In an open source project, decisions get made in public based on public information. You can have your own internal branch where you work on things you don't want to share. But you can't block development of the public version for private reasons that you won't tell anyone.

If I'm going to base my research on this project, it needs to be a stable base to build on. That means having a clear process for reviewing and merging PRs. I have a whole series of features I want to add. I can't create a branch with a new feature, then another branch with another feature based on top of that one, then a third branch with a third feature based on the second one, and so on for months, with my version of the code steadily diverging from the main repo and never knowing whether the very first PR will get rejected and break everything I did after.
