Add convenient `Crepe.represent_sequences` method #117

willdumm · 2025-02-13T23:53:44Z

This addresses #115 by adding `Crepe.represent_sequences` and a number of supporting methods on D*SM model classes.

It also eliminates the option of providing non-paired sequences to D*SM model methods that take a string (and therefore also all Crepe methods that call them). These methods now require amino acid sequences to be provided in (heavy_chain, light_chain) tuples, where a missing chain sequence can be represented by the empty string.
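For callers that still hold single-chain sequences, the new convention can be sketched roughly as follows. `as_pair` is a hypothetical helper for illustration and is not part of netam; only the (heavy_chain, light_chain) ordering and the empty-string placeholder come from this PR, and the sequence fragments are made up.

```python
def as_pair(sequence, chain="heavy"):
    """Wrap a single-chain amino acid sequence in a (heavy, light) tuple.

    Hypothetical helper, not part of netam. The missing chain is
    represented by the empty string, as described above.
    """
    if chain == "heavy":
        return (sequence, "")
    if chain == "light":
        return ("", sequence)
    raise ValueError(f"chain must be 'heavy' or 'light', got {chain!r}")


# Made-up sequence fragments, for illustration only.
heavy_only = as_pair("QVQLVQSG", chain="heavy")
light_only = as_pair("DIQMTQSP", chain="light")
```

A list of such tuples is then the kind of input the string-taking methods expect after this change.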

The represent_sequences function returns a tensor for each heavy-light pair provided to it, while Crepe.__call__ returns a pair of tensors (one for heavy, one for light chain) for each heavy-light pair provided to it. This seems to me the correct choice, but there could be justification for splitting the embedding tensors returned by represent_sequences on heavy/light boundaries.
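If splitting turns out to be useful, it could be done after the fact. Below is a minimal sketch, assuming the per-pair representation has one row per site with heavy-chain sites first, and assuming any separator between the chains occupies `separator_len` rows. Both are assumptions for illustration, not statements about the model's actual output layout. `split_on_chain_boundary` is a hypothetical name, and plain nested lists stand in for tensors.

```python
def split_on_chain_boundary(rows, heavy_len, separator_len=0):
    """Split a per-site representation into heavy and light parts.

    Hypothetical helper. Assumes rows are ordered as heavy sites, then
    `separator_len` separator rows, then light sites.
    """
    if heavy_len < 0 or separator_len < 0:
        raise ValueError("lengths must be non-negative")
    if heavy_len + separator_len > len(rows):
        raise ValueError("heavy_len + separator_len exceeds the number of rows")
    heavy_rows = rows[:heavy_len]
    light_rows = rows[heavy_len + separator_len:]
    return heavy_rows, light_rows


# Toy representation: 3 heavy sites, 1 separator row, 1 light site.
toy = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [9.9, 9.9], [3.0, 3.1]]
heavy_part, light_part = split_on_chain_boundary(toy, heavy_len=3, separator_len=1)
```

The same first-axis slicing would apply to a tensor, provided the layout assumptions hold.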

willdumm · 2025-02-14T00:15:42Z

netam/dxsm.py

@@ -449,25 +449,3 @@ def worker_optimize_branch_length(burrito_class, model, dataset, optimization_kw
    """The worker used for parallel branch length optimization."""
    burrito = burrito_class(None, dataset, copy.deepcopy(model))
    return burrito.serial_find_optimal_branch_lengths(dataset, **optimization_kwargs)
-


This function had to be moved to avoid circular dependencies.

matsen

Great!

Q: If we can't supply single chains, does it make sense to require paired tuples? AbLang takes lists of lists, and I can get those with

one_pair = [train_df.iloc[0][['heavy', 'light']].tolist()]

Of course, I can manage with

def lists_to_tuples(list_of_lists):
    return [tuple(lst) for lst in list_of_lists]

rep = crepe.represent_sequences(lists_to_tuples(one_pair))

No big deal but I thought I'd ask.

Also, it appears that the return type of represent_sequences is a tuple. Is that on purpose?

willdumm · 2025-02-14T18:11:13Z

Thanks for noticing these things! I changed the check on sequence inputs to allow any non-str type of length two. Also, I changed the return type to list.
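The relaxed check amounts to something like the following. This is a sketch of the idea rather than the code in this PR, and `check_pair` is a hypothetical name.

```python
def check_pair(pair):
    """Return `pair` as a (heavy, light) tuple, or raise ValueError.

    Hypothetical stand-in for the relaxed input check: any non-str
    container of length two is accepted (tuple, list, ...).
    """
    # A bare string must be rejected explicitly: a two-character string
    # would otherwise pass the length check below.
    if isinstance(pair, str):
        raise ValueError("expected a (heavy, light) pair, got a bare string")
    if len(pair) != 2:
        raise ValueError(f"expected a pair of length 2, got length {len(pair)}")
    heavy, light = pair
    return (heavy, light)


# Lists and tuples are both fine; made-up fragments for illustration.
from_list = check_pair(["QVQLVQSG", "DIQMTQSP"])
from_tuple = check_pair(("QVQLVQSG", ""))
```

With a check of this shape, the list-of-lists input from the comment above can be passed directly without converting to tuples first.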

matsen

Works like a charm. Merge it!

almost working

e20894c

willdumm marked this pull request as draft February 13, 2025 23:54

willdumm added 4 commits February 13, 2025 15:55

fix test

b2a048a

format and lint

e74cff3

add docstrings

8781881

format

2993b3e

willdumm marked this pull request as ready for review February 14, 2025 00:14

willdumm requested a review from matsen February 14, 2025 00:14

willdumm commented Feb 14, 2025


remove cruft

cd6d9d8

willdumm linked an issue Feb 14, 2025 that may be closed by this pull request

Give the crepe a represent method #115

Closed

matsen reviewed Feb 14, 2025


respond to Erick's comments

464ad96

willdumm changed the title ~~Add convenient Crepe.represent method~~ Add convenient Crepe.represent_sequences method Feb 14, 2025

matsen approved these changes Feb 14, 2025


willdumm merged commit 954c28c into main Feb 14, 2025
2 checks passed

willdumm deleted the 115-crepe-represent branch February 14, 2025 19:01
