SW_cuda module #155

Zhu-Siqi · 2024-02-27T11:30:12Z

Hi,

I want to do some end-to-end training using your sw_cuda module, but I don't understand how it works. My code:

import torch
from deepblast.sw_cuda import SmithWatermanDecoder

device = torch.device('cuda:0')

ddp = SmithWatermanDecoder(operator='softmax')
# Random match and gap score tensors that track gradients
match = torch.randn(100, 120, 108).requires_grad_(True)
gap = torch.randn(100, 120, 108).requires_grad_(True)
aln = ddp.decode(match.to(device), gap.to(device))

In my understanding, aln should be the alignment probability. However, the aln tensor is always 0 in its first row and column, so it does not seem to return what I expect. Is my code wrong, or am I misunderstanding the output of ddp.decode?


mortonjt · 2024-02-27T12:42:16Z

Hi, you need a couple of intermediate steps to do end-to-end training. I'd start with the training script https://github.com/flatironinstitute/deepblast/blob/master/scripts/deepblast-trai I'm not sure what you are trying to accomplish with your test. I'd recommend looking at the unit tests https://github.com/flatironinstitute/deepblast/blob/master/deepblast/tests/test_sw.py


Zhu-Siqi · 2024-02-27T14:15:28Z

Thanks for your quick reply!

I have used my model to create the match and gap scoring matrices, and I expect to use your differentiable_sw module to get the alignment prediction. In other words, I have a match scoring matrix M (size: a*b) and a gap scoring matrix G (size: a*b), where a and b are the lengths of the two sequences. Now I want to get an alignment matrix A (size: a*b) where A[i][j] is the predicted alignment of the ith and jth residues. How can I get such an alignment matrix from your differentiable_sw module?

Due to the lack of comments, it is a little difficult to understand the result returned by SmithWatermanDecoder. Could you explain what the aln tensor means and add some comments to your code? Your code is elegantly written but lacks comments, which hinders its readability XD. Since I just want to use one module from your work rather than fine-tune your model, the existing comments are not enough to tell me where to start.

mortonjt · 2024-02-27T14:31:33Z

I see the confusion. The `aln` matrix here is the expected traceback matrix (i.e. the e matrix in algorithm 1 in the paper: https://www.nature.com/articles/s41587-023-01917-2#Sec10). Also see equation 11 here: https://arxiv.org/pdf/1802.03676.pdf
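For readers following along, here is a minimal pure-Python sketch of what an "expected traceback matrix" can look like for a smoothed global alignment, in the spirit of the forward and backward passes described in the linked references. It is not the deepblast implementation: the function names, the `NEG_INF` boundary value, the way the gap score enters the recurrence, and the padded table layout are all assumptions made for illustration.

```python
import math

NEG_INF = -1e10  # assumption: stands in for -infinity on the DP boundary


def softmax_and_grad(values):
    """Return log-sum-exp of `values` and its gradient (the softmax weights)."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return top + math.log(total), [e / total for e in exps]


def expected_alignment(theta, gap):
    """Hypothetical helper: smoothed global alignment, forward then backward.

    theta[i][j] and gap[i][j] are match and gap scores for an n x m problem.
    Returns a padded (n + 2) x (m + 2) table E, where E[i][j] is the
    probability that a sampled alignment path visits cell (i, j).
    """
    n, m = len(theta), len(theta[0])
    V = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    V[0][0] = 0.0
    # Q[i][j]: weights cell (i, j) puts on its (up, diagonal, left) predecessors.
    Q = [[[0.0, 0.0, 0.0] for _ in range(m + 2)] for _ in range(n + 2)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g = gap[i - 1][j - 1]
            score, Q[i][j] = softmax_and_grad(
                [g + V[i - 1][j], V[i - 1][j - 1], g + V[i][j - 1]]
            )
            V[i][j] = theta[i - 1][j - 1] + score
    # Backward pass from a virtual terminal cell (n + 1, m + 1),
    # which is reached diagonally from (n, m).
    E = [[0.0] * (m + 2) for _ in range(n + 2)]
    E[n + 1][m + 1] = 1.0
    Q[n + 1][m + 1] = [0.0, 1.0, 0.0]
    for i in range(n, 0, -1):
        for j in range(m, 0, -1):
            E[i][j] = (
                Q[i + 1][j][0] * E[i + 1][j]            # (i, j) is "up" of (i + 1, j)
                + Q[i + 1][j + 1][1] * E[i + 1][j + 1]  # diagonal successor
                + Q[i][j + 1][2] * E[i][j + 1]          # (i, j) is "left" of (i, j + 1)
            )
    return E


# Toy inputs (made up): two sequences of length 2 and 3, constant gap score.
theta = [[0.5, -1.0, 0.2], [0.1, 0.9, -0.3]]
gap = [[-1.0] * 3 for _ in range(2)]
E = expected_alignment(theta, gap)
```

In this sketch the entries of `E` are path-visit probabilities rather than hard 0/1 alignment calls. Row 0 and column 0 of the padded table are never written and stay at zero, and the terminal corner is initialised to one. Whether the tensor returned by `ddp.decode` uses the same padding is an assumption, not something confirmed in this thread.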


Zhu-Siqi · 2024-02-27T14:55:28Z

Does the aln matrix mean the expected alignment between the two proteins? Why do I always get 0 in the first row and column of the aln matrix and 1 as its last element? The picture below shows the result of my test code.
Because I define the match and gap scoring matrices randomly, the locations of the gaps should differ. Am I confused about what the alignment matrix means, or am I giving inappropriate inputs?

Zhu-Siqi · 2024-02-28T01:43:11Z

After running your test code (https://github.com/flatironinstitute/deepblast/blob/master/deepblast/tests/test_sw.py), I also get the same result.

mortonjt · 2024-04-11T18:33:10Z

Hi, yes, this is by construction: we zero out the first row / column to make autograd work ...

Zhu-Siqi · 2024-04-13T15:33:03Z

Thanks! I understand the zeros in the first row and column. However, why is the last element always one? If so, it means that the last element is always aligned.
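One possible reading of the question above, offered as an assumption rather than a confirmed answer from the maintainers: if the decoder performs a global alignment, every admissible path must end in the last cell, so that cell's marginal is 1 whatever the scores are. The brute-force sketch below enumerates all paths on a tiny grid to check this. The helper names are hypothetical, gap scores are set to zero for brevity, and none of this is deepblast code.

```python
import math


def global_paths(n, m):
    """Yield every monotone path from cell (0, 0) to cell (n - 1, m - 1)."""
    def extend(path):
        i, j = path[-1]
        if (i, j) == (n - 1, m - 1):
            yield path
            return
        # Allowed moves: diagonal (match), down, right (gaps).
        for di, dj in ((1, 1), (1, 0), (0, 1)):
            if i + di < n and j + dj < m:
                yield from extend(path + [(i + di, j + dj)])
    yield from extend([(0, 0)])


def path_marginals(theta):
    """Probability that a path drawn with weight exp(score) visits each cell."""
    n, m = len(theta), len(theta[0])
    paths = list(global_paths(n, m))
    weights = [math.exp(sum(theta[i][j] for i, j in p)) for p in paths]
    total = sum(weights)
    marg = [[0.0] * m for _ in range(n)]
    for p, w in zip(paths, weights):
        for i, j in p:
            marg[i][j] += w / total
    return marg


# Toy match scores (made up) for two sequences of length 3.
theta = [[0.3, -1.2, 0.8], [1.5, 0.1, -0.4], [-0.7, 2.0, 0.6]]
marg = path_marginals(theta)
num_paths = len(list(global_paths(3, 3)))
```

Under this assumption a 1 in the last cell only says that every global path terminates there. It does not by itself say that the two final residues form a match rather than sitting at the end of a gap.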

Thanks again for your reply! I also look forward to seeing an example like https://github.com/spetti/SMURF/blob/main/examples/SSW_examples/sw_in_tensorflow_pytorch.ipynb. Your work is similar to it. However, without a good example file, I couldn't follow your work well :(

mortonjt · 2024-04-17T15:10:15Z

Got it. Yes we generate those types of alignment visualizations as depicted in the tensorboard readouts.
https://github.com/flatironinstitute/deepblast/blob/master/deepblast/trainer.py#L248-L251

We are currently in the middle of an overhaul of the tutorials; more updates to come by July.
