Incorrect results when reading sequences from files #9

threadless-screw · 2019-03-28T07:55:56Z

I ran into some issues with the pairwise alignment of sequences read from different files. Although the exact issue varies from case to case, it seems that the first alignment produces the correct result, while further alignments generally do not. Here's an example:

If a first file "f1" contains the following (identical) sequences:

NEEDLE
NEEDLE
NEEDLE

and a second file "f2" contains the following (identical) sequences:

HAYSTACK NEEDLE HAYSTACK
HAYSTACK NEEDLE HAYSTACK
HAYSTACK NEEDLE HAYSTACK

then running the command "smith_waterman --files f1 f2" returns the output:

== Alignment 0 lengths (6, 24):

hit 0.0 score: 12
  NEEDLE  [pos: 0; len: 6]
  NEEDLE  [pos: 9; len: 6]

==
== Alignment 1 lengths (6, 24):

==
== Alignment 2 lengths (6, 24):

==

The results for Alignment 0 are correct, but the fact that results are absent for Alignments 1 and 2 seems off. Interestingly, the dynamic programming matrices (rendered as output with the --printmatrices option) for the three alignments are the same, as they should be. The problem is therefore likely to be found at a superficial level.

The text was updated successfully, but these errors were encountered:

threadless-screw · 2019-03-29T23:11:50Z

I have unfortunately not gotten to the bottom of the issue. As far as I can tell, the re-use of the sw and result variables [see sw_cmdline.c] when iterating over the multiple sequences in the files is at the heart of it; it seems a previous iteration leaves its traces into the next. Practically, the problem can be overcome by wrapping each align() function call in its own initialization/destruction routines, like so:

    sw = smith_waterman_new();
    result = alignment_create(256);

    align( ....);
  
    smith_waterman_free(sw);
    alignment_free(result);

Since this somewhat wasteful solution is no more than a patch that leaves the underlying issue unresolved, I have not made a pull request.

Steph042 referenced this issue in Steph042/seq-align Mar 18, 2021

Update smith_waterman.c

234309f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect results when reading sequences from files #9

Incorrect results when reading sequences from files #9

threadless-screw commented Mar 28, 2019 •

edited

Loading

threadless-screw commented Mar 29, 2019

Incorrect results when reading sequences from files #9

Incorrect results when reading sequences from files #9

Comments

threadless-screw commented Mar 28, 2019 • edited Loading

threadless-screw commented Mar 29, 2019

threadless-screw commented Mar 28, 2019 •

edited

Loading