Fix bam2bcf indel calling: memset destroying consensus, uninit variable, wrong sizes #2531

Open
sirus20x6 wants to merge 3 commits into samtools:develop from sirus20x6:fix/bam2bcf-memset-uninit

@sirus20x6

Summary

  • Fix memset(aux, 0, ...) to memset(cns, 0, ...) in bam2bcf_iaux.c: the old call zeroed the nucleotide frequency counts that had just been computed, so the consensus builder discarded all insertion data
  • Fix memset(read_scores, 0, n) to memset(read_scores, 0, n * sizeof(int)): the old call zeroed only the first n bytes, i.e. 1/4 of the int array where sizeof(int) is 4
  • Fix the per-sample indel fraction denominator, changing it from naux (cross-sample running total) to ntot (per-sample count)
  • Initialize the previously uninitialized variable K in bam2bcf_indel.c
  • Remove duplicate memset(call->SCR, ...) in bam2bcf.c
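
The read_scores fix comes down to memset taking a byte count rather than an element count. A minimal standalone sketch of the difference follows; the helper names zero_scores_buggy, zero_scores_fixed and count_nonzero are hypothetical and are not functions from bam2bcf_iaux.c.

```c
#include <string.h>

/* Hypothetical helpers for illustration only; not the bam2bcf code. */

/* Buggy pattern: n is an element count, but memset expects bytes,
 * so only the first n bytes (n / sizeof(int) elements) are cleared. */
static void zero_scores_buggy(int *read_scores, size_t n) {
    memset(read_scores, 0, n);
}

/* Fixed pattern: scale the element count by the element size. */
static void zero_scores_fixed(int *read_scores, size_t n) {
    memset(read_scores, 0, n * sizeof(int));
}

/* Count the elements that are still non-zero after clearing. */
static size_t count_nonzero(const int *a, size_t n) {
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            k++;
    return k;
}
```

Filling an 8-element array with -1 and calling the buggy helper leaves every element past the first 8 bytes untouched, while the fixed helper clears all 8 elements.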

Test plan

  • Existing test suite passes (1920/1920)
  • Verify indel calling with --indels-2.0 produces correct insertion consensus sequences

- bam2bcf_indel.c: Initialize K=0 at declaration to avoid using an
  uninitialized variable in the max-deletion loop (line ~783) which
  runs before K is reset at line ~791.

- bam2bcf_iaux.c: Fix memset size in iaux_init_scores() to use
  n*sizeof(int) instead of n, since read_scores is int*. Without
  this, only 1/4 of the array (on most platforms) was being zeroed.

- bam2bcf_iaux.c: Fix per-sample indel fraction denominator in
  iaux_init_types() to use ntot (per-sample read count) instead of
  naux (running cross-sample indel count), which produced incorrect
  filtering thresholds.
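
The denominator change can be illustrated with a small sketch. It assumes a simplified layout of one indel-read count and one total-read count per sample; the names ntot and naux follow the commit message, but the functions and arrays are hypothetical and do not reproduce iaux_init_types().

```c
#include <stddef.h>

/* Hypothetical sketch: ntot and naux mirror the commit message, but
 * the data layout and functions are assumptions, not the real code. */

/* Fraction of one sample's reads that carry an indel.
 * n_indel: indel-supporting reads in this sample
 * ntot:    all reads in this sample (the per-sample denominator) */
static double indel_fraction(int n_indel, int ntot) {
    return ntot > 0 ? (double)n_indel / ntot : 0.0;
}

/* Fill frac[s] for each sample and return the cross-sample tally.
 * naux is only a running total and is deliberately never used as
 * a denominator. */
static int per_sample_fractions(const int *n_indel, const int *ntot,
                                size_t n_samples, double *frac) {
    int naux = 0;
    for (size_t s = 0; s < n_samples; s++) {
        naux += n_indel[s];
        frac[s] = indel_fraction(n_indel[s], ntot[s]);
    }
    return naux;
}
```

Under these assumptions, dividing by the running total would make a sample's fraction depend on how many indel reads earlier samples contributed rather than on its own depth, which is the kind of skewed threshold the commit describes.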

In bam2bcf_iaux.c, the memset after the first pass was zeroing `aux`
(the nucleotide frequency counts) instead of `cns` (the consensus output
buffer). This caused the consensus-building loop to see all-zero counts
and produce 'N' for every position, discarding all insertion data.
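
The failure mode can be reproduced in a few lines. This is a sketch under assumptions: the count-table shape, the NBASE constant and build_consensus are hypothetical stand-ins, not the code in bam2bcf_iaux.c.

```c
#include <string.h>

#define NBASE 4  /* A, C, G, T */

/* Hypothetical sketch of the failure mode; the buffer shapes and
 * function name are assumptions, not the bam2bcf_iaux.c code. */

/* Build a consensus of length len from per-position base counts.
 * cns must have room for len + 1 bytes. A position with no counts
 * is emitted as 'N'. Passing clear_counts != 0 reproduces the bug
 * by zeroing the counts before the consensus loop reads them. */
static void build_consensus(int (*counts)[NBASE], size_t len,
                            char *cns, int clear_counts) {
    static const char bases[NBASE] = { 'A', 'C', 'G', 'T' };

    if (clear_counts)
        memset(counts, 0, len * sizeof(counts[0]));  /* bug: wipes input */
    memset(cns, 0, len + 1);                         /* fix: clear output */

    for (size_t i = 0; i < len; i++) {
        int best = -1, best_n = 0;
        for (int b = 0; b < NBASE; b++) {
            if (counts[i][b] > best_n) {
                best_n = counts[i][b];
                best = b;
            }
        }
        cns[i] = best >= 0 ? bases[best] : 'N';
    }
}
```

With the counts left intact the loop picks the most frequent base at each position; with the counts wiped first, every position falls through to 'N'.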

In bam2bcf.c, remove a duplicate memset(call->SCR, ...) that was a
copy-paste error (two identical lines in a row).
@sirus20x6 force-pushed the fix/bam2bcf-memset-uninit branch from a4c0330 to 622b392 on March 26, 2026 00:54