I'm trying to automatically compare BAM files output by bowtie2 (for a continuous integration system). The input data is identical, but when running on two different machines I sometimes see two lines in the BAM files that have the same name/start position but represent different reads. For whatever reason, bowtie2 sometimes reverses the order of those lines, so bamUtil's algorithm calls a mismatch.
Is there any way to get bam diff to hold off for a few lines to see if there is another record with the same name/start pos that actually matches?
Bam diff matches by name and fragment from the flag. In your case, do you have multiple reads in a single file that have the same name and fragment flag such that it is a linear template rather than just paired-end? Bam diff was written with the assumption of paired-end, and currently won't work well (as you are probably seeing) if there are multiple non-first/last reads in the linear template.
Bam diff should hold onto reads until it finds a matching name/flag combination in the other file or until the maximum base pair position between records (posDiff) has been reached or until it reaches the maximum number of records it can hold onto (recPoolSize).
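To make the hold-and-match behaviour described above concrete, here is a rough Python sketch. It is not bamUtil's actual code: the function `pooled_diff`, the `(name, flag, pos)` tuple layout, and the small default values are all assumptions made for illustration. Only the ideas of matching on name plus fragment flag, a position window (`posDiff`), and a pool limit (`recPoolSize`) come from the description above. It also assumes both inputs are sorted by position on a single reference.

```python
from collections import OrderedDict
from itertools import zip_longest

FRAGMENT_BITS = 0x40 | 0x80  # first/last fragment bits of the SAM flag


def pooled_diff(file_a, file_b, pos_diff=100, rec_pool_size=4):
    """Match (name, flag, pos) records from two position-sorted lists.

    Hypothetical helper; the defaults are illustrative, not bamUtil's.
    """
    pools = (OrderedDict(), OrderedDict())  # held records, oldest first
    matched, unmatched = [], []

    def evict(pool, current_pos):
        # Give up on the oldest held records once the pool is too large
        # or they are more than pos_diff bases behind the current record.
        while pool:
            key, rec = next(iter(pool.items()))
            if len(pool) <= rec_pool_size and current_pos - rec[2] <= pos_diff:
                break
            del pool[key]
            unmatched.append(rec)

    def offer(rec, side):
        mine, other = pools[side], pools[1 - side]
        key = (rec[0], rec[1] & FRAGMENT_BITS)
        if key in other:
            partner = other.pop(key)
            matched.append((rec, partner) if side == 0 else (partner, rec))
        else:
            if key in mine:
                # Same name and fragment bits seen twice in one file:
                # this sketch cannot tell them apart, so the older one
                # is reported as unmatched.
                unmatched.append(mine.pop(key))
            mine[key] = rec
        evict(mine, rec[2])
        evict(other, rec[2])

    for a, b in zip_longest(file_a, file_b):
        if a is not None:
            offer(a, 0)
        if b is not None:
            offer(b, 1)
    for pool in pools:
        unmatched.extend(pool.values())
    return matched, unmatched


# Two runs that emit the same records in swapped order still match.
run_1 = [("r1", 0x40, 100), ("r2", 0x40, 100)]
run_2 = [("r2", 0x40, 100), ("r1", 0x40, 100)]
matched, unmatched = pooled_diff(run_1, run_2)
print(len(matched), len(unmatched))  # prints: 2 0
```

Note how the `key in mine` branch mirrors the limitation mentioned earlier: when one file contains several records with the same name and fragment bits, a key of name plus flag alone is not enough to pair them up.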
Would you be able to confirm that the issue you are seeing is for linear templates when both 0x40 & 0x80 are set (or both not set) in the flag within multiple records in a single file? If that is the issue you are seeing, I'll look more at linear templates to see if there is an easy way to match beyond just the flag, and also see if there is an easy way to expand the code to enable multiple reads with the same fragment flags.
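One way to check for this is to count how often each name and fragment-flag combination occurs within a single file. The following is a minimal sketch that assumes the BAM has first been converted to SAM text (for example with `samtools view`); the function name `repeated_fragment_keys` is made up for illustration and is not part of bamUtil.

```python
from collections import Counter


def repeated_fragment_keys(sam_lines):
    """Return {(name, fragment_bits): count} for combinations seen more than once."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])  # QNAME and FLAG columns
        counts[(name, flag & 0xC0)] += 1  # keep only the 0x40 and 0x80 bits
    return {key: n for key, n in counts.items() if n > 1}


example = [
    "@HD\tVN:1.6",
    "r1\t65\tchr1\t100",   # 0x40 set: first fragment
    "r1\t129\tchr1\t300",  # 0x80 set: last fragment
    "r2\t0\tchr1\t100",    # neither bit set
    "r2\t16\tchr1\t100",   # neither bit set, same name
]
print(repeated_fragment_keys(example))  # prints: {('r2', 0): 2}
```

Any entry whose fragment bits are `0` (neither set) or `0xC0` (both set) and whose count is above 1 would be the linear-template case asked about here.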