Missing reads between 40 and 50bp after trimming? #34

jessicaathomas · 2016-04-07T11:49:19Z

Hello, I was wondering if someone could help me?

I've been trying to adapter-trim and merge my dataset using SeqPrep, but when I plot the read lengths after merging, most of the reads between 40 and 50 bp are missing. I can't work out why, or whether I'm doing something wrong!

So: read length plots resemble this:
L120_2.read_lengths.pdf
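
For anyone who wants to check the distribution without the PDF, here is a minimal sketch of how the length counts could be tabulated from the merged output. It assumes a plain, uncompressed FASTQ with exactly four lines per record; the function name `read_length_counts` is made up for illustration and is not part of SeqPrep.

```python
from collections import Counter
import io

def read_length_counts(handle):
    """Count sequence lengths in a 4-line-per-record FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        # In each 4-line record, the second line (index 1) is the sequence.
        if i % 4 == 1:
            counts[len(line.rstrip("\r\n"))] += 1
    return counts

# Tiny in-memory example; for a real file, pass an open file handle instead.
example = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGTAC\n+\nIIIIII\n"
    "@r3\nACGT\n+\nIIII\n"
)
counts = read_length_counts(example)
for length in sorted(counts):
    print(length, counts[length])
```

Running this on the merged file and on the unmerged files separately would show whether the 40 to 50 bp reads are absent altogether or are ending up in a different output.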

I'm running SeqPrep as follows:

SeqPrep -f L120_1.qual.fastq -r L120_2_.qual.fastq -1 L120-R1.qual.unmerged.fastq -2 L120-R2.qual.unmerged.fastq -3 L120_NeutCap_2-R1.qual.discarded.fastq -4 L120_NeutCap_2-R2.qual.discarded.fastq -L 30 -q 15 -A AGATCGGAAGAGCACACGTC -B GGAAGAGCGTCGTGTAGGGA -s L120_NeutCap_2.qual.merged.fastq -E L120_NeutCap_2.qual.readable_alignment.txt -o 10

You'll notice that the first adapter is the standard Illumina one, while the second is a modified version missing the first 5 bp. Both adapters are present in the files if you grep for the sequences (indicated below with [xx])…

Read1 quality trimmed, L120_2 above:

@HISEQ:268:C8TMGANXX:2:1101:1430:1965 1:N:0:NTCGTCGGNCGCAACG CAGGCACTCCCTGGAAACTCTAAGGGGCAGTTCTACTCT[AGATCGGAAGA] + A@B0BGGGGGGGCFGGGGGGGGGGGEGGGGGGGGGGCGG@1E@FGD/CEF
@HISEQ:268:C8TMGANXX:2:1101:1457:1992 1:N:0:TTCGTCGGNCGCAACG CTAGACCGCGAATACACACA[AGATCGGAAGAGCACACGTCTGAACTCCAG] + 33<<BGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGBGGGGGGGG
@HISEQ:268:C8TMGANXX:2:1101:1684:1955 1:N:0:TTCGTCGGCCGCAACG NTGATATGTCCGGAGTGCATCGTATGGCGCTTTCAATGAATTTG[AGATCG] + #3<<@EGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGG

@HISEQ:268:C8TMGANXX:2:1101:1619:1977 1:N:0:TTCGTCGGCCGCAACG CGGTGCCATCGAGCCTGTTCTGTCTCATAGTGACCCT[AGATCGGAAGAGC] + 33@>@GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@HISEQ:268:C8TMGANXX:2:1101:1574:1983 1:N:0:TTCGTCGGCCGCAACG CCATCCTAGTGGGGGGAAAT[AGATCGGAAGAGCACACGTCTGAACTCCAA] + <330<E1EFFCGGGGGFGECDGEGGFGBDCDDGEGGGGCD0DDCDG=EBC

Read 2, quality trimmed, for L120_2 above.

@HISEQ:268:C8TMGANXX:2:1101:1430:1965 2:N:0:NTCGTCGGNCGCAACG AGAGTAGAACTGCCCCNNNNAGTTTCCAGGGAGTGCCTG[GGAAGAGCGTC] + BB@BBGGDFGGGGGGG####==EFGDFFGGGGGGGGGGGGEGGGGGGGGF
@HISEQ:268:C8TMGANXX:2:1101:1457:1992 2:N:0:TTCGTCGGNCGCAACG TGTGTGTATTCGCGGTCTATGGAAGAGCGTCGTGTAG[GGAAAGAGTGTCG] + CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@HISEQ:268:C8TMGANXX:2:1101:1684:1955 2:N:0:TTCGTCGGCCGCAACG CAAATTCATTGAAAGNNNNNTACGATGCACTCCGGACATATCAT[GGAAGA] + CCCCCGGGGGGGGGG#####@=EFGGGGGGGGGGGGGGGGGGGGGGGGGG
@HISEQ:268:C8TMGANXX:2:1101:1619:1977 2:N:0:TTCGTCGGCCGCAACG AGGGTCACTATGAGACAGAACAGGCTCGATGGCACCT[GGAAGAGCGTCGT] + CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@HISEQ:268:C8TMGANXX:2:1101:1574:1983 2:N:0:TTCGTCGGCCGCAACG ATTTCCCCCCACTAGGATGT[GGAAGAGCGTCGTGTAGGGAAAGAGTGTCG] + BCCCCGGGGGDGGGGGGGGGGGGGGGGGGGDGGGGGGGGGGGGGGGGGFG
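
The bracketed positions above were marked by hand. As a rough cross-check, the sketch below finds where an adapter begins in a read using exact matching only, including the case where the read ends partway through the adapter. This is an assumption-laden illustration and not how SeqPrep itself detects adapters; the name `adapter_start` and the `min_overlap` parameter are hypothetical.

```python
def adapter_start(read, adapter, min_overlap=5):
    """Return the index where the adapter (or a prefix of it) begins, else None.

    Exact matching only: no mismatches or indels are tolerated.
    """
    for i in range(len(read)):
        tail = read[i:]
        overlap = min(len(tail), len(adapter))
        if overlap < min_overlap:
            # Remaining tail is too short to call an adapter match.
            break
        if tail[:overlap] == adapter[:overlap]:
            return i
    return None

ADAPTER_A = "AGATCGGAAGAGCACACGTC"  # value passed to -A in the command above

# Synthetic reads for illustration: 8 bp of insert followed by a partial adapter.
print(adapter_start("ACGTACGT" + "AGATCGGAAGA", ADAPTER_A))
print(adapter_start("ACGTACGTACGT", ADAPTER_A))
```

The returned index is the insert length implied by an exact adapter match, so tabulating it across reads gives a length distribution that is independent of the trimming and merging step.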

So I think the adapter sequences are correct, but I can't explain why there's a dip in the read length frequency. Is this a quirk of SeqPrep? Can anyone offer any explanation?

Many thanks!


jessicaathomas · 2016-04-07T12:50:57Z

I should also add that the depth of this dip differs between my samples (i.e. some samples have barely any reads between 40 and 50 bp, whereas others have barely any missing). The only thing that differs between samples is the 8 bp index, found within the adapter sequence. I'm not sure how SeqPrep removes the adapter sequence, but I don't think this should affect it? Again, any thoughts welcome.

jessicaathomas · 2021-01-20T11:58:31Z

Has anyone come across anything like this in the last 5 years?! Can anyone give me any suggestions as to what I can try to figure out what is going on?
