Segmentation fault while merging PE reads #29

hudenise · 2015-06-11T10:40:12Z

Hi, I am running into a crash (exit code 139) with this error message:
Processing reads... |/tmp/.lsbtmp4010/.lsbatch/1433770267.781893: line 8: 10935 Segmentation fault (core dumped) /nfs/seqdb/production/interpro/development/metagenomics/pipeline/tools/bin/SeqPrep -f ERR884064_1.fastq -r ERR884064_2.fastq -1 ERR884064_1_paired.fastq.gz -2 ERR884064_2_paired.fastq.gz -3 ERR884064_1_unpaired.fastq.gz -4 ERR884064_2_unpaired.fastq.gz -s ERR884064_paired.fastq.gz
I checked the read files: they contain no non-ASCII characters, and every quality score line has the same length as its sequence line. I have run SeqPrep successfully with the same parameters both before and since, so the installation is correct. Any suggestions on how to merge these files successfully? Thanks, Hubert
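For readers who want to repeat the checks described above (ASCII-only content, quality line the same length as the sequence line), here is a minimal sketch in Python. It is not part of SeqPrep; the function names `fastq_records` and `check_fastq` are made up for illustration, and it assumes plain four-line FASTQ records.

```python
from itertools import zip_longest


def fastq_records(lines):
    """Group an iterable of lines into (header, seq, plus, qual) tuples."""
    stripped = (line.rstrip("\r\n") for line in lines)
    # Repeating the same iterator four times makes zip_longest consume
    # four consecutive lines per record; missing lines become None.
    return zip_longest(*[iter(stripped)] * 4, fillvalue=None)


def check_fastq(lines):
    """Return a list of (record_number, problem) for malformed records."""
    problems = []
    for n, (header, seq, plus, qual) in enumerate(fastq_records(lines), start=1):
        if qual is None:
            problems.append((n, "truncated record"))
            continue
        if not header.startswith("@"):
            problems.append((n, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((n, "separator does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((n, "sequence and quality lengths differ"))
        if not (header + seq + plus + qual).isascii():
            problems.append((n, "non-ASCII character"))
    return problems
```

Usage would be something like `check_fastq(open("ERR884064_1.fastq"))`. An empty list means none of these particular problems were found; it does not rule out other causes of the crash.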

hudenise · 2015-07-02T09:23:46Z

After investigating the files, I found that the issue was that a few sequences in the _1 file were quite short (<10 nt), while their counterparts in the _2 file were either significantly longer or also very short.
Example from the _1 file:
@Miseq ....
TGCAGGATATCGCGGCCGT
+
BCC-@ECFGGD7F7@FE+6
and the counterpart in the _2 file:
TATCCGTTACCTATCGTCCGCGAGAAAGCTAGTAGACACACAGCACCCAGGCGTGCAAGTCACCTTCAGATGACTACACCGAACCTGGTTAAAAGAGTCTATGGCCACCCCTACTTTAGAGTAAAAAAACCACACCTCTATTGCGCTGGGTACTAGAATAAGCTAACTACCTAGTCCGTTTCCGGCTGACTTTTTTGGGAATAACATACCACCCATCGTGATTACGTTCGCCACCGTTCTACTGCTCTCTTCACTAGGTTTGCACATTGTTTGTTCCCCTATGGCTAATTTATAGAGGACN
+
-6A,A@D<@,CFC,C,E86C+:++8CE,,C,6,;,,CAFGGCE,,C66DE,,C:@C,9,,,<6,6CE<,,,,<,,C,,<,7++8+B88,AFF@F,,,,:5?B,,AECFG:+AD,5AF9;,?,4,C,,+++488,=+,=,,,73,+6@6+3,26=,@,733=6,,==FCG,@,,45<?6,,1,+***3,95DD,__3__1/,0+;9+80)0A)/4/))..););6;()/2).))./)0);)1474)4?4)6))))(.640)))1)4)).,,()),8((,.8((-)).9)-4...((,!

This is the first time I have encountered this issue, so I don't know whether you are already aware of it. Cheers, Hubert
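A minimal sketch of how such pairs could be located and set aside before merging, written in Python. The helper names `read_fastq` and `filter_short_pairs` are hypothetical, and the default `min_len=20` is an arbitrary illustration, not a documented SeqPrep limit. It assumes both files list the pairs in the same order.

```python
def read_fastq(lines):
    """Yield (header, seq, qual) from an iterable of FASTQ lines."""
    it = (line.rstrip("\r\n") for line in lines)
    # zip over the same iterator four times reads one record per step;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header, seq, qual


def filter_short_pairs(fwd_lines, rev_lines, min_len=20):
    """Split read pairs into (kept, dropped) by minimum read length.

    A pair is dropped if either mate is shorter than min_len.
    """
    kept, dropped = [], []
    pairs = zip(read_fastq(fwd_lines), read_fastq(rev_lines))
    for fwd, rev in pairs:
        if len(fwd[1]) < min_len or len(rev[1]) < min_len:
            dropped.append((fwd, rev))
        else:
            kept.append((fwd, rev))
    return kept, dropped
```

Inspecting `dropped` shows which pairs are suspicious; writing `kept` back out as two FASTQ files gives an input without the very short reads. Whether that is enough to avoid the segmentation fault is an assumption to test.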

jstjohn · 2015-07-02T13:29:40Z

Looks like your bcl2fastq job is already doing some kind of trimming for
you. This is not expected input for SeqPrep. If you have a say over the
bcl2fastq parameters, maybe you could turn this off? I'm not sure which
settings or defaults would do this in your version.

hudenise · 2015-07-02T14:03:06Z

Thanks, I will forward your email to the user who generated the
sequences submitted to our pipeline. Cheers, Hubert

Dr Hubert DENISE

Metagenomics
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus,
Hinxton,
Cambridge, CB10 1SD,
United Kingdom
Tel : (+44)01223 494102

chloeloiseau · 2016-11-15T09:57:18Z

Hello,
I am using SeqPrep after Trimmomatic (which trims the reads), and for some files I get this error:
/tmp/sge_spool/lhi10/job_scripts/19611347: line 20: 15553 Segmentation fault (core dumped) SeqPrep -f trimmomatic_input_1P.fastq -r trimmomatic_input_2P.fastq -1 seqprep_1_trimmed.fastq.gz -2 seqprep_2_trimmed.fastq.gz -3 seqprep_1_notmerged.fastq.gz -4 seqprep_2_notmerged.fastq.gz -A AGATCGGAAGAGCACACGTCT -B AGATCGGAAGAGCGTCGTGTA -L 20 -o 40 -s seqprep_merged.fastq.gz -2 file_seqprep.txt.gz 2>> seqprep.log

Above, you mention that the segmentation fault may be explained by the read pairs in the SeqPrep input not having the same length. Does this mean that Trimmomatic should not be used before SeqPrep, and that the program expects both reads of a pair to have the same length?

Many thanks for your help on this issue
Chloé

hudenise · 2016-11-15T10:06:30Z

Dear Chloe,
Indeed, we run SeqPrep upstream of Trimmomatic, on the raw reads with
only the primers/adapters removed. We then apply Trimmomatic to the
merged file. Sincerely, Hubert
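As a rough illustration of that ordering (merge first, quality-trim afterwards), here is a sketch in Python that only builds the two command lines. The SeqPrep flags are the ones used earlier in this thread. The file naming scheme, the jar path, and the Trimmomatic steps `SLIDINGWINDOW:4:15 MINLEN:36` are assumptions for the example, not the settings of the pipeline described here.

```python
def merge_then_trim_commands(prefix, trimmomatic_jar="trimmomatic.jar"):
    """Return [seqprep_argv, trimmomatic_argv] for one paired-end sample.

    prefix and trimmomatic_jar are placeholders; adjust to your layout.
    """
    merged = f"{prefix}_merged.fastq.gz"
    # Step 1: SeqPrep on the untrimmed (adapter-removed) paired reads.
    seqprep = [
        "SeqPrep",
        "-f", f"{prefix}_1.fastq",
        "-r", f"{prefix}_2.fastq",
        "-1", f"{prefix}_1_unmerged.fastq.gz",
        "-2", f"{prefix}_2_unmerged.fastq.gz",
        "-s", merged,
    ]
    # Step 2: Trimmomatic in single-end mode on the merged reads.
    # The trimming steps below are illustrative values only.
    trimmomatic = [
        "java", "-jar", trimmomatic_jar, "SE",
        merged, f"{prefix}_merged_trimmed.fastq.gz",
        "SLIDINGWINDOW:4:15", "MINLEN:36",
    ]
    return [seqprep, trimmomatic]
```

Each list can be passed to `subprocess.run(argv, check=True)` in order, so the trimming step only starts if the merge succeeded.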
