# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: SeqFu
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Andrea
    family-names: Telatin
    email: [email protected]
    affiliation: Quadram Institute Bioscience
    orcid: 'https://orcid.org/0000-0001-7619-281X'
  - given-names: Giovanni
    family-names: Birolo
  - given-names: Piero
    family-names: Fariselli
identifiers:
  - type: doi
    value: 10.3390/bioengineering8050059
    description: >-
      SeqFu: A Suite of Utilities for the Robust and
      Reproducible Manipulation of Sequence Files
abstract: >-
  Sequence files formats (FASTA and FASTQ) are
  commonly used in bioinformatics, molecular biology
  and biochemistry. With the advent of
  next-generation sequencing (NGS) technologies, the
  number of FASTQ datasets produced and analyzed has
  grown exponentially, urging the development of
  dedicated software to handle, parse, and manipulate
  such files efficiently. Several bioinformatics
  packages are available to filter and manipulate
  FASTA and FASTQ files, yet some essential tasks
  remain poorly supported, leaving gaps that any
  workflow analysis of NGS datasets must fill with
  custom scripts. This can introduce harmful
  variability and performance bottlenecks in pivotal
  steps. Here we present a suite of tools, called
  SeqFu (Sequence Fastx utilities), that provides a
  broad range of commands to perform both common and
  specialist operations with ease and is designed to
  be easily implemented in high-performance
  analytical pipelines. SeqFu includes
  high-performance implementation of algorithms to
  interleave and deinterleave FASTQ files, merge
  Illumina lanes, and perform various quality
  controls (identification of degenerate primers,
  analysis of length statistics, extraction of
  portions of the datasets). SeqFu dereplicates
  sequences from multiple files keeping track of
  their provenance. SeqFu is developed in Nim for
  high-performance processing, is freely available,
  and can be installed with the popular package
  manager Miniconda.
keywords:
  - fastq
  - fasta
  - bioinformatics
  - dereplication
  - next-generation sequencing
license: MIT