This project performs an automated phylogenetic analysis of nucleotide sequences retrieved from NCBI. The pipeline includes sequence alignment, concatenation, phylogenetic inference (maximum likelihood, maximum parsimony, and Bayesian), and formatting for visualization.
- MAFFT v7.505 – Multiple sequence alignment.
- ModelTest-NG v0.1.7 – Substitution model selection.
- RAxML-NG v1.2.2 – Maximum likelihood phylogenetic inference.
- MPBoot – Maximum parsimony inference.
- MrBayes – Bayesian phylogenetic inference.
- FigTree v1.4.4 – Phylogenetic tree visualization.
- FASTA2NEX – FASTA to NEXUS converter.
Note: The original code was modified to better fit the workflow of this pipeline. The adapted version is available in this repository.
- Concatenate Fasta Tool – Concatenate multiple FASTA files.
- Biopython
- pandas
Install with:
pip install biopython pandas

or

micromamba install -n myenv biopython pandas -c conda-forge

python3 Fasta_Entrez.py 'nucleotide' 'ACC_RANGE[accn]' 1000
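
How `Fasta_Entrez.py` talks to NCBI is not described here. As a rough sketch, assuming its three arguments correspond to the E-utilities `db`, `term`, and `retmax` parameters, the underlying requests could be built as shown below. The helper names `build_esearch_url` and `build_efetch_url` are hypothetical, the IDs in the example are placeholders, and the sketch only constructs URLs without contacting NCBI.

```python
from urllib.parse import urlencode

# Base address of the public NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax):
    """Return an ESearch URL listing up to `retmax` record IDs matching `term`."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(db, ids):
    """Return an EFetch URL that downloads the given record IDs as FASTA text."""
    params = {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Same arguments as the command above; ACC_RANGE is still a placeholder.
    print(build_esearch_url("nucleotide", "ACC_RANGE[accn]", 1000))
    # Placeholder IDs, standing in for whatever ESearch would return.
    print(build_efetch_url("nucleotide", ["ID1", "ID2"]))
```

Separating the search from the fetch mirrors the usual two-step E-utilities pattern: first collect matching IDs, then download those records in FASTA format.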
mv algn_*.fasta algn_NAME.fasta

python3 names_converter.py algn_NAME.fasta ids_names.csv
mv algn_NAME_mod.fasta algn_NAME.fasta

mkdir concat_seq
cp algn_*.fasta ./concat_seq/
python3 Concatenate.py concat_seq/ concat_os.fasta
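
The concatenation performed by `Concatenate.py` can be pictured with the minimal sketch below. It assumes that sequences are matched across files by the first word of the FASTA header and that a taxon missing from one alignment is padded with gaps; the actual script may behave differently. The function names `parse_fasta` and `concatenate` are illustrative only.

```python
def parse_fasta(text):
    """Parse FASTA text into a {name: sequence} dict, keeping input order."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the taxon name (assumption).
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments):
    """Join per-locus alignments end to end, one row per taxon.

    Each alignment must be non-empty with all sequences the same length.
    Taxa absent from a locus are filled with '-' for that locus (assumption).
    """
    taxa = []
    for aln in alignments:
        for name in aln:
            if name not in taxa:
                taxa.append(name)

    merged = {name: "" for name in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment must share one length")
        width = lengths.pop()
        for name in taxa:
            merged[name] += aln.get(name, "-" * width)
    return merged


if __name__ == "__main__":
    locus_a = parse_fasta(">t1\nACGT\n>t2\nAC-T\n")
    locus_b = parse_fasta(">t1\nGG\n>t3\nTT\n")
    for name, seq in concatenate([locus_a, locus_b]).items():
        print(f">{name}\n{seq}")
```

Padding with gaps keeps every row of the concatenated matrix the same length, which downstream tools expect from an alignment.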
rm -rf concat_seq/

mpboot -s algn_NAME.fasta -pre max_parsimony_NAME -bb 4000

modeltest-ng -i algn_NAME.fasta

raxml-ng --msa algn_NAME.fasta --model MODEL --prefix NAME --threads 5 --seed 2
raxml-ng --bootstrap --msa algn_NAME.fasta --model MODEL --prefix NAME --seed 2 --threads 5
raxml-ng --support --tree NAME.raxml.bestTree --bs-trees NAME.raxml.bootstraps --prefix NAME --threads 5

python3 fasta2nex.py algn_NAME.fasta > algn_NAME.nexus

mb -i algn_NAME.nexus

- Replace `NAME` with actual dataset names such as `adh1`, `os1283`, `os9971`, `os17357`, etc.
- The evolutionary models used in RAxML-NG must be chosen based on the output from `modeltest-ng`.
- It is recommended to execute each block using shell scripts or within a controlled environment (e.g., Micromamba).
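
As one way to apply these notes, the sketch below fills `NAME` and `MODEL` into the three RAxML-NG commands for a given dataset. It is only an illustration: the function name `raxml_commands` is hypothetical, and the model string `GTR+G` is a placeholder that should be replaced by whatever `modeltest-ng` reports for each alignment. The script prints the commands rather than running them.

```python
import shlex


def raxml_commands(name, model, threads=5, seed=2):
    """Return the tree search, bootstrap, and support commands for one dataset."""
    msa = f"algn_{name}.fasta"
    search = [
        "raxml-ng", "--msa", msa, "--model", model,
        "--prefix", name, "--threads", str(threads), "--seed", str(seed),
    ]
    bootstrap = [
        "raxml-ng", "--bootstrap", "--msa", msa, "--model", model,
        "--prefix", name, "--seed", str(seed), "--threads", str(threads),
    ]
    support = [
        "raxml-ng", "--support",
        "--tree", f"{name}.raxml.bestTree",
        "--bs-trees", f"{name}.raxml.bootstraps",
        "--prefix", name, "--threads", str(threads),
    ]
    return [search, bootstrap, support]


if __name__ == "__main__":
    # Placeholder model: take the real one from the modeltest-ng output.
    models = {"adh1": "GTR+G"}
    for name, model in models.items():
        for cmd in raxml_commands(name, model):
            print(shlex.join(cmd))
```

Redirecting the printed lines into a shell script gives one reproducible block per dataset, which can then be run inside the Micromamba environment.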
Aleff Cavalcante
Alexandre Soares
Ana Fernando
Ravi Silva
Rendrick Carreira for providing the code to transform FASTA to NEXUS