Quick Start Guide

Get started with Bioinfo Tools in 5 minutes!

Installation

Option 1: From Source (Recommended for Development)

# Clone the repository
git clone https://github.com/Mxrcon/Bioinfo-python-scripts.git
cd Bioinfo-python-scripts

# Install in development mode
pip install -e .

# Verify installation
bioinfo-tools --version
# Or: python -m bioinfo_tools --version

Option 2: With Pixi

# Install pixi (if not already installed)
curl -fsSL https://pixi.sh/install.sh | bash

# Clone and use
git clone https://github.com/Mxrcon/Bioinfo-python-scripts.git
cd Bioinfo-python-scripts
pixi install
pixi shell

Your First Command

Let's extract some genes from GenBank files!

1. Prepare Your Data

Create a gene list file, genes.txt, with one gene name per line:

dnaA
rpoB
recA
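If the gene names come from another script or a spreadsheet export, the list file can also be generated programmatically. The following is a minimal sketch, assuming the tool only needs one gene name per line as in the example above; `write_gene_list` is a hypothetical helper and not part of Bioinfo Tools.

```python
from pathlib import Path


def write_gene_list(genes, path="genes.txt"):
    """Write one gene name per line, skipping blanks and duplicates.

    Hypothetical helper: the one-name-per-line layout is inferred from
    the genes.txt example in this guide.
    """
    cleaned = []
    for gene in genes:
        name = gene.strip()
        if name and name not in cleaned:
            cleaned.append(name)
    # End with a trailing newline so the file is a well-formed text file.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

Calling `write_gene_list(["dnaA", "rpoB", "recA"])` would produce the same genes.txt shown above in the current directory.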

2. Run the Command

bioinfo-tools extract-genes -i path/to/genbank_files/ \
                            -g genes.txt \
                            -o output_genes/

That's it! Your extracted sequences are now in output_genes/.

Common Tasks

Extract Protein Sequences

bioinfo-tools extract-proteins -i genbank_files/ -g genes.txt -o proteins/

Output structure:

proteins/
├── dnaA/
│   └── genome1.fasta
├── rpoB/
│   └── genome1.fasta
└── recA/
    └── genome2.fasta

Filter GenBank Files

bioinfo-tools extract-cds -i genbank_files/ -g genes.txt -o filtered_gbk/

Run BLAST Analysis

bioinfo-tools blast -q query_sequences/ \
                    -d database_sequences/ \
                    -t nucl \
                    -b blastn \
                    -e 1e-5

Getting Help

# General help
bioinfo-tools --help

# Help for a specific command
bioinfo-tools extract-genes --help

Run Tests

Run the test suite from the repository root to verify that everything works:

# Using the test script
python tests/test_scripts.py

# Or with Python module syntax
python -m unittest tests.test_scripts

Next Steps

Troubleshooting

"bioinfo-tools: command not found"

Install the package from the repository root:

pip install -e .

Or use Python module syntax:

python -m bioinfo_tools extract-genes --help
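To tell the two situations apart (package not installed versus entry point not on `PATH`), a short diagnostic can help. The sketch below uses only the Python standard library; `diagnose_install` is a hypothetical helper, and the command and module names are the ones used in this guide.

```python
import importlib.util
import shutil


def diagnose_install(command="bioinfo-tools", module="bioinfo_tools"):
    """Report whether the CLI is on PATH and whether the module is importable."""
    return {
        # shutil.which returns None when the executable is not on PATH.
        "command_on_path": shutil.which(command) is not None,
        # find_spec returns None when a top-level module cannot be found.
        "module_importable": importlib.util.find_spec(module) is not None,
    }
```

If `module_importable` is true but `command_on_path` is false, the package is likely installed and the `python -m bioinfo_tools` form should work; if both are false, install the package first.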

"No GenBank files found"

Make sure your GenBank files have .gbk, .gb, or .genbank extensions.
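A quick way to see which files in an input directory carry one of these extensions is sketched below. It assumes an exact, case-sensitive match on the suffix, which may or may not be how the tool itself matches files; `split_by_extension` is a hypothetical helper.

```python
from pathlib import Path

# Extensions listed in this guide; case-sensitive matching is an assumption.
GENBANK_EXTENSIONS = {".gbk", ".gb", ".genbank"}


def split_by_extension(directory):
    """Return (matched, skipped) lists of files in a directory."""
    matched, skipped = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if path.suffix in GENBANK_EXTENSIONS:
            matched.append(path)
        else:
            skipped.append(path)
    return matched, skipped
```

Files that land in `skipped` with names such as `genome.GBK` or `genome.gbff` are candidates for renaming.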

Need More Help?

Example Workflow

Complete example from GenBank to BLAST results:

# 1. Filter GenBank files to keep only genes of interest
bioinfo-tools extract-cds -i raw_genbank/ -g important_genes.txt -o filtered_gbk/

# 2. Extract protein sequences
bioinfo-tools extract-proteins -i filtered_gbk/ -g important_genes.txt -o proteins/

# 3. Run BLAST on the extracted dnaA proteins against a reference database
#    (assumes dnaA is listed in important_genes.txt)
bioinfo-tools blast -q proteins/dnaA/ \
                    -d reference_db/ \
                    -t prot \
                    -b blastp \
                    -e 1e-10 \
                    -o blast_results/
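The same three steps can be chained from a Python script so that the run stops at the first failure. This is a sketch under stated assumptions: it reuses only the subcommands and flags shown in the workflow above, `build_workflow` and `run_workflow` are hypothetical helpers, and the default `dry_run=True` only prints the commands rather than executing them.

```python
import shlex
import subprocess


def build_workflow(raw_dir, gene_list, gene, reference_db):
    """Build the argv lists for the three workflow steps shown in this guide."""
    return [
        ["bioinfo-tools", "extract-cds",
         "-i", raw_dir, "-g", gene_list, "-o", "filtered_gbk/"],
        ["bioinfo-tools", "extract-proteins",
         "-i", "filtered_gbk/", "-g", gene_list, "-o", "proteins/"],
        ["bioinfo-tools", "blast",
         "-q", f"proteins/{gene}/", "-d", reference_db,
         "-t", "prot", "-b", "blastp", "-e", "1e-10", "-o", "blast_results/"],
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if a step exits non-zero.
            subprocess.run(argv, check=True)
```

For example, `run_workflow(build_workflow("raw_genbank/", "important_genes.txt", "dnaA", "reference_db/"))` prints the three commands for review; passing `dry_run=False` would execute them in order.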

That's all you need to get started! 🚀