KAT: Kmer Alignment Tool

The Kmer Alignment Tool takes DNA genome sequences, chops them up into kmers, and then aligns those kmers to reference DNA files. For the alignment it uses a combination of suffix arrays and the Burrows-Wheeler Transform.
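To make the idea concrete, here is a minimal Python sketch of chopping a sequence into kmers and locating one kmer in a reference with a suffix array. It is an illustration only, not KAT's implementation: the function names are hypothetical, the suffix array is built naively, and the Burrows-Wheeler Transform step is left out.

```python
def chop_into_kmers(sequence, k):
    """Return every overlapping substring of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes, sorted lexicographically.

    Fine for a toy example; real tools use much faster constructions.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_positions(text, suffix_array, kmer):
    """Binary-search the suffix array for all suffixes that start with kmer."""
    k = len(kmer)

    def prefix(rank):
        start = suffix_array[rank]
        return text[start:start + k]

    # Lower bound: first suffix whose k-length prefix is >= kmer.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < kmer:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose k-length prefix is > kmer.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= kmer:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


reference = "ACGTACGT"
suffix_array = build_suffix_array(reference)
sample_kmers = chop_into_kmers("ACGTT", 3)
hits = {kmer: find_positions(reference, suffix_array, kmer) for kmer in sample_kmers}
```

Each sample kmer maps to the list of reference positions where it occurs; an empty list means no hit.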

Installation

The installation zip file includes a makefile. Go to the folder where you downloaded the files and run make to build the project and generate the executable. The executable is called KAT1.0 and is placed in the bin folder.

How to Use

Input:

The program requires the number of inputs followed by 4 arguments, so from the terminal you would run:
path_to_executable 6 project_name ~/my_project_folder .fasta 20

Path to executable: the path to where you saved the KAT1.0 executable, e.g. ~/bin/KAT1.0
Number of inputs: 6
Project Name: the name used for the output table.
Project Folder path: this folder should contain a Sample folder and a Reference folder, inside which you should have the fasta genome files.
Genome file extension: currently only supports “.fasta”. However, the program can handle files with full genomes, contigs, and chromosomes.
kmer size: an integer between 5 and 1000. It is the number of basepairs in each kmer when the sample files are chopped up.
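Before launching a run, it can help to check the arguments against the rules listed above. The sketch below is a hypothetical pre-flight helper, not part of KAT: check_kat_arguments is an invented name, and it assumes the 5 to 1000 range is inclusive. The demo at the bottom builds a throwaway project folder so the example is self-contained.

```python
import tempfile
from pathlib import Path


def check_kat_arguments(project_folder, extension, kmer_size):
    """Return a list of problems; an empty list means the arguments look usable."""
    problems = []
    folder = Path(project_folder).expanduser()

    # The project folder should contain a Sample folder and a Reference folder.
    for subfolder in ("Sample", "Reference"):
        if not (folder / subfolder).is_dir():
            problems.append(f"missing {subfolder} folder in {folder}")

    # Only .fasta is currently supported.
    if extension != ".fasta":
        problems.append(f"unsupported extension {extension!r}")

    # Assumption: both ends of the documented range are allowed.
    if not 5 <= kmer_size <= 1000:
        problems.append(f"kmer size {kmer_size} is outside 5..1000")

    return problems


# Demo on a temporary project layout.
with tempfile.TemporaryDirectory() as project:
    (Path(project) / "Sample").mkdir()
    (Path(project) / "Reference").mkdir()
    good_run = check_kat_arguments(project, ".fasta", 20)
    bad_run = check_kat_arguments(project, ".fa", 3)
```

Here good_run is empty, while bad_run reports the unsupported extension and the out-of-range kmer size.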

Output:

A Results folder is created inside the Project Folder. This folder contains:

A project_name.csv file with the number of hits and the hit percentage for each sample-reference pair. Percentage = 100 * (number of hits / number of unique kmers in the reference). If the reference has more than one sequence (contigs, chromosomes), the unique kmers are counted across all sequences together, so a kmer found in more than one sequence still counts as one unique kmer.
A text file for each Reference-Sample pair that lists the hit kmers and their positions in the reference.
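The percentage formula can be illustrated with a small Python sketch. The function names are hypothetical, and it assumes a "hit" means a distinct sample kmer that also occurs in the reference; KAT's own counting may differ in detail.

```python
def unique_kmers(sequences, k):
    """Collect the distinct kmers across all sequences of one file.

    A kmer that appears in several sequences is stored once, because a set
    keeps only one copy of each value.
    """
    kmers = set()
    for sequence in sequences:
        for i in range(len(sequence) - k + 1):
            kmers.add(sequence[i:i + k])
    return kmers


def hit_percentage(sample_sequences, reference_sequences, k):
    """Return (hits, percentage) for one sample-reference pair."""
    reference_kmers = unique_kmers(reference_sequences, k)
    sample_kmers = unique_kmers(sample_sequences, k)
    hits = len(sample_kmers & reference_kmers)
    if not reference_kmers:
        return hits, 0.0  # avoid dividing by zero for an empty reference
    return hits, 100 * hits / len(reference_kmers)


# Two reference sequences that share the kmer CGT, which is counted once.
reference = ["ACGT", "CGTA"]
sample = ["ACGA"]
hits, percentage = hit_percentage(sample, reference, 3)
```

In this toy case the reference has three unique kmers (ACG, CGT, GTA) and the sample matches one of them, so the percentage is 100 * 1 / 3.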

Table example:

Textfile example: NZ_CP008781.1_NZ_CP012480.1.txt

Reference ID
Number of sequences in Reference file
Number of basepairs in each sequence in Reference file
Sample ID
Number of unique matches
Total number of matches found
kmer / index of sequence in reference / indices of matches for the kmer in that specific sequence of the reference

In the example, kmer AAAAAAAAAC is found in Reference sequence 0, positions 211595 2296603 1409627 1170432
