dawnmy/HAPPY


HAPPY

Hi-C data Analysis and Processing PIpeline (HAPPY). The pipeline outputs contact matrices at several bin sizes (resolutions): 1 kb, 5 kb, 10 kb, 50 kb, and 100 kb.

Requirements

  1. conda
  2. snakemake
  3. BWA-MEM
  4. pairtools
  5. cooler
  6. SAMtools

conda and snakemake must be installed manually; the remaining tools are installed automatically the first time the pipeline is launched.

The output pairs file can easily be filtered based on user-provided conditions.
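As a rough illustration of such filtering, the sketch below builds a toy pairs file and keeps only same-chromosome pairs of type UU that are at least 1 kb apart. The file names, the toy records, and the filter condition are all made up for the example, and it assumes the usual pairtools column order (readID, chrom1, pos1, chrom2, pos2, strand1, strand2, pair_type); check the `#columns:` header of your own file before reusing it.

```shell
# Toy data only: every record below is hypothetical.
demo_dir=$(mktemp -d)
pairs="$demo_dir/toy.pairs"

# Header lines start with '#'; body lines are tab-separated.
printf '## pairs format v1.0\n' > "$pairs"
printf '#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type\n' >> "$pairs"
printf 'r1\tchr1\t1000\tchr1\t5000\t+\t-\tUU\n' >> "$pairs"
printf 'r2\tchr1\t2000\tchr2\t3000\t+\t+\tUU\n' >> "$pairs"
printf 'r3\tchr1\t7000\tchr1\t7200\t-\t+\tUU\n' >> "$pairs"
printf 'r4\tchr2\t100\tchr2\t9000\t+\t-\tDD\n' >> "$pairs"

# Pass header lines through unchanged, then keep a body line only if
# both ends are on the same chromosome, the pair type is UU, and the
# absolute distance between the two positions is at least 1000 bp.
awk -F'\t' '
  /^#/ { print; next }
  { d = $5 - $3; if (d < 0) d = -d }
  $2 == $4 && $8 == "UU" && d >= 1000
' "$pairs" > "$demo_dir/filtered.pairs"

cat "$demo_dir/filtered.pairs"
```

pairtools also provides a `select` subcommand that takes a condition string, for example `pairtools select '(pair_type == "UU")' in.pairs.gz -o out.pairs.gz`, which is probably the more convenient route for compressed pipeline output (the file names here are placeholders).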

Installation

git clone git@github.com:dawnmy/HAPPY.git

Usage

  1. Adapt the config file for the pipeline. Modify the config/config.yaml file in the program folder so that it points to your data location.

  2. Download the reference genome and create the BWA index and the FASTA index (.fai). For instance, GRCh38 for Homo sapiens:

wget https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.gz -O GRCh38.fasta.gz

pigz -d GRCh38.fasta.gz
bwa index GRCh38.fasta
samtools faidx GRCh38.fasta
  3. Make the chromosome sizes file based on the .fai
cut -f1,2 GRCh38.fasta.fai > chrom.all.sizes
  4. Launch the pipeline with 20 threads
snakemake -s runHiC.smk --use-conda -j 20

If you use the SGE job submission system:

snakemake -s runHiC.smk --use-conda --cluster "qsub -cwd -pe multislot {threads} -i /dev/null -v PATH" -j 2
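
Before launching, it may be worth checking that the chromosome sizes file is well-formed. A .fai index has five tab-separated columns (name, length, offset, bases per line, bytes per line), and the `cut -f1,2` step keeps only the first two. The sketch below reproduces that step on a toy index and validates the result; the toy file names and the offset values are made up for illustration.

```shell
# Toy .fai with hypothetical offsets: name, length, offset, linebases, linewidth
demo_dir=$(mktemp -d)
printf 'chr1\t248956422\t6\t60\t61\n' > "$demo_dir/toy.fasta.fai"
printf 'chr2\t242193529\t253105708\t60\t61\n' >> "$demo_dir/toy.fasta.fai"

# Same command as in the chromosome sizes step, applied to the toy index
cut -f1,2 "$demo_dir/toy.fasta.fai" > "$demo_dir/toy.sizes"

# Every line should have exactly two fields, the second a plain integer;
# awk exits non-zero if any line breaks that rule.
awk -F'\t' 'NF != 2 || $2 !~ /^[0-9]+$/ { bad = 1 } END { exit bad + 0 }' \
  "$demo_dir/toy.sizes" && echo "chrom sizes file looks well-formed"
```

The same awk check can be pointed at chrom.all.sizes once it has been generated from the real index.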
