Illumina/happyR

Latest commit

97d093a · Jul 12, 2019
happyR

Tools to help analyse your hap.py results in R. See the documentation for usage and examples.

Install

devtools::install_github("Illumina/happyR")

Demo

This example compares samples prepared with PCR-Free and Nano library preps, with two replicates per group.

library(happyR)
library(tidyverse, quietly = TRUE)

# groups are defined either by a CSV or data.frame with three 
# required columns: a label for each group (group_id), a unique
# label per replicate (replicate_id) and a path to the respective
# pre-computed hap.py output (happy_prefix)
extdata_dir <- system.file("extdata", package = "happyR")
# these extdata files are supplied with the package
samplesheet <- tibble::tribble(
  ~group_id,  ~replicate_id, ~happy_prefix,
  "PCR-Free", "NA12878-I30", paste(extdata_dir, "NA12878-I30_S1", sep = "/"),
  "PCR-Free", "NA12878-I33", paste(extdata_dir, "NA12878-I33_S1", sep = "/"),
  "Nano",     "NA12878-R1",  paste(extdata_dir, "NA12878-R1_S1", sep = "/"),
  "Nano",     "NA12878-R2",  paste(extdata_dir, "NA12878-R2_S1", sep = "/")
)

# here the above table is used to read hap.py output files from disk 
# and attach the group + replicate metadata
hap_samplesheet <- read_samplesheet_(samplesheet)

# extract summary PASS performance for each replicate and plot
summary <- extract_results(hap_samplesheet$results, table = "summary") %>% 
  inner_join(samplesheet, by = "happy_prefix") %>% 
  filter(Filter == "PASS")

ggplot(data = summary, aes(x = METRIC.Recall, y = METRIC.Precision, color = group_id, shape = Type)) +
  geom_point() + theme_minimal() + 
  xlim(NA, 1) + ylim(NA, 1) +
  scale_color_brewer(palette = "Set2") +
  labs(x = "Recall", y = "Precision", color = "Prep") +
  ggtitle("PCR-Free vs. Nano variant calling",
          "PCR treatment reduces indel calling performance")  

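The plot in the demo shows each replicate as its own point. To get one number per group, the replicates can be averaged. The following is a minimal sketch in base R, assuming a table shaped like the filtered `summary` object from the demo (columns `group_id`, `Type`, `METRIC.Recall`, `METRIC.Precision`). The `toy_summary` data frame and its metric values are placeholders made up for illustration, not real hap.py results.

```r
# Toy stand-in for the filtered `summary` table from the demo.
# The metric values are placeholders, NOT real hap.py results.
toy_summary <- data.frame(
  group_id = c("PCR-Free", "PCR-Free", "Nano", "Nano"),
  Type = "INDEL",
  METRIC.Recall = c(0.98, 0.96, 0.90, 0.92),
  METRIC.Precision = c(0.99, 0.97, 0.93, 0.95),
  stringsAsFactors = FALSE
)

# Mean recall and precision across replicates,
# computed per group and variant type
group_means <- aggregate(
  cbind(METRIC.Recall, METRIC.Precision) ~ group_id + Type,
  data = toy_summary,
  FUN = mean
)
print(group_means)
```

With only two replicates per group, a mean is a rough summary at best; showing the individual points, as the demo plot does, keeps the spread visible.
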
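The samplesheet in the demo needs three columns (`group_id`, `replicate_id`, `happy_prefix`), with a unique `replicate_id` per row. A quick check before loading can catch typos early. The sketch below is a hypothetical helper, `check_samplesheet()`, which is not part of happyR. It also assumes that every prefix has a matching `<prefix>.summary.csv` file on disk, as written by hap.py; the paths in the usage example are made up.

```r
# Hypothetical helper (not part of happyR): checks a samplesheet
# data.frame for the three required columns and unique replicate labels,
# then returns the expected summary files that are missing on disk.
check_samplesheet <- function(samplesheet) {
  required <- c("group_id", "replicate_id", "happy_prefix")
  missing_cols <- setdiff(required, colnames(samplesheet))
  if (length(missing_cols) > 0) {
    stop("samplesheet is missing columns: ",
         paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(samplesheet$replicate_id) > 0) {
    stop("replicate_id values must be unique")
  }
  # Assumption: each prefix has a <prefix>.summary.csv next to it
  summary_files <- paste0(samplesheet$happy_prefix, ".summary.csv")
  summary_files[!file.exists(summary_files)]
}

# Usage with made-up paths that do not exist
sheet <- data.frame(
  group_id = c("PCR-Free", "Nano"),
  replicate_id = c("rep1", "rep2"),
  happy_prefix = c("/no/such/dir/rep1", "/no/such/dir/rep2"),
  stringsAsFactors = FALSE
)
missing_files <- check_samplesheet(sheet)
print(missing_files)
```

An empty result means every expected summary file was found; anything else lists the paths worth checking before calling `read_samplesheet_()`.
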
System requirements

Originally developed for R v3.4.0. Tests are run using the most recent available R versions (incl. devel) on Ubuntu (Trusty) and OS X (El Capitan) platforms. happyR has not been tested on Windows. Dependencies are listed in DESCRIPTION.