Scripts and data files for the analysis pipeline described in "High-throughput characterization of mutations in genes that drive clonal evolution using multiplex adaptome capture sequencing".
reference contains sequence files for the targeted genes of interest and the E. coli REL606 genome with these regions masked out.
consensus_read_generation has the script for using unique molecular identifiers to perform error correction on raw Illumina reads.
breseq_postprocessing has the script for converting raw output generated by running breseq on these files into a format with the read counts supporting the reference versus variant alleles.
trajectory_analysis contains the main scripts for filtering and analyzing the trajectories of mutation frequencies.
LTEE-compare and protein_structure contain scripts and information for further analyzing the sets of predicted mutations.
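
As a rough illustration of the unique-molecular-identifier error correction mentioned above, the sketch below groups reads by their identifier and takes a per-position majority vote. It is not the script shipped in this repository: the function name consensus_by_umi, the min_reads threshold, the (umi, sequence) input format, and the use of "N" for positions without a strict majority are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def consensus_by_umi(reads, min_reads=2):
    """Build one consensus sequence per unique molecular identifier (UMI).

    reads: iterable of (umi, sequence) pairs (hypothetical input format).
    min_reads: assumed threshold; UMI families with fewer reads are dropped.
    Returns a dict mapping each retained UMI to its consensus sequence.
    """
    # Collect all reads that share the same UMI into one family.
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_reads:
            continue  # too few reads to correct errors by voting
        # Only vote over positions covered by every read in the family.
        length = min(len(s) for s in seqs)
        bases = []
        for i in range(length):
            counts = Counter(s[i] for s in seqs)
            base, n = counts.most_common(1)[0]
            # Keep the base only if a strict majority of reads agree.
            bases.append(base if 2 * n > len(seqs) else "N")
        consensus[umi] = "".join(bases)
    return consensus


if __name__ == "__main__":
    example = [
        ("AAT", "ACGT"),
        ("AAT", "ACGT"),
        ("AAT", "ACTT"),  # single-read error at the third base
        ("GGC", "TTTT"),  # singleton family, dropped
    ]
    print(consensus_by_umi(example))  # {'AAT': 'ACGT'}
```

In this toy input the third read's sequencing error is outvoted by the other two reads carrying the same identifier, and the family with only one read is discarded because there is nothing to vote against.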