brueggemann-lab/bacteriocins_IceKen_2022

bacteriocins_IceKen_2022

Investigating the distribution of bacteriocin biosynthetic gene clusters in Streptococcus pneumoniae genomes recovered from Iceland and Kenya.

Nasopharyngeal competition dynamics are likely to be altered following vaccine introduction: bacteriocin prevalence and diversity among Icelandic and Kenyan pneumococci. Madeleine E. B. Butler, Melissa J. Jansen van Rensburg, Angela Karani, Benedict Mvera, Donald Akech, Asma Akter, Calum Forrest, Andries J. van Tonder, Sigríður J. Quirk, Gunnsteinn Haraldsson, Stephen D. Bentley, Helga Erlendsdóttir, Ásgeir Haraldsson, Karl G. Kristinsson, J. Anthony G. Scott, Angela B. Brueggemann. Published 12 July 2023: https://doi.org/10.1099/mgen.0.001060

Correspondence: Prof Angela Brueggemann ([email protected]) or Dr Madeleine Butler ([email protected]).

Bacteriocin processing and analysis

Data files

  • data/bigsdb_annotated_export.csv - a full export of genomes with bacteriocin gene annotations and associated metadata, used by the processing and analysis code. This file will be added at the time of publication, when the genomic data are made available.
  • data/continuity_cat_outputs/cluster_cont_Kenya_2500_2500.csv and cluster_cont_VICE_2500_2500.csv - outputs from contiguity_cat.py (see below) describing which bacteriocin genes are found as contiguous clusters within each genome from each dataset.
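The column layout of these CSV files is not documented here, so a quick way to get oriented is to list the header and count the rows before running the notebook. The following is a minimal sketch using only the Python standard library; the function name `summarise_csv` is hypothetical and is not part of this repository's code.

```python
import csv


def summarise_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row.

    Hypothetical helper for a first look at a data file; it makes no
    assumption about which columns the file contains.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)
        # fieldnames is populated from the first line of the file
        return reader.fieldnames, row_count
```

For example, `summarise_csv("data/continuity_cat_outputs/cluster_cont_Kenya_2500_2500.csv")` would return the header of the Kenyan contiguity output and the number of data rows in it.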

Code

Code is provided as a Jupyter notebook, in which all processing and analysis functions are called and outputs are saved, and as two Python modules in which those functions are defined.

  • code/bacteriocins/processing_and_visualisation.ipynb - Jupyter notebook calling all processing and analysis functions.
  • code/bacteriocins/processing.py and analysis.py - Python modules defining functions for processing and analysing bacteriocin data, and for generating visualisations and summaries.

Bacteriocin cluster contiguity assessment

Generating annotated sequence files

  • code/bigs_genbankerator/bigs_genbankerator.py - a command line tool for fetching annotated sequence data from the private BIGSdb database in which whole genome sequence data, annotations and metadata are stored.

Checking bacteriocin cluster contiguity

  • code/contiguity_cat/contiguity_cat.py - a command line tool that takes bigs_genbankerator.py output files and assesses which bacteriocin gene clusters are contiguous according to user-defined thresholds.
  • code/contiguity_cat/dummy_input.gbk - example GenBank output file from bigs_genbankerator.py, for use as input to contiguity_cat.py.
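To illustrate what a threshold-based contiguity check can look like, here is a minimal sketch. It is not the algorithm implemented in contiguity_cat.py: the `Gene` record, the `is_contiguous` function, the single `max_gap` threshold and its default of 2500 bp are all assumptions made for illustration. In this sketch a cluster counts as contiguous when all of its genes lie on the same contig and no gap between neighbouring genes exceeds the threshold.

```python
from collections import namedtuple

# Hypothetical record for one annotated gene; not the tool's own data model.
Gene = namedtuple("Gene", ["name", "contig", "start", "end"])


def is_contiguous(genes, max_gap=2500):
    """Return True if all genes share one contig and no gap between
    neighbouring genes is larger than max_gap base pairs.

    max_gap is an illustrative threshold, not a documented default.
    """
    if not genes:
        return False
    # Genes split across contigs cannot be confirmed as one cluster.
    if len({gene.contig for gene in genes}) != 1:
        return False
    ordered = sorted(genes, key=lambda gene: gene.start)
    furthest_end = ordered[0].end
    for gene in ordered[1:]:
        if gene.start - furthest_end > max_gap:
            return False
        # Track the furthest end seen so overlapping genes are handled.
        furthest_end = max(furthest_end, gene.end)
    return True


# Made-up coordinates for demonstration only.
example_cluster = [
    Gene("geneA", "contig_1", 100, 400),
    Gene("geneB", "contig_1", 900, 1500),
    Gene("geneC", "contig_1", 1600, 2100),
]
split_cluster = example_cluster[:2] + [Gene("geneC", "contig_2", 50, 550)]

print(is_contiguous(example_cluster))
print(is_contiguous(split_cluster))
```

The first call reports the cluster as contiguous, while the second does not because one gene sits on a different contig. The real tool reads GenBank files and takes its thresholds from the user, so consult contiguity_cat.py itself for the exact rules.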
