JBEI/htp_16s

High Throughput 16S Informatics

This repo holds the command-line application that performs the informatics for a high-throughput 16S sequencing run.

To begin using this tool, please clone it with:

user@machine:~$ git clone ssh://[email protected]:7999/~jdmccauley/htp_16s.git

Dependencies

Please install the following before trying to run this software, since they are required:

Note that Python 3.11 can (and probably should) be installed and managed with pyenv.

Installing

Build before running, testing, or developing.

user@machine:~$ cd htp_16s
user@machine:htp_16s$ poetry install
user@machine:htp_16s$ poetry build
user@machine:htp_16s$ pipx install dist/htp_16s-0.1.0-py3-none-any.whl

Running (with one plate)

user@machine:~$ htp_16s <mode> <input dir path> <optional output dir path>

Here, mode is either index or pool for the moment. Note that well A01 is reserved by default, but this can be overridden with the --nostandard option.

For index mode, put a .csv file with plate_map somewhere in its name in an input directory. The file needs three columns: PLATE_NAME, WELL_LOCATION and SAMPLE_NAME, like the following:

PLATE_NAME WELL_LOCATION SAMPLE_NAME
my_plate_1 A03 my_sample
my_plate_1 A05 your_sample

Then run with:

user@machine:~$ htp_16s index <input dir path> <optional output dir path>
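
If you would rather generate the index-mode plate map from a script than by hand, the following is a minimal sketch. It assumes the file is an ordinary comma-separated CSV with a header row, which this README does not state explicitly; write_plate_map is a hypothetical helper and is not part of htp_16s.

```python
import csv
from pathlib import Path

# Column names as listed in this README.
COLUMNS = ["PLATE_NAME", "WELL_LOCATION", "SAMPLE_NAME"]


def write_plate_map(path, rows):
    """Write (plate, well, sample) tuples to a plate map CSV.

    Hypothetical helper, not part of htp_16s. Assumes a comma-separated
    file with a header row; adjust if your plate maps use another layout.
    """
    path = Path(path)
    # Create the input directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return path
```

Calling write_plate_map("my_input_dir/my_plate_1_plate_map.csv", [("my_plate_1", "A03", "my_sample"), ("my_plate_1", "A05", "your_sample")]) would produce a file whose name contains plate_map and whose rows match the example above.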

For pool mode, put the plate_map csv and all End RFU csvs in an input directory. Then run with:

user@machine:~$ htp_16s pool <input dir path> <optional output dir path>

Running with multiple plates

When running multiple plates, there are two methods for running in index mode and one method for running in pool mode.

Note that in both cases the plate_maps must give each plate a UNIQUE plate name.

index mode

For index mode, you can either provide one plate_map file with multiple UNIQUE values in the PLATE_NAME column, or you can provide one plate_map file for each plate in a subdirectory in your input directory (where each plate file still has UNIQUE PLATE_NAME values), like so:

my_input_dir/
├── plate_1
│   └── plate_1_plate_map.csv
└── plate_2
    └── plate_2_plate_map.csv

and your plate_maps should have their respective plate names in the PLATE_NAME column:

plate_1_plate_map.csv:

PLATE_NAME WELL_LOCATION SAMPLE_NAME
my_plate_1 A03 my_sample
my_plate_1 A05 your_sample

plate_2_plate_map.csv:

PLATE_NAME WELL_LOCATION SAMPLE_NAME
my_plate_2 A03 my_sample
my_plate_2 A05 your_sample

You'll then get a single MiSeq SampleSheet and a single file of primer transfer instructions covering all plates.
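
Before running with one plate_map file per subdirectory, it can help to confirm that no PLATE_NAME is shared between files. The sketch below is one way to check this, assuming comma-separated plate maps with a header row; find_shared_plate_names is a hypothetical helper, and htp_16s may perform its own validation differently.

```python
import csv
from collections import defaultdict
from pathlib import Path


def find_shared_plate_names(input_dir):
    """Return {plate_name: [files]} for names used in more than one plate map.

    Hypothetical helper, not part of htp_16s. Scans input_dir recursively
    for .csv files with "plate_map" in the file name.
    """
    files_by_name = defaultdict(set)
    for csv_path in sorted(Path(input_dir).rglob("*.csv")):
        if "plate_map" not in csv_path.name:
            continue  # skip csv files that are not plate maps
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                name = row.get("PLATE_NAME")
                if name:
                    files_by_name[name].add(str(csv_path))
    # Keep only the names that appear in two or more files.
    return {
        name: sorted(paths)
        for name, paths in files_by_name.items()
        if len(paths) > 1
    }
```

An empty dictionary means every plate name belongs to exactly one file; any other result lists the files that need renaming.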

pool mode

For pool mode, you must organize your input directory into one subdirectory per plate, each holding that plate's plate map and RFU files, like so:

my_input_dir/
├── plate_1
│   ├── End Point Results 1.csv
│   ├── End Point Results 2.csv
│   ├── End Point Results 3.csv
│   ├── End Point Results 4.csv
│   └── plate_1_plate_map.csv
└── plate_2
    ├── End Point Results 1.csv
    ├── End Point Results 2.csv
    ├── End Point Results 3.csv
    ├── End Point Results 4.csv
    └── plate_2_plate_map.csv
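
The layout above can be checked with a short script before running pool. This is a rough sketch only: it assumes each plate subdirectory holds exactly one csv with plate_map in its name and treats every other csv as an End RFU file. check_pool_layout is a hypothetical helper, not part of htp_16s, and it does not inspect file contents.

```python
from pathlib import Path


def check_pool_layout(input_dir):
    """Return a list of problems found in a pool-mode input directory.

    Hypothetical helper, not part of htp_16s. An empty list means the
    directory matches the assumed layout.
    """
    problems = []
    plate_dirs = sorted(p for p in Path(input_dir).iterdir() if p.is_dir())
    if not plate_dirs:
        problems.append(f"{input_dir}: no plate subdirectories found")
    for plate_dir in plate_dirs:
        csvs = sorted(plate_dir.glob("*.csv"))
        plate_maps = [p for p in csvs if "plate_map" in p.name]
        # Assumption: any csv that is not a plate map is an End RFU file.
        rfu_files = [p for p in csvs if "plate_map" not in p.name]
        if len(plate_maps) != 1:
            problems.append(
                f"{plate_dir.name}: expected 1 plate_map csv, found {len(plate_maps)}"
            )
        if not rfu_files:
            problems.append(f"{plate_dir.name}: no End RFU csv files found")
    return problems
```

For example, printing each entry of check_pool_layout("my_input_dir") gives a quick list of subdirectories that are missing a plate map or RFU files.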

After running pool, you'll get an output directory with subdirs named after your input subdirs, each containing one set of Biomek instructions, like so:

20230725_yv_m6cmq_pool_htp_16s
├── plate_1
│   └── plate_1_pooling_biomek_instructions.csv
└── plate_2
    └── plate_2_pooling_biomek_instructions.csv

Help

If you get stuck, view the help message with htp_16s --help.

Otherwise, reach out to Josh at [email protected].

Developing

Run the tests with:

user@machine:htp_16s$ poetry run pytest

And test the main script with:

user@machine:htp_16s$ poetry run htp_16s <args>
