-# GRIMER
-
 
 
-GRIMER performs analysis of microbiome data and generates a portable and interactive dashboard integrating annotation, taxonomy and metadata with focus on contamination detection. More information about the method can be found in the [pre-print](https://doi.org/10.1101/2021.06.22.449360)
-
-## Examples
-
-Online examples of reports generated with GRIMER: https://pirovc.github.io/grimer-reports/
-
-## Installation
-
-Via conda
-
-```bash
-conda install -c bioconda -c conda-forge grimer
-```
-
-or locally installing only dependencies via conda:
-
-```bash
-git clone https://github.com/pirovc/grimer.git
-cd grimer
-conda env create -f env.yaml # or mamba env create -f env.yaml
-conda activate grimer # or source activate grimer
-python setup.py install --record files.txt # Uninstall: xargs rm -rf < files.txt
-grimer -h
-```
-
-## Usage
-
-### Tab-separated input table
-```bash
-grimer -i input_table.tsv
-```
-
-### BIOM file
-```bash
-grimer -i myfile.biom
-```
-
-### Tab-separated input table with taxonomic annotated observations (e.g. sk__Bacteria;k__;p__Actinobacteria;c__Actinobacteria...)
-```bash
-grimer -i input_table.tsv -f ";"
-```
-
-### Tab-separated input table with metadata
-```bash
-grimer -i input_table.tsv -m metadata.tsv
-```
+GRIMER performs analysis of microbiome studies and generates a portable and interactive dashboard integrating annotation, taxonomy and metadata with focus on contamination detection.
 
-### With taxonomy integration (ncbi)
-```bash
-grimer -i input_table.tsv -m metadata.tsv -t ncbi #optional -b taxdump.tar.gz
-```
+- [Installation, user manual](https://pirovc.github.io/grimer/)
+- [Live examples](https://pirovc.github.io/grimer/examples/)
+- [Pre-print](https://doi.org/10.1101/2021.06.22.449360)
 
-### With configuration file to setup external tools, references and annotations
-```bash
-grimer -i input_table.tsv -m metadata.tsv -t ncbi -c config/default.yaml -d -g
-```
 
-### Analyzing any MGnify public study
-
-```bash
-./grimer-mgnify.py -i MGYS00006024 -o output_folder/
-```
-
-## Parameters
-
- grimer
-
- optional arguments:
- -h, --help show this help message and exit
- -v, --version show program's version number and exit
-
- required arguments:
- -i INPUT_FILE, --input-file INPUT_FILE
- Main input table with counts (Observation table, Count table, Contingency Tables, ...) or .biom file. By default rows contain observations and columns contain
- samples (use --tranpose if your file is reversed). First column and first row are used as headers.
-
- main arguments:
- -m METADATA_FILE, --metadata-file METADATA_FILE
- Input metadata file in simple tabular format with samples in rows and metadata fields in columns. QIIME 2 metadata format is also accepted, with an extra row to
- define categorical and numerical fields. If not provided and --input-file is a .biom files, will attempt to get metadata from it.
- -t {ncbi,gtdb,silva,greengenes,ott}, --taxonomy {ncbi,gtdb,silva,greengenes,ott}
- Define taxonomy to convert entry and annotate samples. Will automatically download and parse or files can be provided with --tax-files.
- -b [TAX_FILES ...], --tax-files [TAX_FILES ...]
- Optional specific taxonomy files to use.
- -r [RANKS ...], --ranks [RANKS ...]
- Taxonomic ranks to generate visualizations. Use 'default' to use entries from the table directly. Default: default
- -c CONFIG, --config CONFIG
- Configuration file with definitions of references, controls and external tools.
-
- output arguments:
- -g, --mgnify Plot MGnify chart
- -d, --decontam Run and plot DECONTAM
- -l TITLE, --title TITLE
- Title to display on the header of the report.
- -p [{overview,samples,heatmap,correlation} ...], --output-plots [{overview,samples,heatmap,correlation} ...]
- Plots to generate. Default: overview,samples,heatmap,correlation
- -o OUTPUT_HTML, --output-html OUTPUT_HTML
- File to output report. Default: output.html
- --full-offline Embed javascript library in the output file. File will be around 1.5MB bigger but also work without internet connection. That way your report will live forever.
-
- general data options:
- -f LEVEL_SEPARATOR, --level-separator LEVEL_SEPARATOR
- If provided, consider --input-table to be a hierarchical multi-level table where the observations headers are separated by the indicated separator characther
- (usually ';' or '|')
- -y VALUES, --values VALUES
- Force 'count' or 'normalized' data parsing. Empty to auto-detect.
- -w, --cumm-levels Activate if input table has already cummulative values among levels.
- -s, --transpose Transpose --input-table (if samples are listed on columns and observations on rows)
- -u [UNASSIGNED_HEADER ...], --unassigned-header [UNASSIGNED_HEADER ...]
- Define one or more header names containing unsassinged/unclassified counts.
- --obs-replace [OBS_REPLACE ...]
- Replace values on table observations labels/headers (support regex). Example: '_' ' ' will replace underscore with spaces, '^.+__' '' will remove the matching
- regex.
- --sample-replace [SAMPLE_REPLACE ...]
- Replace values on table sample labels/headers (support regex). Example: '_' ' ' will replace underscore with spaces, '^.+__' '' will remove the matching regex.
- -z REPLACE_ZEROS, --replace-zeros REPLACE_ZEROS
- INT (add 'smallest count'/INT to every raw count), FLOAT (add FLOAT to every raw count). Default: 1000
- --min-frequency MIN_FREQUENCY
- Define minimum number/percentage of samples containing an observation to keep the observation [values between 0-1 for percentage, >1 specific number].
- --max-frequency MAX_FREQUENCY
- Define maximum number/percentage of samples containing an observation to keep the observation [values between 0-1 for percentage, >1 specific number].
- --min-count MIN_COUNT
- Define minimum number/percentage of counts to keep an observation [values between 0-1 for percentage, >1 specific number].
- --max-count MAX_COUNT
- Define maximum number/percentage of counts to keep an observation [values between 0-1 for percentage, >1 specific number].
-
- Samples options:
- -j TOP_OBS_BARS, --top-obs-bars TOP_OBS_BARS
- Top abundant observations to show in the bars.
-
- Heatmap and clustering options:
- -a TRANSFORMATION, --transformation TRANSFORMATION
- none (counts), norm (percentage), log (log10), clr (centre log ratio). Default: log
- -e METADATA_COLS, --metadata-cols METADATA_COLS
- How many metadata cols to show on the heatmap. Higher values makes plot slower to navigate.
- --optimal-ordering Activate optimal_ordering on linkage, takes longer for large number of samples.
- --show-zeros Do not skip zeros on heatmap. File will be bigger and iteraction with heatmap slower.
- --linkage-methods [{single,complete,average,centroid,median,ward,weighted} ...]
- --linkage-metrics [{braycurtis,canberra,chebyshev,cityblock,correlation,cosine,dice,euclidean,hamming,jaccard,jensenshannon,kulsinski,mahalanobis,minkowski,rogerstanimoto,russellrao,seuclidean,sokalmichener,sokalsneath,sqeuclidean,wminkowski,yule} ...]
- --skip-dendrogram Disable dendogram. Will create smaller files.
-
- Correlation options:
- -x TOP_OBS_CORR, --top-obs-corr TOP_OBS_CORR
- Top abundant observations to build the correlationn matrix, based on the avg. percentage counts/sample. 0 for all
+
 
 ## Powered by
 
+
 [<img src="https://static.bokeh.org/branding/logos/bokeh-logo.png" height="60">](https://bokeh.org)
 [<img src="https://pandas.pydata.org/static/img/pandas.svg" height="40">](https://pandas.org)
 [<img src="https://raw.githubusercontent.com/scipy/scipy/master/doc/source/_static/logo.svg" height="40">](https://scipy.org)