|
4 | 4 |
|
5 | 5 | # MUSIAL |
6 | 6 |
|
7 | | -**MUSIAL** (MUlti Sample varIant AnaLysis) is a Java command-line tool designed to analyze and summarize single nucleotide variants (SNVs) and insertions/deletions (indels) across multiple prokaryotic samples. |
8 | | -The software aggregates and analyzes variant calls from multiple samples of a prokaryotic species and provides an interface to generate comprehensive statistics and alignments at the genome, gene and protein level. |
9 | | -MUSIAL enables a comprehensive assessment of variability within a species at the genome, gene and protein level, providing insights into, for example, conserved and variable regions, diversity at the gene level and common proteoforms among samples. |
10 | | - |
11 | | -## ✨ Features |
| 7 | +**MUSIAL** (MUlti Sample varIant AnaLysis) is a Java command-line tool to analyze large sets of VCF files with prokaryotic single nucleotide variants (SNVs) and insertions/deletions (indels). It provides an interface for generating comprehensive statistics and alignments, as well as assessing variability at genome, gene and protein levels. |
12 | 8 |
|
13 | 9 | - **Integrates SnpEff and other Sequence Ontology compliant annotations** to help interpret variants. |
14 | 10 | - **Projection of variants onto genomic features (genes) yields allele- and proteoform-specific information** that supports the characterization of individual samples. |
15 | 11 | - **VCF based sequence reconstruction** at nucleotide and protein sequence level and tabular reports on sample, feature and variant statistics. |
16 | 12 |
|
17 | | -## 📖 Usage |
| 13 | +### 📖 Usage |
18 | 14 |
|
19 | 15 | An executable `jar` file (`Java 21`) is available from the [Releases](https://github.com/Integrative-Transcriptomics/MUSIAL/releases) section. |
20 | 16 | MUSIAL operates on a modular, task-based architecture that is primarily initiated by the `build` task, which creates a JSON file (_storage_) as its primary output; this is then used as input for all other tasks. |
21 | 17 |
|
22 | | -The general CLI usage is `java -jar MUSIAL-v2.4.0.jar <task>`, whereby the following tasks are available: |
| 18 | +Details on the use of the software and tutorials can be found in the repository [Wiki](https://github.com/Integrative-Transcriptomics/MUSIAL/wiki). The general CLI usage is `java -jar MUSIAL-v2.4.2.jar <task>`, whereby the following tasks are available: |
23 | 19 |
|
24 | 20 | <details> |
25 | 21 | <summary><code>build</code> - Build a local database file (storage) in JSON format from variant calls; the mandatory input for other tasks.</summary> |
26 | 22 |
|
27 | 23 | ``` |
28 | 24 | Command line arguments of task build |
29 | 25 |
|
30 | | - -C,--configuration <arg> Path to a JSON file specifying the build task parameter configuration for MUSIAL. |
| 26 | + -C,--configuration <arg> Path to a JSON file specifying the build task parameter configuration for MUSIAL. Visit the documentation for details. |
31 | 27 | ``` |
32 | 28 | </details> |
33 | 29 |
|
34 | 30 | <details> |
35 | | -<summary><code>expand</code> - Expand an existing storage file from variant call format (VCF) files.</summary> |
| 31 | +<summary><code>expand</code> - Expand an existing storage file from variant call files and/or metadata.</summary> |
36 | 32 |
|
37 | 33 | ``` |
38 | 34 | Command line arguments of task expand |
39 | 35 |
|
| 36 | + -d,--dry-run Only report on novel entries without writing the updated storage. |
40 | 37 | -I,--storage <arg> Path to a .json(.gz) file generated with the build task of MUSIAL. |
41 | 38 | -m,--vcfMeta <arg> Path to a .tsv or .csv file specifying sample annotations. |
42 | 39 | -o,--output <arg> Path to write the output file (default: overwrite input file). |
43 | | - -p,--preview Only report on novel entries without writing the updated storage. |
44 | | - -V,--vcfInput <arg> List of file or directory paths. All files must be in VCF format. |
| 40 | + -V,--vcfFiles <arg> List of file or directory paths. All files must be in VCF format. |
45 | 41 | ``` |
46 | 42 | </details> |
47 | 43 |
|
48 | 44 | <details> |
49 | | -<summary><code>view</code> - View the content (features, samples or variants) and their attributes, of a MUSIAL storage file.</summary> |
| 45 | +<summary><code>view</code> - View the content (features, samples or variants; and their attributes) of a MUSIAL storage file.</summary> |
50 | 46 |
|
51 | 47 | ``` |
52 | 48 | Command line arguments of task view |
53 | 49 |
|
54 | | - -C,--content <arg> One of sample, allele, call, variant, type, feature. |
55 | | - -f,--filter <arg> List of feature-, sample names, and/or positions for which the output is to be filtered (default: no filters). Entries may be |
56 | | - ignored depending on the content. |
| 50 | + -C,--content <arg> The content to view. One of FEATURES, SAMPLES, VARIANTS (case-insensitive). |
57 | 51 | -I,--storage <arg> Path to a .json(.gz) file generated with the build task of MUSIAL. |
58 | | - -o,--output <arg> Path to directory or file to write the output to (default: stdout). |
| 52 | + -o,--output <arg> Path to write the output file. If not provided, a default file will be created based on the input file (default). If `print` or |
| 53 | + `stdout` is specified, the output will be printed to the console. |
| 54 | + -q,--query <arg> One or multiple identifiers or genomic ranges (contig:start-end) to query. |
59 | 55 | ``` |
60 | 56 | </details> |
61 | 57 |
|
62 | 58 | <details> |
63 | | -<summary><code>sequence</code> - Export FASTA format sequences of features from a MUSIAL storage file.</summary> |
| 59 | +<summary><code>profile</code> - Profile samples with respect to variants, alleles, or proteoforms.</summary> |
64 | 60 |
|
65 | 61 | ``` |
66 | | -Command line arguments of task sequence |
| 62 | +Command line arguments of task profile |
67 | 63 |
|
68 | | - -c,--content <arg> One of `nt` or `aa` (default: `nt`). |
69 | | - -F,--features <arg> List of feature names to export data for. Non-coding features are skipped if `content` is `aa`. |
70 | | - -I,--input <arg> Path to a .json(.gz) file generated with the build task of MUSIAL. |
71 | | - -k,--conserved Export conserved sites. |
72 | | - -m,--merge Export sequences per allele or proteoform instead of per sample. |
73 | | - -o,--output <arg> Path to a directory to write the output files to (default: parent of input). |
74 | | - -r,--reference Include the reference sequence within the export. |
75 | | - -s,--samples <arg> List of sample names to restrict the sequence export to. |
76 | | - -x,--strip Strip all gap characters from the exported sequences. |
| 64 | + -C,--content <arg> The content to view. One of VARIANTS, ALLELES, PROTEOFORMS (case-insensitive). |
| 65 | + -I,--storage <arg> Path to a .json(.gz) file generated with the build task of MUSIAL. |
| 66 | + -o,--output <arg> Path to write the output file. If not provided, a default file will be created based on the input file (default). If `print` or |
| 67 | + `stdout` is specified, the output will be printed to the console. |
| 68 | + -q,--query <arg> One or multiple identifiers or genomic ranges (contig:start-end) to consider. |
| 69 | + -x,--reduced Represent entries in a reduced format, i.e., sequence types as numbers with 0 as the reference or synonymous sequence and |
| 70 | + variants without detailed call information. |
77 | 71 | ``` |
78 | 72 | </details> |
79 | 73 |
|
80 | | ---- |
| 74 | +<details> |
| 75 | +<summary><code>sequence</code> - Generate and write sequence data.</summary> |
81 | 76 |
|
82 | | -Further details on the use of the software and internal workflows can be found in the repository [Wiki](https://github.com/Integrative-Transcriptomics/MUSIAL/wiki). |
| 77 | +``` |
| 78 | +Command line arguments of task sequence |
| 79 | +
|
| 80 | + -a,--align Whether to align sequences (optional, default: false). |
| 81 | + -c,--content <arg> Whether to generate NUCLEOTIDE or AMINOACID sequences (optional, case-insensitive, default: NUCLEOTIDE). |
| 82 | + -f,--split <arg> Whether to split output files by FEATURE, SAMPLE, BOTH, or NONE (optional, case-insensitive, default: FEATURE). |
| 83 | + -I,--storage <arg> Path to a .json(.gz) file generated with the build task of MUSIAL. |
| 84 | + -l,--locations <arg> One or multiple feature identifiers or genomic ranges (contig:start-end) to generate sequence data of. If none are provided, |
| 85 | + all features or full contig ranges will be considered. |
| 86 | + -m,--merge Whether to merge identical sequences (optional, default: false). |
| 87 | + -o,--output <arg> Path to write the output. If not provided, the directory of the input storage is used. If a directory is provided, files are |
| 88 | + created there. If a file is provided, its parent directory is used. |
| 89 | + -s,--samples <arg> One or multiple sample identifiers to retrieve sequences for (optional). |
| 90 | + -v,--variable Whether to only consider variable positions (optional, default: false). |
| 91 | +``` |
| 92 | +</details> |
83 | 93 |
|
84 | | -## 🌐 Web Interface |
| 94 | +### 🌐 Web Interface |
85 | 95 |
|
86 | | -To provide user-friendly access to its functionalities, MUSIAL is available via a web interface at https://musial-tuevis.cs.uni-tuebingen.de/ currently running version `v2.3.10`. The code is deposited in the `web` branch. |
| 96 | +MUSIAL is also available via a web interface at https://musial-tuevis.cs.uni-tuebingen.de/, which currently runs version `v2.3.10`. |
87 | 97 |
|
88 | | -## 🔨 Build |
| 98 | +### Build |
89 | 99 |
|
90 | | -MUSIAL `v2.4` is built with `JDK 21.0.6` and `Gradle 8.2.1`. If you want to compile the source code, run `gradle clean build` in the root directory of the project. The JavaDoc of the software is available at [https://integrative-transcriptomics.github.io/MUSIAL/javadoc/](https://integrative-transcriptomics.github.io/MUSIAL/javadoc/). |
| 100 | +MUSIAL `v2.4` is built with `JDK 21.0.6` and `Gradle 9.1.0`. If you want to compile the source code, run `gradle clean build` in the root directory of the project. The JavaDoc of the software is available at [https://integrative-transcriptomics.github.io/MUSIAL/javadoc/](https://integrative-transcriptomics.github.io/MUSIAL/javadoc/). |
91 | 101 |
|
92 | | -## 🙋 Need Help? |
| 102 | +### Need Help? |
93 | 103 |
|
94 | | -- 🎓 Detailed information about the software can be found in the repository's [Wiki](https://github.com/Integrative-Transcriptomics/MUSIAL/wiki). |
95 | | -- 🐛 Found an issue or have a feature request? Feel free to [Open a GitHub issue](https://github.com/Integrative-Transcriptomics/MUSIAL/issues/new). |
| 104 | +- Detailed information about the software can be found in the repository's [Wiki](https://github.com/Integrative-Transcriptomics/MUSIAL/wiki). |
| 105 | +- Found an issue or have a feature request? Feel free to [open a GitHub issue](https://github.com/Integrative-Transcriptomics/MUSIAL/issues/new). |
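
The task-based workflow from the Usage section (the `build` task writes the storage file, and every other task reads it via `-I`) can be sketched as a small shell script. This is only an illustration under stated assumptions, not an official recipe: `build_config.json`, `storage.json` and the `musial` helper function are hypothetical placeholders, and `storage.json` stands for whatever storage file the `build` task produced. All flags are taken from the task help texts in the updated README. The helper prints each command instead of running it, so the script can be inspected safely.

```shell
#!/bin/sh
# Placeholder names; adjust them to your own setup.
JAR="MUSIAL-v2.4.2.jar"       # jar name as used in the README usage line
CONFIG="build_config.json"    # hypothetical build task configuration
STORAGE="storage.json"        # hypothetical storage file produced by build

# Hypothetical helper: prints the command line instead of executing it.
# Replace the printf line with `java -jar "$JAR" "$@"` to actually run MUSIAL.
musial() {
    printf '%s\n' "java -jar $JAR $*"
}

# 1. Build the storage file from the JSON configuration.
musial build -C "$CONFIG"

# 2. View the samples in the storage and print them to the console.
musial view -C SAMPLES -I "$STORAGE" -o stdout

# 3. Write nucleotide sequences, restricted to variable positions.
musial sequence -c NUCLEOTIDE -v -I "$STORAGE"
```

Keeping the jar and storage paths in variables means that only the task name and its flags change between steps, which mirrors how the storage file is shared by all tasks after `build`.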