Add Usage and G25 score calculation instructions
TheSergeyPixel committed Nov 14, 2022
1 parent 2766567 commit ba793cc
Showing 3 changed files with 45 additions and 2 deletions.
44 changes: 44 additions & 0 deletions README.md
@@ -5,6 +5,8 @@

## Table of Contents
1. [Installation](#Installation)
2. [Usage](#Usage)
3. [How to convert the result into G25 scores](#how-to-convert-the-result-into-g25-scores)

## Installation

@@ -14,3 +16,45 @@ Currently, the best way to install Diablo G25 is to git clone the repository and
git clone https://github.com/TheSergeyPixel/Diablo_G25
conda env create -f /path/to/cloned/repo/diablo_g25.yml
```
You can also git clone the repository and manually install the required packages as follows:

```
git clone https://github.com/TheSergeyPixel/Diablo_G25
conda install "pandas>=1.5.1"
conda install "numpy>=1.23.4"
pip install admix
```


We are currently working on a conda package.

## Usage

Diablo G25 requires a basic **gzipped** VCF file (for example, HaplotypeCaller + GenotypeGVCFs output) as input. The output
is always written as a TSV file. Run main.py from the cloned repository as follows:

```
python main.py -i /path/to/vcf/file.vcf.gz -o /desired/output/directory/output.tsv -m model_name
```
The ```-i``` and ```-o``` arguments are always required.<br/>
<br/>
A 23andMe-style TSV file with all genotypes is also written to the directory of the output file.<br/>
<br/>
For the ```-m``` option, enter the name of any model provided by [admix](https://github.com/stevenliuyi/admix#models).
If ```-m``` is not provided, the K36 model is used by default.
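
The default-model behaviour can be illustrated with a small, self-contained sketch. It mirrors the three `add_argument` calls from main.py as changed in this commit; the `build_parser` helper, the description and help strings, and the file names are assumptions made for illustration, and nothing here validates the model name against admix.

```python
import argparse


def build_parser():
    """Hypothetical helper that rebuilds the Diablo G25 command-line options."""
    parser = argparse.ArgumentParser(description="Sketch of the Diablo G25 CLI")
    parser.add_argument('-i', '--input', required=True, type=str,
                        help='Input as vcf.gz file')
    parser.add_argument('-o', '--output', required=True, type=str,
                        help='Output file path')
    # When -m is omitted, argparse falls back to the default, K36.
    parser.add_argument('-m', '--model', required=False, type=str,
                        default='K36', help='model to use for admixture')
    return parser


# Placeholder file names; no files are opened in this sketch.
args_default = build_parser().parse_args(['-i', 'sample.vcf.gz', '-o', 'out.tsv'])
args_custom = build_parser().parse_args(
    ['-i', 'sample.vcf.gz', '-o', 'out.tsv', '-m', 'K12b'])

print(args_default.model)  # K36
print(args_custom.model)   # K12b
```

Omitting `-i` or `-o` makes argparse exit with a usage error, which matches the statement that both arguments are always required.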

## How to convert the result into G25 scores

After you have obtained the scores for the model you chose, you can convert them into G25 scores by visiting the
[Allelocator calculator](https://allelocator.ovh/simulatedg25.html) and performing the following steps:

1. Paste your result from the Diablo G25 output into the **Calculator results** field.
2. Choose the model you used in the **Linear regression matrix** field.
3. The **Simulated G25 coordinates** field will automatically generate your simulated G25 scores.
4. You can then proceed to the [Vahaduo admixture calculator](https://vahaduo.github.io/vahaduo/) to estimate admixture
proportions and calculate Euclidean distances.

In the Vahaduo admixture calculator, paste your G25 coordinates into the **target** field and the G25 populations
(which can be downloaded from another [Vahaduo tool](https://vahaduo.github.io/g25download/)) into the **source**
field. Then choose the desired options and run the tool from the **single** tab if your target field contains one
sample (line), or from the **multi** tab if it contains several samples.
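
As a rough illustration of the Euclidean distance mentioned above, the sketch below ranks source populations by their distance to a target sample. It assumes each coordinate line has the form `label,value1,value2,...`; the labels and numbers are made up and truncated to three dimensions for brevity, and the helper names are hypothetical. It is not the Vahaduo implementation.

```python
import math


def parse_g25_line(line):
    """Split an assumed 'label,v1,v2,...' line into (label, list of floats)."""
    label, *values = line.strip().split(",")
    return label, [float(v) for v in values]


def euclidean_distance(a, b):
    """Straight-line distance between two coordinate vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors must have the same length")
    return math.dist(a, b)


# Made-up coordinates, for illustration only.
target_label, target = parse_g25_line("Sample_A,0.10,0.20,0.30")
sources = [
    parse_g25_line("Pop_Y,0.40,0.60,0.30"),
    parse_g25_line("Pop_X,0.10,0.20,0.30"),
]

# Closest population first.
ranked = sorted((euclidean_distance(target, coords), label)
                for label, coords in sources)

for distance, label in ranked:
    print(f"{label}\t{distance:.4f}")
```

A smaller distance means the source population lies closer to the target in coordinate space.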
2 changes: 1 addition & 1 deletion main.py
@@ -10,7 +10,7 @@

parser.add_argument('-i', '--input', required=True, help='Input as vcf.gz file', type=str)
parser.add_argument('-o', '--output', required=True, help='Output in txt file', type=str)
-parser.add_argument('-m', '--model', required=False, help='model to use in admix', type=str)
+parser.add_argument('-m', '--model', required=False, help='model to use for admixture', type=str, default='K36')

args = parser.parse_args()

1 change: 0 additions & 1 deletion src.py
@@ -9,4 +9,3 @@ def admix(in_file, model):
    ser = df.iloc[:, -1].str.split(": ", expand=True)[1]
    admix_scores = ",".join(a.strip('%') for a in ser)
    return admix_scores
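
To make the context lines of `admix` above easier to follow, here is a pandas-free sketch of the same transformation. It assumes each entry in the last column looks like `name: value%` with a single `": "` separator; the sample strings are made up, and `scores_to_csv` is a hypothetical helper, not part of the repository.

```python
def scores_to_csv(lines):
    """Turn 'name: 12.50%' strings into a comma-separated list of numbers."""
    values = []
    for line in lines:
        # Keep the part after ": ", as .str.split(": ", expand=True)[1] does
        # when each line contains exactly one separator.
        _, value = line.split(": ", 1)
        # Drop the percent sign, as a.strip('%') does in the original.
        values.append(value.strip('%'))
    return ",".join(values)


# Made-up sample lines, for illustration only.
sample = ["Pop_A: 12.50%", "Pop_B: 0.00%", "Pop_C: 87.50%"]
result = scores_to_csv(sample)
print(result)  # 12.50,0.00,87.50
```

The result is a single comma-separated string of scores with the population names and percent signs removed.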
