migadmi

Migration and admixtures (migadmi): a method to estimate parameters in a complex admixture graph with multiple and nested admixture events, where each event can have any number of source populations (1, 2, 3, ...).

For admixture schemes of high complexity, migadmi introduces admixture parameters and symbolically derives the covariance matrix between populations. It then optimizes the parameters with the SLSQP (Sequential Least Squares Programming) method. Details of the method are described in the paper Historical routes for diversification of domesticated chickpea inferred from landrace genomics.

Input data

1. A binary tree of relationships between base source populations

The relationships between some of the populations must be provided as a tree. These base populations can then be used as sources, and each new mixture population created by an admixture event can in turn serve as a source. We suggest providing the tree in Newick (.nwk) format:

((LEB_d,TUR_d),(IND_d,UZB_d));

To read the tree, use the following code:

from ete3 import Tree
file_tree = 'data/tree_init.nwk'
tree = Tree(file_tree)

2. Allele frequencies in populations or the distance matrix

A table with populations in columns and SNPs in rows. Allele frequency values are numbers in [0,1]. These values are used to calculate the distance matrix between populations using compositional data analysis (CoDA). To avoid CoDA, you can instead provide a pre-calculated distance matrix between populations.
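As an illustration of what a CoDA-style distance can look like, here is a minimal sketch that treats each SNP as a two-part composition (p, 1 - p), applies the centred log-ratio (clr) transform, and takes the Euclidean distance between populations. The function names, the clipping constant `eps`, and the toy frequencies are all assumptions made for this example; the exact transform and normalisation used inside migadmi may differ.

```python
import math

def clr_two_part(p, eps=1e-6):
    """Centred log-ratio of the two-part composition (p, 1 - p)."""
    # Clip exact 0 and 1 so the logarithms stay finite (eps is an assumed value).
    p = min(max(p, eps), 1.0 - eps)
    log_parts = [math.log(p), math.log(1.0 - p)]
    centre = sum(log_parts) / 2.0
    return [x - centre for x in log_parts]

def aitchison_distance(freqs_a, freqs_b):
    """Euclidean distance between the clr-transformed frequencies of two populations."""
    total = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        for xa, xb in zip(clr_two_part(pa), clr_two_part(pb)):
            total += (xa - xb) ** 2
    return math.sqrt(total)

def distance_matrix(freq_table):
    """Return population names and the matrix of pairwise distances."""
    names = list(freq_table)
    matrix = [[aitchison_distance(freq_table[a], freq_table[b]) for b in names]
              for a in names]
    return names, matrix

# Toy allele frequencies, invented for illustration (one value per SNP).
freq_table = {
    "LEB_d": [0.10, 0.50, 0.90],
    "TUR_d": [0.12, 0.45, 0.88],
    "IND_d": [0.60, 0.20, 0.30],
}

names, d = distance_matrix(freq_table)
for name, row in zip(names, d):
    print(name, [round(x, 3) for x in row])
```

The resulting matrix is symmetric with a zero diagonal, which is the shape expected of a pre-calculated distance matrix.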

3. Admixture scheme

We suggest providing the scheme in a file and reading it with the read_admixture function. The admixture scheme should be organized as follows: each row is an admixture event, starting with the name of the mixture population, followed by a colon and a comma-separated list of its sources:

ETHI_d: LEB_d, TUR_d, IND_d
MOR_d: LEB_d, TUR_d
TUR_k: TUR_d
UZB_k: UZB_d, TUR_k
MOR_k: MOR_d, TUR_k
LEB_k: LEB_d, TUR_k
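To make the file format concrete, the sketch below parses rows of this form and checks that every source is already known when it is used. `parse_admixture_text` and `check_sources` are hypothetical helpers written for this example; they are not the package's read_admixture function, and the requirement that sources appear before they are used is an assumption drawn from the example scheme.

```python
def parse_admixture_text(text):
    """Parse 'MIX: SRC1, SRC2' rows into a list of (mixture, [sources]) pairs."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        mixture, _, sources = line.partition(":")
        source_names = [s.strip() for s in sources.split(",") if s.strip()]
        events.append((mixture.strip(), source_names))
    return events

def check_sources(events, base_populations):
    """Raise ValueError if a source is neither a base population nor an earlier mixture."""
    known = set(base_populations)
    for mixture, sources in events:
        missing = [s for s in sources if s not in known]
        if missing:
            raise ValueError(f"{mixture}: unknown sources {missing}")
        known.add(mixture)  # a mixture can serve as a source for later events

scheme = """
ETHI_d: LEB_d, TUR_d, IND_d
MOR_d: LEB_d, TUR_d
TUR_k: TUR_d
UZB_k: UZB_d, TUR_k
MOR_k: MOR_d, TUR_k
LEB_k: LEB_d, TUR_k
"""

events = parse_admixture_text(scheme)
# Leaves of the example tree ((LEB_d,TUR_d),(IND_d,UZB_d));
check_sources(events, ["LEB_d", "TUR_d", "IND_d", "UZB_d"])
for mixture, sources in events:
    print(mixture, "<-", sources)
```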

Each mixture can have any number of source populations (1, 2, 3, 4, ...). A single source means the mixture is simply a new branch. In principle the number is not limited; however, the problem can become weakly identifiable as the number of sources for an admixture event grows. To cope with this identifiability problem, we regularise the admixture weights with a Dirichlet prior with concentration parameter alpha; alpha=1 corresponds to no regularisation.
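The role of alpha can be checked numerically with a small sketch of the symmetric Dirichlet log density. The function name and the example weight vectors are assumptions for illustration; how migadmi combines this prior with its objective is not described here.

```python
import math

def dirichlet_log_density(weights, alpha):
    """Log density of a symmetric Dirichlet(alpha, ..., alpha) at `weights`.

    Weights must be strictly positive and sum to 1.
    """
    k = len(weights)
    log_norm = math.lgamma(k * alpha) - k * math.lgamma(alpha)
    return log_norm + (alpha - 1.0) * sum(math.log(w) for w in weights)

balanced = [1 / 3, 1 / 3, 1 / 3]
skewed = [0.90, 0.05, 0.05]

# With alpha = 1 the prior is flat: every weight vector gets the same density.
print(dirichlet_log_density(balanced, 1.0), dirichlet_log_density(skewed, 1.0))

# With alpha > 1 the prior favours balanced weights over skewed ones.
print(dirichlet_log_density(balanced, 2.0), dirichlet_log_density(skewed, 2.0))
```

Because the density is constant when alpha=1, the prior then contributes nothing to the choice of weights, which is why that value corresponds to no regularisation.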

There is an option for step-by-step optimisation of admixture events:

admixture_steps = [[0, 1], [2, 3], [4, 5]]

In this case the admixture parameters will be optimised in three steps.
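The following sketch shows which events would fall into each step, assuming that the numbers in admixture_steps are zero-based row indices into the admixture scheme. That indexing convention is an assumption based on the six-row example scheme, not something stated explicitly.

```python
# Mixture populations in the order of the rows of the example scheme.
event_names = ["ETHI_d", "MOR_d", "TUR_k", "UZB_k", "MOR_k", "LEB_k"]

admixture_steps = [[0, 1], [2, 3], [4, 5]]

# Translate each step from row indices to mixture names (assumed zero-based).
steps_by_name = [[event_names[i] for i in step] for step in admixture_steps]

for step_no, names in enumerate(steps_by_name, start=1):
    print(f"step {step_no}: {', '.join(names)}")
```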

Run migadmi

variables, variance_decomposition, weight_sets = migadmi(tree=tree,
              admixtures=admixtures,
              admixture_steps=admixture_steps,
              dist_matrix=d_mx, alpha=1)

To see a working example, run example.py.

Output data

1. Optimised parameters

A dictionary of parameter names and values. There are three types of parameters: branch lengths, admixture weights, and parts of common variance with sources (the alpha parameter in the Figure). Names of these three types start with the letters "t", "w" and "a", respectively. For example:

{'t0': 0.0,
 't1': 0.0,
 't2': 0.729128663694431,
 't3': 0.1202425699727737,
 't4': 0.2139704726613705,
 't5': 0.18213312504471313,
 't6': 0.48002745557810955,
 'a7': 0.23128479141352762,
 'w8': 0.9365519173251352,
 'w9': 0.0,
 'w10': 0.06344808267486482}
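Since the parameter type is encoded in the first letter of each name, the dictionary can be split by prefix. This is a small sketch using the example values above; the group labels are just descriptive names chosen for this example.

```python
variables = {
    't0': 0.0,
    't1': 0.0,
    't2': 0.729128663694431,
    't3': 0.1202425699727737,
    't4': 0.2139704726613705,
    't5': 0.18213312504471313,
    't6': 0.48002745557810955,
    'a7': 0.23128479141352762,
    'w8': 0.9365519173251352,
    'w9': 0.0,
    'w10': 0.06344808267486482,
}

# Map the first letter of a parameter name to a descriptive group label.
labels = {
    "t": "branch lengths",
    "w": "admixture weights",
    "a": "common variance parts",
}

grouped = {label: {} for label in labels.values()}
for name, value in variables.items():
    grouped[labels[name[0]]][name] = value

for label, params in grouped.items():
    print(label, params)
```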

2. Decomposition of variance for mixture populations

For each mixture population, migadmi estimates the proportion of variance explained by each source and the proportion of its own variance.

[['ETHI_d', 'LEB_d', 0.0665210082552726],
 ['ETHI_d', 'TUR_d', 0],
 ['ETHI_d', 'IND_d', 8.95945904316835e-5],
 ['ETHI_d', 'ETHI_d', 0.933389397154296]]  # own variance
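A quick way to read this output is to group the rows by mixture population and check how the proportions add up. The sketch below uses the example rows above; in this example the four proportions sum to approximately 1, with most of the variance being the population's own.

```python
from collections import defaultdict

variance_decomposition = [
    ['ETHI_d', 'LEB_d', 0.0665210082552726],
    ['ETHI_d', 'TUR_d', 0],
    ['ETHI_d', 'IND_d', 8.95945904316835e-5],
    ['ETHI_d', 'ETHI_d', 0.933389397154296],  # own variance
]

# Group proportions by mixture population: {mixture: {source: proportion}}.
shares = defaultdict(dict)
for mixture, source, proportion in variance_decomposition:
    shares[mixture][source] = float(proportion)

for mixture, by_source in shares.items():
    own = by_source[mixture]  # the row where source == mixture
    total = sum(by_source.values())
    print(f"{mixture}: own variance {own:.4f}, total {total:.6f}")
```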

3. Names of weight parameters for admixtures

Names of the weight parameters corresponding to each admixture event.

[[w8, w9, w10], [w13, w14], [], [w19, w20], [w23, w24], [w27, w28]]
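The weight names can be paired with the sources of each event and looked up in the dictionary of optimised parameters. The sketch below assumes that the weight sets follow the row order of the admixture scheme and that, within an event, the weights follow the order of the sources; the empty third set lining up with the single-source event TUR_k is consistent with this, but it remains an assumption. Names are written as strings here, and only the three weight values shown in this README are included, so other weights are reported as None.

```python
weight_sets = [["w8", "w9", "w10"], ["w13", "w14"], [],
               ["w19", "w20"], ["w23", "w24"], ["w27", "w28"]]

# Example admixture scheme as (mixture, sources) pairs.
scheme = [
    ("ETHI_d", ["LEB_d", "TUR_d", "IND_d"]),
    ("MOR_d", ["LEB_d", "TUR_d"]),
    ("TUR_k", ["TUR_d"]),
    ("UZB_k", ["UZB_d", "TUR_k"]),
    ("MOR_k", ["MOR_d", "TUR_k"]),
    ("LEB_k", ["LEB_d", "TUR_k"]),
]

# Only the weight values that appear in the example output above.
variables = {"w8": 0.9365519173251352, "w9": 0.0, "w10": 0.06344808267486482}

weights_by_event = {}
for (mixture, sources), names in zip(scheme, weight_sets):
    # Assumed: the i-th weight of an event belongs to its i-th source.
    weights_by_event[mixture] = {
        source: variables.get(str(name)) for source, name in zip(sources, names)
    }

for mixture, weights in weights_by_event.items():
    print(mixture, weights)
```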

Getting Started

Clone this repository to your computer.

Requirements

To run migadmi, you need Python 3.4 or later. The Python packages that migadmi depends on are listed in requirements.txt.

Authors

Anna Igolkina developed the migadmi package, e-mail.

Citation

This method is described in the paper Historical routes for diversification of domesticated chickpea inferred from landrace genomics.

License information

The migadmi package is open-sourced software licensed under the MIT license.
