Odd behaviour of outliers #9

Open
EveTC opened this issue Nov 10, 2021 · 0 comments

EveTC commented Nov 10, 2021

Hello,

I was wondering if you could help me. After successfully running lfmm_ridge() and lfmm_test() on my dataset, I get two SNPs that are extremely strong outliers for multiple environmental variables.

I took a look at these SNPs in the VCF and found that both are fixed at 0/0 in all but two populations, where all individuals are 1/1. I then thought that perhaps these two sites had similar values for the environmental variables (where the SNPs are outliers); however, they are in fact contrasting.

I do not know the ins and outs of the method, but I can't understand how these SNPs would look like outliers if the environmental variables in the two populations are so contrasting?

SNMF population coancestry suggests K=6, which was then used in lfmm_ridge().

[Figures: cross-entropy plot; ancestry barplot for K=6]

The two sites I refer to above (fixed at 1/1 for the two outlier SNPs) are 30-CCP and 33-TAY, which cluster neatly as different populations. All other sites are 0/0. Below is a screenshot of some of the env variables for which these two SNPs show up as large outliers.

[Two screenshots]

The env variables (non-scaled) for these populations are below, along with the range of each variable in the whole dataset:

| Site | MinTemp | GDD5 | CMI | hedge density | Broadleaf | Conif |
|---|---|---|---|---|---|---|
| 30-CPP | 1.8 | 20640.0 | 0.53 | 4.481674e-01 | 55 | 3 |
| 33-TAY | 3.6 | 25015.5 | 0.68 | 1.610853e-13 | 60 | 0 |
| overall range of var | -1.2 - 3.6 | 11817.0 - 26203.5 | 0.25 - 0.72 | 0.0 - 1.197232 | 0 - 60 | 0 - 100 |
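
To make the comparison between the two sites easier to read, here is a minimal sketch (in Python, using only the numbers from the table above) that expresses each site's value as a position within the overall range of that variable, from 0 (dataset minimum) to 1 (dataset maximum). The helper name `range_position` and the dictionary layout are just illustrative choices, not part of lfmm or any other package.

```python
# Values copied from the table in this post; no new data is introduced.
# Each entry: variable -> (value at 30-CPP, value at 33-TAY, dataset min, dataset max)
env = {
    "MinTemp": (1.8, 3.6, -1.2, 3.6),
    "GDD5": (20640.0, 25015.5, 11817.0, 26203.5),
    "CMI": (0.53, 0.68, 0.25, 0.72),
    "hedge density": (4.481674e-01, 1.610853e-13, 0.0, 1.197232),
    "Broadleaf": (55, 60, 0, 60),
    "Conif": (3, 0, 0, 100),
}


def range_position(value, lo, hi):
    """Return where value sits in [lo, hi] as a fraction from 0 (min) to 1 (max)."""
    return (value - lo) / (hi - lo)


# Position of each site within the overall range, per variable.
positions = {
    name: (range_position(a, lo, hi), range_position(b, lo, hi))
    for name, (a, b, lo, hi) in env.items()
}

for name, (p_cpp, p_tay) in positions.items():
    print(f"{name:14s} 30-CPP: {p_cpp:.2f}   33-TAY: {p_tay:.2f}")
```

Values near 0 or 1 sit at the extremes of the dataset for that variable, and a large gap between the two columns means the two sites differ strongly for it.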

Please can you help me understand this behaviour? To me it would make sense for these SNPs to be outliers if the env variables were similar or the same, but they are not. It makes me concerned about the other outliers identified... or are these two just not behaving properly for some reason?

Thank you in advance
