How to deal with the "missing value error" on imputed genotype data #511
Thanks, Florian, for the quick turnaround! And I must apologize for not explaining clearly.
I am using the imputed genotype data from the GWAS tutorial: Imputation (available at this link). The data was generated using the following code: The snp_stats[, 1:5] output is as follows, identical to what is shown in the tutorial.
However, when this obj$geno_imputed data is passed to the big_SVD function in the subsequent code from the GWAS tutorial: Population structure (available here), it results in the error mentioned above.
I am very happy to say that following your suggestion to use snp_autoSVD() instead of big_SVD() for genotype data solved the problem, and the code now runs smoothly, as shown below. Thank you so much, Florian. Your suggestion has been incredibly helpful.
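For readers following along, here is a minimal sketch of what a snp_autoSVD() call can look like. It is not the code from this comment (which was not preserved); it assumes the small example dataset returned by snp_attachExtdata() rather than the penncath data, and the object names (ex, obj.svd) are illustrative only.

```r
library(bigsnpr)

# Assumption: use the small example bigSNP object shipped with bigsnpr,
# not the penncath data discussed in this thread.
ex <- snp_attachExtdata()
G <- ex$genotypes

# snp_autoSVD() needs the chromosome (and optionally the position) of each
# variant, because it prunes variants before computing the truncated SVD.
obj.svd <- snp_autoSVD(
  G,
  infos.chr = ex$map$chromosome,
  infos.pos = ex$map$physical.pos,
  k = 10,
  ncores = 1
)

# Principal component scores, one row per sample and one column per PC.
PC <- predict(obj.svd)
dim(PC)
```

Unlike a plain big_SVD() call, snp_autoSVD() selects a subset of variants before decomposing, which may be why it avoids the problem described here; the variants it kept are stored in attr(obj.svd, "subset").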
It's good that you found a workaround.
My 'bigstatsr' version is 1.5.12. PS: my 'bigsnpr' version is 1.12.2. I loaded the bigsnpr package, and the bigstatsr package was loaded automatically along with it. I then used the following two functions: snp_fastImputeSimple {bigsnpr} and snp_autoSVD {bigsnpr}.
What do you get if you run this reproducible code?

    zip <- runonce::download_file(
      "https://d1ypx1ckp5bo16.cloudfront.net/penncath/penncath.zip",
      dir = "tmp-data")
    unzip(zip, exdir = "tmp-data", overwrite = FALSE)
    library(bigsnpr)
    snp_readBed("tmp-data/data/penncath.bed")
    penncath <- snp_attach("tmp-data/data/penncath.rds")
    penncath$geno_imputed <- snp_fastImputeSimple(Gna = penncath$genotypes,
                                                  method = "mode",
                                                  ncores = nb_cores())
    big_SVD(penncath$geno_imputed, big_scale(), k = 10)

For me, it runs forever because there are some variables with no variation that prevent convergence (which now errors with v1.5.14).
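One way to sidestep variables with no variation is to drop them before calling big_SVD(). The following is only a sketch on a toy matrix, not on the penncath data: the simulated genotypes, the forced monomorphic column, and the names geno, keep and obj.svd are all assumptions made for illustration.

```r
library(bigstatsr)

set.seed(1)
# Toy genotype-like matrix: 50 samples x 6 variants coded 0/1/2.
geno <- matrix(sample(0:2, 50 * 6, replace = TRUE), nrow = 50)
geno[, 3] <- 2  # force one variant to have no variation

X <- as_FBM(geno)

# big_colstats() returns the sum and variance of every column.
stats <- big_colstats(X)
keep <- which(stats$var > 0)  # indices of columns that actually vary

# Restrict the SVD to the varying columns via ind.col, so that
# big_scale() never has to divide by a zero standard deviation.
obj.svd <- big_SVD(X, fun.scaling = big_scale(), ind.col = keep, k = 3)
dim(obj.svd$u)
```

With real genotype data, the same keep vector could be passed as ind.col in the call above, although snp_autoSVD() is the route that was suggested in this thread.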
I cannot reproduce the issue, and I have no idea what's going on :/ PS: You should try not to change the working directory; use RStudio projects and stick with the working directory of the project.
I am using the demonstration data "penncath" from the tutorial for the R package bigsnpr. I have verified that there are no missing values in the imputed obj$geno_imputed data, both by SNP and by sample. However, when I use the big_SVD function for principal component analysis, I still encounter the error "You can't have missing values in 'X'", as shown below. I would be very grateful if someone could help me identify where I might have made a mistake.
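For reference, a check for remaining missing values in a genotype matrix could look roughly like the sketch below. It is not the code used in this post (which was not preserved); it assumes the example dataset from snp_attachExtdata() instead of penncath, and the names counts and n_missing are illustrative.

```r
library(bigsnpr)

# Assumption: the small example bigSNP object shipped with bigsnpr
# stands in for the imputed penncath genotypes.
ex <- snp_attachExtdata()
G <- ex$genotypes

# big_counts() tabulates the decoded values of each column; for genotype
# data the rows correspond to 0, 1, 2 and NA, in that order.
counts <- big_counts(G)
n_missing <- sum(counts[4, ])  # total number of NA calls across all variants

# The same tabulation per sample instead of per variant.
counts_by_sample <- big_counts(G, byrow = TRUE)
n_missing
```

A value of zero for n_missing only shows that the stored genotypes are complete; it does not rule out problems that arise later, for example when a scaling function is applied to a column with no variation.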