PolyFun SusiE fails to find echor_mini #7

Open
AMCalejandro opened this issue Sep 20, 2022 · 7 comments
Labels: bug (Something isn't working)


AMCalejandro commented Sep 20, 2022

I just got this running.

Note that POLYFUN_SUSIE never gets to run because its conda environment is not found.
Also note that the software itself is installed.
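
A quick way to narrow this down is to check whether the R session can see a conda executable at all. The sketch below uses only base R; `find_conda` is a hypothetical helper name, not part of echolocatoR or PolyFun, and it only inspects the `PATH` of the current session. RStudio Server sessions do not always inherit the shell's `PATH`, so an empty result here would not necessarily mean that conda is missing from the machine.

```r
# Hypothetical helper: return the paths of any matching executables
# that are visible on the PATH of the current R session.
find_conda <- function(candidates = "conda") {
  paths <- Sys.which(candidates)   # "" for executables that are not found
  paths[nzchar(paths)]
}

found <- find_conda()
if (length(found) == 0) {
  message("No conda binary on PATH. Current PATH is:\n", Sys.getenv("PATH"))
} else {
  print(found)
}
```

If nothing is found even though conda is installed, prepending its `bin` directory to `PATH` with `Sys.setenv()` before calling `finemap_loci()` may be worth trying; whether that is enough for POLYFUN_SUSIE is untested here.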

Code

columnsnames <- echodata::construct_colmap(munged = FALSE,
                                           CHR = "CHR", POS = "POS",
                                           SNP = "SNP", P = "P",
                                           Effect = "BETA", StdErr = "SE",
                                           A1 = "A1", A2 = "A2", Freq = "FREQ",
                                           N = "N", MAF = "calculate")



# Pass the sample size as the "N" column.
# compute_n does everything described in the docs if N does not exist.



finemap_loci(# GENERAL ARGUMENTS 
                                          topSNPs = topSNPs,
                                          results_dir = fullRS_path,
                                          loci = topSNPs$Locus,
                                          dataset_name = "LID_COX",
                                          dataset_type = "GWAS",  
                                          force_new_subset = TRUE,
                                          force_new_LD = FALSE,
                                          force_new_finemap = TRUE,
                                          remove_tmps = FALSE,
                                          
                                          finemap_methods = c("ABF","FINEMAP","SUSIE", "POLYFUN_SUSIE"),
                                          
                                          # Munge full sumstats first
                                          munged = FALSE,
                                          colmap = columnsnames,
                                          # SUMMARY STATS ARGUMENTS
                                          fullSS_path = newSS_name_colmap,
                                          fullSS_genome_build = "hg19",
                                          query_by ="tabix",
                                          
                                          #compute_n = 3500,


                                          bp_distance = 10000,#500000*2,
                                          min_MAF = 0.001, 
                                          trim_gene_limits = FALSE,
                                          
                                          
                                          case_control = FALSE,
                                          
                                          
                                         
                                          # FINE-MAPPING ARGUMENTS
                                          ## General
                                          n_causal = 5,
                                          credset_thresh = .95,
                                          consensus_thresh = 2,
                                         

                                          # LD ARGUMENTS 
                                          LD_reference = "1KGphase3",#"UKB",
                                          superpopulation = "EUR",
                                          download_method = "axel",
                                          LD_genome_build = "hg19",
                                          leadSNP_LD_block = FALSE,
                                         
                                          #### Plotting args ####
                                          plot_types = c("simple"),
                                          show_plot = TRUE,
                                          zoom = "1x",
                                          tx_biotypes = NULL,
                                          nott_epigenome = FALSE,
                                          nott_show_placseq = FALSE,
                                          nott_binwidth = 200,
                                          nott_bigwig_dir = NULL,
                                          xgr_libnames = NULL,
                                          roadmap = FALSE,
                                          roadmap_query = NULL,
                                          
                                          #### General args ####
                                          seed = 2022,
                                          nThread = 20,
                                          verbose = TRUE
                                          )
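
The output below reports "Could not infer MAF" for every locus, and ABF is skipped because MAF is missing. One possible workaround, sketched here on made-up data, is to add a MAF column to the summary statistics before running the pipeline. The toy `sumstats` table is hypothetical; the only assumption about the real file is that `FREQ` holds an allele frequency between 0 and 1.

```r
# Hypothetical toy summary statistics; FREQ is an allele frequency.
sumstats <- data.frame(
  SNP  = c("rs1", "rs2", "rs3"),
  FREQ = c(0.12, 0.85, 0.50)
)

# The minor allele frequency is the frequency of the less common allele.
sumstats$MAF <- pmin(sumstats$FREQ, 1 - sumstats$FREQ)

print(sumstats)
```

The column map could then point at the new column (`MAF = "MAF"`) instead of `"calculate"`. Whether that makes the warning go away has not been verified.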

Output

PolyFun submodule already installed.
┌─────────────────────────────────────────────────┐
│                                                 │
│   )))> 🦇 RP11-240A16.1 [locus 1 / 3] 🦇 <(((   │
│                                                 │
└─────────────────────────────────────────────────┘

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 1 ▶▶▶ Query 🔎 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'SNP'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'SNP' .../QC_SNPs_COLMAP.txt; grep
    -v ^'SNP' .../QC_SNPs_COLMAP.txt | sort
    -k2,2n
    -k3,3n ) > .../file2fb2fcecd3b_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_LID/QC_SNPs_COLMAP.txt.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 4:32425284-32445284
Adding 'query' column to results.
Retrieved data with 76 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/RP11-240A16.1_LID_COX_subset.tsv.gz
+ Query: 76 SNPs x 10 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Could not infer MAF.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
Using existing 'N' column.
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 76 SNPs x 12 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/RP11-240A16.1_LID_COX_subset.tsv.gz

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/LD/RP11-240A16.1.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 3 ▶▶▶ Filter SNPs 🚰 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Post-filtered data: 76 x 12
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 76 SNPs.
+ dat = 76 SNPs.
+ 76 SNPs in common.
Converting obj to sparseMatrix.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 4 ▶▶▶ Fine-map 🔊 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
Preparing sample size column (N).
Using existing 'N' column.
Gathering method sources.
Gathering method citations.
Gathering method sources.
Gathering method citations.
ABF
🚫 Missing required column(s) for ABF [skipping]: MAF, proportion_cases
FINEMAP
✅ All required columns present.
⚠ Missing optional column(s) for FINEMAP: MAF
SUSIE
✅ All required columns present.
✅ All optional columns present.
POLYFUN_SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for POLYFUN_SUSIE: MAF
++ Fine-mapping using 3 tool(s): FINEMAP, SUSIE, POLYFUN_SUSIE

+++ Multi-finemap:: FINEMAP +++
Preparing sample size column (N).
Using existing 'N' column.
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 76 SNPs.
+ dat = 76 SNPs.
+ 76 SNPs in common.
Converting obj to sparseMatrix.
Constructing master file.
Optional MAF col missing. Replacing with all '.1's
Constructing data.z file.
Constructing data.ld file.
FINEMAP path: /home/rstudio/.cache/R/echofinemap/FINEMAP/finemap_v1.4.1_x86_64/finemap_v1.4.1_x86_64
Inferred FINEMAP version: 1.4.1
Running FINEMAP.
cd .../RP11-240A16.1 &&
    .../finemap_v1.4.1_x86_64
   
    --sss
   
    --in-files .../master
   
    --log
   
    --n-threads 20
   
    --n-causal-snps 5

|--------------------------------------|
| Welcome to FINEMAP v1.4.1            |
|                                      |
| (c) 2015-2022 University of Helsinki |
|                                      |
| Help :                               |
| - ./finemap --help                   |
| - www.finemap.me                     |
| - www.christianbenner.com            |
|                                      |
| Contact :                            |
| - [email protected]        |
| - [email protected]          |
|--------------------------------------|

--------
SETTINGS
--------
- dataset            : all
- corr-config        : 0.95
- n-causal-snps      : 5
- n-configs-top      : 50000
- n-conv-sss         : 100
- n-iter             : 100000
- n-threads          : 20
- prior-k0           : 0
- prior-std          : 0.05 
- prob-conv-sss-tol  : 0.001
- prob-cred-set      : 0.95

------------
FINE-MAPPING (1/1)
------------
- GWAS summary stats               : FINEMAP/data.z
- SNP correlations                 : FINEMAP/data.ld
- Causal SNP stats                 : FINEMAP/data.snp
- Causal configurations            : FINEMAP/data.config
- Credible sets                    : FINEMAP/data.cred
- Log file                         : FINEMAP/data.log_sss
- Reading input                    : done!   

- Updated prior SD of effect sizes : 0.05 0.0528 0.0558 0.0589 

- Number of GWAS samples           : 2687
- Number of SNPs                   : 76
- Prior-Pr(# of causal SNPs is k)  : 
  (0 -> 0)
   1 -> 0.584
   2 -> 0.292
   3 -> 0.096
   4 -> 0.0234
   5 -> 0.00449
- 1800 configurations evaluated (0.122/100%) : converged after 122 iterations
- Computing causal SNP statistics  : done!   
- Regional SNP heritability        : 0.0276 (SD: 0.00441 ; 95% CI: [0.0196,0.0371])
- Log10-BF of >= one causal SNP    : 24.4
- Post-expected # of causal SNPs   : 4.74
- Post-Pr(# of causal SNPs is k)   : 
  (0 -> 0)
   1 -> 9.4e-21
   2 -> 2.73e-11
   3 -> 1.41e-07
   4 -> 0.265
   5 -> 0.735
- Writing output                   : done!   
- Run time                         : 0 hours, 0 minutes, 0 seconds
2 data.cred* file(s) found in the same subfolder.
Selected file based on postPr_k: data.cred5
Importing conditional probabilities (.cred file).
No configurations were causal at PP>=0.95.
Importing marginal probabilities (.snp file).
Importing configuration probabilities (.config file).
FINEMAP was unable to identify any credible sets at PP>=0.95.
++ Credible Set SNPs identified = 0
++ Merging FINEMAP results with multi-finemap data.

+++ Multi-finemap:: SUSIE +++
Loading required namespace: Rfast
Failed with error:  'there is no package called 'Rfast''
Preparing sample size column (N).
Using existing 'N' column.
+ SUSIE:: sample_size=2,687
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 76 SNPs.
+ dat = 76 SNPs.
+ 76 SNPs in common.
Converting obj to sparseMatrix.
+ SUSIE:: Using `susie_rss()` from susieR v0.12.27
+ SUSIE:: Extracting Credible Sets.
++ Credible Set SNPs identified = 2
++ Merging SUSIE results with multi-finemap data.

+++ Multi-finemap:: POLYFUN_SUSIE +++
PolyFun submodule already installed.
PolyFun:: Fine-mapping with method=SUSIE
PolyFun:: Using priors from mode=precomputed
Unable to find conda binary. Is Anaconda installed?
Locus RP11-240A16.1 complete in: 0.33 min
┌─────────────────────────────────────────┐
│                                         │
│   )))> 🦇 XYLT1 [locus 2 / 3] 🦇 <(((   │
│                                         │
└─────────────────────────────────────────┘

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 1 ▶▶▶ Query 🔎 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'SNP'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'SNP' .../QC_SNPs_COLMAP.txt; grep
    -v ^'SNP' .../QC_SNPs_COLMAP.txt | sort
    -k2,2n
    -k3,3n ) > .../file2fb33669f7f_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_LID/QC_SNPs_COLMAP.txt.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 16:17034975-17054975
Adding 'query' column to results.
Retrieved data with 80 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/XYLT1_LID_COX_subset.tsv.gz
+ Query: 80 SNPs x 10 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Could not infer MAF.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
Using existing 'N' column.
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 80 SNPs x 12 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/XYLT1_LID_COX_subset.tsv.gz

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/LD/XYLT1.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 3 ▶▶▶ Filter SNPs 🚰 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Post-filtered data: 78 x 12
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 78 SNPs.
+ dat = 78 SNPs.
+ 78 SNPs in common.
Converting obj to sparseMatrix.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 4 ▶▶▶ Fine-map 🔊 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
Preparing sample size column (N).
Using existing 'N' column.
Gathering method sources.
Gathering method citations.
Gathering method sources.
Gathering method citations.
ABF
🚫 Missing required column(s) for ABF [skipping]: MAF, proportion_cases
FINEMAP
✅ All required columns present.
⚠ Missing optional column(s) for FINEMAP: MAF
SUSIE
✅ All required columns present.
✅ All optional columns present.
POLYFUN_SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for POLYFUN_SUSIE: MAF
++ Fine-mapping using 3 tool(s): FINEMAP, SUSIE, POLYFUN_SUSIE

+++ Multi-finemap:: FINEMAP +++
Preparing sample size column (N).
Using existing 'N' column.
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 78 SNPs.
+ dat = 78 SNPs.
+ 78 SNPs in common.
Converting obj to sparseMatrix.
Constructing master file.
Optional MAF col missing. Replacing with all '.1's
Constructing data.z file.
Constructing data.ld file.
FINEMAP path: /home/rstudio/.cache/R/echofinemap/FINEMAP/finemap_v1.4.1_x86_64/finemap_v1.4.1_x86_64
Inferred FINEMAP version: 1.4.1
Running FINEMAP.
cd .../XYLT1 &&
    .../finemap_v1.4.1_x86_64
   
    --sss
   
    --in-files .../master
   
    --log
   
    --n-threads 20
   
    --n-causal-snps 5

|--------------------------------------|
| Welcome to FINEMAP v1.4.1            |
|                                      |
| (c) 2015-2022 University of Helsinki |
|                                      |
| Help :                               |
| - ./finemap --help                   |
| - www.finemap.me                     |
| - www.christianbenner.com            |
|                                      |
| Contact :                            |
| - [email protected]        |
| - [email protected]          |
|--------------------------------------|

--------
SETTINGS
--------
- dataset            : all
- corr-config        : 0.95
- n-causal-snps      : 5
- n-configs-top      : 50000
- n-conv-sss         : 100
- n-iter             : 100000
- n-threads          : 20
- prior-k0           : 0
- prior-std          : 0.05 
- prob-conv-sss-tol  : 0.001
- prob-cred-set      : 0.95

------------
FINE-MAPPING (1/1)
------------
- GWAS summary stats               : FINEMAP/data.z
- SNP correlations                 : FINEMAP/data.ld
- Causal SNP stats                 : FINEMAP/data.snp
- Causal configurations            : FINEMAP/data.config
- Credible sets                    : FINEMAP/data.cred
- Log file                         : FINEMAP/data.log_sss
- Reading input                    : done!   

- Updated prior SD of effect sizes : 0.05 0.0522 0.0545 0.0568 

- Number of GWAS samples           : 2687
- Number of SNPs                   : 78
- Prior-Pr(# of causal SNPs is k)  : 
  (0 -> 0)
   1 -> 0.584
   2 -> 0.292
   3 -> 0.0961
   4 -> 0.0234
   5 -> 0.0045
- 1077 configurations evaluated (0.198/100%) : converged after 198 iterations
- Computing causal SNP statistics  : done!   
- Regional SNP heritability        : 0.0119 (SD: 0.00385 ; 95% CI: [0.00536,0.0204])
- Log10-BF of >= one causal SNP    : 4.46
- Post-expected # of causal SNPs   : 1.96
- Post-Pr(# of causal SNPs is k)   : 
  (0 -> 0)
   1 -> 0.245
   2 -> 0.548
   3 -> 0.204
   4 -> 0.00238
   5 -> 0
- Writing output                   : done!   
- Run time                         : 0 hours, 0 minutes, 0 seconds
3 data.cred* file(s) found in the same subfolder.
Selected file based on postPr_k: data.cred2
Importing conditional probabilities (.cred file).
No configurations were causal at PP>=0.95.
Importing marginal probabilities (.snp file).
Importing configuration probabilities (.config file).
FINEMAP was unable to identify any credible sets at PP>=0.95.
++ Credible Set SNPs identified = 0
++ Merging FINEMAP results with multi-finemap data.

+++ Multi-finemap:: SUSIE +++
Loading required namespace: Rfast
Failed with error:  'there is no package called 'Rfast''
In addition: Warning messages:
1: In SUSIE(dat = dat, dataset_type = dataset_type, LD_matrix = LD_matrix,  :
  Install Rfast to speed up susieR even further:
   install.packages('Rfast')
2: In susie_suff_stat(XtX = XtX, Xty = Xty, n = n, yty = (n - 1) *  :
  IBSS algorithm did not converge in 100 iterations!
                  Please check consistency between summary statistics and LD matrix.
                  See https://stephenslab.github.io/susieR/articles/susierss_diagnostic.html
Preparing sample size column (N).
Using existing 'N' column.
+ SUSIE:: sample_size=2,687
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 78 SNPs.
+ dat = 78 SNPs.
+ 78 SNPs in common.
Converting obj to sparseMatrix.
+ SUSIE:: Using `susie_rss()` from susieR v0.12.27
+ SUSIE:: Extracting Credible Sets.
++ Credible Set SNPs identified = 1
++ Merging SUSIE results with multi-finemap data.

+++ Multi-finemap:: POLYFUN_SUSIE +++
PolyFun submodule already installed.
PolyFun:: Fine-mapping with method=SUSIE
PolyFun:: Using priors from mode=precomputed
Unable to find conda binary. Is Anaconda installed?
Locus XYLT1 complete in: 0.32 min
┌────────────────────────────────────────┐
│                                        │
│   )))> 🦇 LRP8 [locus 3 / 3] 🦇 <(((   │
│                                        │
└────────────────────────────────────────┘

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 1 ▶▶▶ Query 🔎 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'SNP'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'SNP' .../QC_SNPs_COLMAP.txt; grep
    -v ^'SNP' .../QC_SNPs_COLMAP.txt | sort
    -k2,2n
    -k3,3n ) > .../file2fb4113b218_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_LID/QC_SNPs_COLMAP.txt.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 1:53768300-53788300
Adding 'query' column to results.
Retrieved data with 52 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/LRP8_LID_COX_subset.tsv.gz
+ Query: 52 SNPs x 10 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Could not infer MAF.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
Using existing 'N' column.
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 52 SNPs x 12 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/LRP8_LID_COX_subset.tsv.gz

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/LD/LRP8.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 3 ▶▶▶ Filter SNPs 🚰 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Post-filtered data: 51 x 12
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 51 SNPs.
+ dat = 51 SNPs.
+ 51 SNPs in common.
Converting obj to sparseMatrix.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 4 ▶▶▶ Fine-map 🔊 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
Preparing sample size column (N).
Using existing 'N' column.
Gathering method sources.
Gathering method citations.
Gathering method sources.
Gathering method citations.
ABF
🚫 Missing required column(s) for ABF [skipping]: MAF, proportion_cases
FINEMAP
✅ All required columns present.
⚠ Missing optional column(s) for FINEMAP: MAF
SUSIE
✅ All required columns present.
✅ All optional columns present.
POLYFUN_SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for POLYFUN_SUSIE: MAF
++ Fine-mapping using 3 tool(s): FINEMAP, SUSIE, POLYFUN_SUSIE

+++ Multi-finemap:: FINEMAP +++
Preparing sample size column (N).
Using existing 'N' column.
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 51 SNPs.
+ dat = 51 SNPs.
+ 51 SNPs in common.
Converting obj to sparseMatrix.
Constructing master file.
Optional MAF col missing. Replacing with all '.1's
Constructing data.z file.
Constructing data.ld file.
FINEMAP path: /home/rstudio/.cache/R/echofinemap/FINEMAP/finemap_v1.4.1_x86_64/finemap_v1.4.1_x86_64
Inferred FINEMAP version: 1.4.1
Running FINEMAP.
cd .../LRP8 &&
    .../finemap_v1.4.1_x86_64
   
    --sss
   
    --in-files .../master
   
    --log
   
    --n-threads 20
   
    --n-causal-snps 5

|--------------------------------------|
| Welcome to FINEMAP v1.4.1            |
|                                      |
| (c) 2015-2022 University of Helsinki |
|                                      |
| Help :                               |
| - ./finemap --help                   |
| - www.finemap.me                     |
| - www.christianbenner.com            |
|                                      |
| Contact :                            |
| - [email protected]        |
| - [email protected]          |
|--------------------------------------|

--------
SETTINGS
--------
- dataset            : all
- corr-config        : 0.95
- n-causal-snps      : 5
- n-configs-top      : 50000
- n-conv-sss         : 100
- n-iter             : 100000
- n-threads          : 20
- prior-k0           : 0
- prior-std          : 0.05 
- prob-conv-sss-tol  : 0.001
- prob-cred-set      : 0.95

------------
FINE-MAPPING (1/1)
------------
- GWAS summary stats               : FINEMAP/data.z
- SNP correlations                 : FINEMAP/data.ld
- Causal SNP stats                 : FINEMAP/data.snp
- Causal configurations            : FINEMAP/data.config
- Credible sets                    : FINEMAP/data.cred
- Log file                         : FINEMAP/data.log_sss
- Reading input                    : done!   

- Updated prior SD of effect sizes : 0.05 0.0517 0.0535 0.0554 

- Number of GWAS samples           : 2687
- Number of SNPs                   : 51
- Prior-Pr(# of causal SNPs is k)  : 
  (0 -> 0)
   1 -> 0.585
   2 -> 0.292
   3 -> 0.0955
   4 -> 0.0229
   5 -> 0.00431
- 1081 configurations evaluated (0.123/100%) : converged after 123 iterations
- Computing causal SNP statistics  : done!   
- Regional SNP heritability        : 0.0259 (SD: 0.00368 ; 95% CI: [0.0188,0.0334])
- Log10-BF of >= one causal SNP    : 24.9
- Post-expected # of causal SNPs   : 5
- Post-Pr(# of causal SNPs is k)   : 
  (0 -> 0)
   1 -> 5.84e-22
   2 -> 1.71e-17
   3 -> 1.74e-11
   4 -> 4.56e-06
   5 -> 1
- Writing output                   : done!   
- Run time                         : 0 hours, 0 minutes, 0 seconds
1 data.cred* file(s) found in the same subfolder.
Selected file based on postPr_k: data.cred5
Importing conditional probabilities (.cred file).
No configurations were causal at PP>=0.95.
Importing marginal probabilities (.snp file).
Importing configuration probabilities (.config file).
FINEMAP was unable to identify any credible sets at PP>=0.95.
++ Credible Set SNPs identified = 0
++ Merging FINEMAP results with multi-finemap data.

+++ Multi-finemap:: SUSIE +++
Loading required namespace: Rfast
Failed with error:  'there is no package called 'Rfast''
In addition: Warning message:
In SUSIE(dat = dat, dataset_type = dataset_type, LD_matrix = LD_matrix,  :
  Install Rfast to speed up susieR even further:
   install.packages('Rfast')
Preparing sample size column (N).
Using existing 'N' column.
+ SUSIE:: sample_size=2,687
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 51 SNPs.
+ dat = 51 SNPs.
+ 51 SNPs in common.
Converting obj to sparseMatrix.
+ SUSIE:: Using `susie_rss()` from susieR v0.12.27
+ SUSIE:: Extracting Credible Sets.
++ Credible Set SNPs identified = 3
++ Merging SUSIE results with multi-finemap data.

+++ Multi-finemap:: POLYFUN_SUSIE +++
PolyFun submodule already installed.
PolyFun:: Fine-mapping with method=SUSIE
PolyFun:: Using priors from mode=precomputed
Unable to find conda binary. Is Anaconda installed?Locus LRP8 complete in: 0.33 min

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 6 ▶▶▶ Postprocess data 🎁 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Returning results as nested list.
All loci done in: 0.97 min
$`RP11-240A16.1`
NULL

$XYLT1
NULL

$LRP8
NULL

$merged_dat
Null data.table (0 rows and 0 cols)

Warning message:
In SUSIE(dat = dat, dataset_type = dataset_type, LD_matrix = LD_matrix,  :
  Install Rfast to speed up susieR even further:
   install.packages('Rfast')

Session Info

> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SNPlocs.Hsapiens.dbSNP155.GRCh37_0.99.22 SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20 BSgenome_1.65.2                         
 [4] rtracklayer_1.57.0                       Biostrings_2.65.3                        XVector_0.37.1                          
 [7] GenomicRanges_1.49.1                     GenomeInfoDb_1.33.5                      IRanges_2.31.2                          
[10] S4Vectors_0.35.3                         BiocGenerics_0.43.1                      forcats_0.5.2                           
[13] stringr_1.4.1                            dplyr_1.0.10                             purrr_0.3.4                             
[16] readr_2.1.2                              tidyr_1.2.0                              tibble_3.1.8                            
[19] ggplot2_3.3.6                            tidyverse_1.3.2                          data.table_1.14.2                       
[22] echolocatoR_2.0.1                       

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                  reticulate_1.26             R.utils_2.12.0              tidyselect_1.1.2            RSQLite_2.2.16             
  [6] AnnotationDbi_1.59.1        htmlwidgets_1.5.4           grid_4.2.0                  BiocParallel_1.31.12        XGR_1.1.8                  
 [11] munsell_0.5.0               codetools_0.2-18            interp_1.1-3                DT_0.24                     withr_2.5.0                
 [16] colorspace_2.0-3            OrganismDbi_1.39.1          Biobase_2.57.1              filelock_1.0.2              knitr_1.40                 
 [21] supraHex_1.35.0             rstudioapi_0.14             DescTools_0.99.46           MatrixGenerics_1.9.1        GenomeInfoDbData_1.2.8     
 [26] mixsqp_0.3-43               bit64_4.0.5                 echoconda_0.99.7            basilisk_1.9.2              vctrs_0.4.1                
 [31] generics_0.1.3              xfun_0.32                   biovizBase_1.45.0           BiocFileCache_2.5.0         R6_2.5.1                   
 [36] AnnotationFilter_1.21.0     bitops_1.0-7                cachem_1.0.6                reshape_0.8.9               DelayedArray_0.23.1        
 [41] assertthat_0.2.1            BiocIO_1.7.1                scales_1.2.1                googlesheets4_1.0.1         nnet_7.3-17                
 [46] rootSolve_1.8.2.3           gtable_0.3.1                lmom_2.9                    ggbio_1.45.0                ensembldb_2.21.4           
 [51] rlang_1.0.5                 MungeSumstats_1.5.13        echodata_0.99.14            splines_4.2.0               lazyeval_0.2.2             
 [56] gargle_1.2.0                dichromat_2.0-0.1           hexbin_1.28.2               broom_1.0.1                 checkmate_2.1.0            
 [61] modelr_0.1.9                BiocManager_1.30.18         yaml_2.3.5                  reshape2_1.4.4              snpStats_1.47.1            
 [66] backports_1.4.1             GenomicFeatures_1.49.6      ggnetwork_0.5.10            Hmisc_4.7-1                 RBGL_1.73.0                
 [71] tools_4.2.0                 echoplot_0.99.5             ellipsis_0.3.2              catalogueR_1.0.0            RColorBrewer_1.1-3         
 [76] proxy_0.4-27                coloc_5.1.0                 Rcpp_1.0.9                  plyr_1.8.7                  base64enc_0.1-3            
 [81] progress_1.2.2              zlibbioc_1.43.0             RCurl_1.98-1.8              basilisk.utils_1.9.2        prettyunits_1.1.1          
 [86] rpart_4.1.16                deldir_1.0-6                viridis_0.6.2               haven_2.5.1                 cluster_2.1.3              
 [91] SummarizedExperiment_1.27.2 ggrepel_0.9.1               fs_1.5.2                    crul_1.2.0                  magrittr_2.0.3             
 [96] echotabix_0.99.8            dnet_1.1.7                  openxlsx_4.2.5              reprex_2.0.2                googledrive_2.0.0          
[101] mvtnorm_1.1-3               ProtGenerics_1.29.0         matrixStats_0.62.0          hms_1.1.2                   patchwork_1.1.2            
[106] XML_3.99-0.10               jpeg_0.1-9                  readxl_1.4.1                gridExtra_2.3               compiler_4.2.0             
[111] biomaRt_2.53.2              crayon_1.5.1                R.oo_1.25.0                 htmltools_0.5.3             echoannot_0.99.7           
[116] tzdb_0.3.0                  Formula_1.2-4               expm_0.999-6                Exact_3.1                   lubridate_1.8.0            
[121] DBI_1.1.3                   dbplyr_2.2.1                MASS_7.3-58.1               rappdirs_0.3.3              boot_1.3-28                
[126] Matrix_1.4-1                piggyback_0.1.3             cli_3.3.0                   R.methodsS3_1.8.2           echofinemap_0.99.3         
[131] parallel_4.2.0              igraph_1.3.4                pkgconfig_2.0.3             GenomicAlignments_1.33.1    dir.expiry_1.5.0           
[136] RCircos_1.2.2               foreign_0.8-82              osfr_0.2.8                  xml2_1.3.3                  rvest_1.0.3                
[141] echoLD_0.99.7               VariantAnnotation_1.43.3    digest_0.6.29               graph_1.75.0                httpcode_0.3.0             
[146] cellranger_1.1.0            htmlTable_2.4.1             gld_2.6.5                   restfulr_0.0.15             curl_4.3.2                 
[151] Rsamtools_2.13.4            rjson_0.2.21                lifecycle_1.0.1             nlme_3.1-159                jsonlite_1.8.0             
[156] viridisLite_0.4.1           fansi_1.0.3                 downloadR_0.99.4            pillar_1.8.1                susieR_0.12.27             
[161] lattice_0.20-45             GGally_2.1.2                googleAuthR_2.0.0           KEGGREST_1.37.3             fastmap_1.1.0              
[166] httr_1.4.4                  survival_3.3-1              glue_1.6.2                  zip_2.2.0                   png_0.1-7                  
[171] bit_4.0.4                   Rgraphviz_2.41.1            class_7.3-20                stringi_1.7.8               blob_1.2.3                 
[176] latticeExtra_0.6-30         memoise_2.0.1               irlba_2.3.5                 e1071_1.7-11                ape_5.6-2     

@AMCalejandro added the bug (Something isn't working) label on Sep 20, 2022
@bschilder self-assigned this on Sep 20, 2022
@bschilder
Member

It looks like this is happening because echoconda is unable to find your conda binary. It uses this to find a valid python executable that it can use to run various PolyFun functions. So I think this is actually the issue, rather than PolyFun itself not being installed.

Are you on an HPC or your local computer? Could you provide your full conda env path?

As a side note, PolyFun itself is not currently available as a package, only as a collection of Python scripts distributed by cloning the polyfun GitHub repo. I've tried to automate cloning the polyfun repo with echofinemap::POLYFUN_install() whenever users call functions that require PolyFun.

@AMCalejandro
Author

This was within the MAGMA.Celltyping Docker container, and I am not specifying the conda env, so it should be using echoR_mini.

Re echofinemap::POLYFUN_install()
Yes, I got it installed within the docker container

@bschilder transferred this issue from RajLabMSSM/echolocatoR on Sep 20, 2022
@bschilder
Member

This was within the MAGMA.Celltyping Docker container, and I am not specifying the conda env, so it should be using echoR_mini.

Well it's trying to use conda to create "echoR_mini", but it can't find the conda binary to do so. Thus the error message:

Unable to find conda binary. Is Anaconda installed? 

Will try to replicate this in a new container

@AMCalejandro
Author

AMCalejandro commented Sep 21, 2022

Also, I just noticed that conda is not even installed in the MAGMA.Celltyping Docker container.
I am not sure whether this is expected and is all managed by echoconda.
With a container, don't we want to replace the creation of a conda env?
I understand this will change when echolocatoR has its own container.

@bschilder
Member

Also, I just noticed that conda is not even installed in the MAGMA.Celltyping Docker container.
I am not sure whether this is expected and is all managed by echoconda.
With a container, don't we want to replace the creation of a conda env?
I understand this will change when echolocatoR has its own container.

Yes, that's expected. basilisk or reticulate are used to automatically install miniconda if conda is not already available.

@bschilder
Member

In case this is useful https://cran.r-project.org/web/packages/reticulate/vignettes/versions.html

Yup, this is reticulate, which is one of the main packages that echoconda uses under the hood. But sometimes reticulate can't find the conda binary. Will assess whether that's the source of the issue in this instance.
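
For anyone hitting the same message, here is a minimal sketch of how to check from R whether reticulate can see a conda binary at all. It assumes reticulate is installed. `find_conda()` is a hypothetical helper written for illustration and is not an echoconda function; `reticulate::conda_binary()` and `reticulate::install_miniconda()` are existing reticulate functions.

```r
# Hypothetical helper (not part of echoconda): return the path of the conda
# binary that reticulate can locate, or NA if none can be found.
find_conda <- function() {
  # reticulate may not be installed; treat that as "not found".
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(NA_character_)
  }
  # conda_binary() signals an error when no conda binary can be located,
  # so convert that error into NA instead of stopping.
  path <- tryCatch(
    reticulate::conda_binary(conda = "auto"),
    error = function(e) NA_character_
  )
  if (is.null(path) || length(path) == 0) {
    NA_character_
  } else {
    as.character(path[[1]])
  }
}

conda <- find_conda()
if (is.na(conda)) {
  message("No conda binary found. reticulate::install_miniconda() can install one.")
} else {
  message("conda binary: ", conda)
}
```

If this reports no binary inside the container, installing Miniconda once with `reticulate::install_miniconda()` and re-running the check is one thing to try. Whether echoconda then picks up that installation is not confirmed in this thread.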

@bschilder moved this from Todo to In Progress in 🦇🦇 echoverse 🦇🦇 on Oct 30, 2022