Question Regarding Convergence of Approximation and Original Method for Correlation Computation #4

qianzach · 2024-03-28T14:05:17Z

I hope this finds you well. I had a quick question regarding the estimations of the latent correlation matrices.

Description

I would like to get an accurate estimate of the latent correlation matrix. The code runs without errors, but with both the original and approx methods I get a maximum-iteration warning, which is likely because my data is sparse. I would like to know whether this compromises the accuracy of the estimate.

I would like to know which estimate is better, and if there is a way to improve these estimates because some of the results are quite different. I've tried to adjust the tol parameter as well, but the results stay relatively similar for the original. I have also experimented with different shrinkage values and lower boundary values.

latentcor version: 0.2.5
Python version: Python 3.11.7
Operating System: macOS Ventura

What I Did

Consider the n x k matrix mat, where n > k. We want the latent correlations between the columns. As such, the tps argument is simply an array of "tru" entries (the data are all single-cell gene expression values, so we assume a truncated Gaussian copula).

latentcor(mat, tps=tps_arr, tol=1e-17, method='original', use_nearPD=True)['R']  # using the original method

latentcor(mat, tps=tps_arr, method='approx', nu=0.01, ratio=0.9, use_nearPD=True)['R']  # using approx

The end result is that the approximation gives correlations of larger magnitude (about 0.1 further from zero, in either direction). However, since both runs stop at the maximum-iteration limit, I'm not sure which estimate is better.

After looking at some of the base code, I see it may be relevant to nearest_corr(), but the n_fact parameter you have set is already so high, so I am a bit confused by the difference in results. Thank you!
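One way to put a number on how far the two estimates diverge is to compare their off-diagonal entries directly. The following is a minimal sketch, assuming the two results are square arrays of the same shape, as returned under ['R'] by the calls above. The helper name compare_estimates and the toy matrices are illustrative only and are not part of latentcor.

```python
import numpy as np

def compare_estimates(R_a, R_b):
    """Summarize off-diagonal differences between two correlation estimates."""
    R_a = np.asarray(R_a, dtype=float)
    R_b = np.asarray(R_b, dtype=float)
    if R_a.ndim != 2 or R_a.shape[0] != R_a.shape[1] or R_a.shape != R_b.shape:
        raise ValueError("expected two square matrices of the same shape")
    # Use only the strict upper triangle: the diagonal is always 1 and the
    # matrices are symmetric, so the lower triangle carries no extra information.
    iu = np.triu_indices_from(R_a, k=1)
    diff = R_b[iu] - R_a[iu]
    return {
        "max_abs_diff": float(np.max(np.abs(diff))),
        "mean_abs_diff": float(np.mean(np.abs(diff))),
        # Share of pairs where the second estimate is further from zero.
        "frac_larger_magnitude": float(np.mean(np.abs(R_b[iu]) > np.abs(R_a[iu]))),
    }

# Toy stand-ins for the 'original' and 'approx' results (made-up numbers).
R_orig = np.array([[1.0, 0.30, -0.20],
                   [0.30, 1.0, 0.10],
                   [-0.20, 0.10, 1.0]])
R_approx = np.array([[1.0, 0.40, -0.30],
                     [0.40, 1.0, 0.15],
                     [-0.30, 0.15, 1.0]])

summary = compare_estimates(R_orig, R_approx)
print(summary)
```

Running the same comparison with use_nearPD=True and use_nearPD=False for each method would help show whether the gap comes from the estimation step or from the positive-definite adjustment.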


mingzehuang · 2024-04-06T02:52:38Z

Hi, @qianzach; sorry for the late reply! Generally speaking, 'original' is more accurate than 'approx'. However, I'm investigating the convergence problem you mentioned. I'll get back to you ASAP!

qianzach · 2024-04-06T04:26:51Z

No worries! I see. Thank you so much! Just to provide an additional detail: the issue occurs in statsmodels.stats.correlation_tools.corr_nearest. I reach the maximum number of iterations there (likely due to very sparse data), and this may be the source of the differences between the approx and original latent correlation estimates.

Thanks!

mingzehuang · 2024-04-14T18:54:50Z

Hi, @qianzach. I think we use corr_nearest() only to further adjust the output so that the output matrix is guaranteed to be positive definite. If it doesn't converge properly, you can turn it off by setting use_nearPD=False. You then get the semi-definite output matrix, which is the original result from our algorithm and which you may adjust yourself :)
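For anyone adjusting the matrix themselves, here is a minimal sketch of one possible approach. It checks the smallest eigenvalue of the unadjusted matrix and, if needed, applies a single eigenvalue-clipping step followed by rescaling to a unit diagonal. This is a swapped-in technique for illustration, not what latentcor or corr_nearest() does internally. The names min_eigenvalue and clip_to_pd, the threshold value, and the toy matrix R_raw (standing in for the ['R'] output with use_nearPD=False) are all assumptions.

```python
import numpy as np

def min_eigenvalue(R):
    """Smallest eigenvalue of the symmetrized matrix; > 0 means positive definite."""
    R = np.asarray(R, dtype=float)
    return float(np.linalg.eigvalsh((R + R.T) / 2).min())

def clip_to_pd(R, threshold=1e-6):
    """One-step eigenvalue clipping, then rescaling back to a unit diagonal."""
    R = np.asarray(R, dtype=float)
    sym = (R + R.T) / 2
    vals, vecs = np.linalg.eigh(sym)
    # Raise any eigenvalue below the threshold up to the threshold.
    clipped = np.clip(vals, threshold, None)
    A = (vecs * clipped) @ vecs.T
    # Rescale so the diagonal is 1 again; this congruence keeps A positive definite.
    d = np.sqrt(np.diag(A))
    C = A / np.outer(d, d)
    C = (C + C.T) / 2
    np.fill_diagonal(C, 1.0)
    return C

# Toy indefinite "correlation" matrix (made-up numbers, eigenvalues 1.9, 1.9, -0.8).
R_raw = np.array([[1.0, 0.9, -0.9],
                  [0.9, 1.0, 0.9],
                  [-0.9, 0.9, 1.0]])

print("min eigenvalue before:", min_eigenvalue(R_raw))
R_fixed = clip_to_pd(R_raw)
print("min eigenvalue after: ", min_eigenvalue(R_fixed))
```

If the smallest eigenvalue of the unadjusted matrix is already comfortably positive, no adjustment is needed and the two methods can be compared on their raw outputs. Clipping changes the off-diagonal entries, so it is worth checking how far the adjusted matrix moves from the raw one.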
