Scaling before updating #220

Open
dafeda opened this issue Sep 2, 2024 · 0 comments

dafeda (Collaborator) commented Sep 2, 2024

Problem Formulation

The update step requires inverting the autocovariance matrix of the responses. If the responses are on vastly different scales, this matrix can become ill-conditioned, which triggers the warning LinAlgWarning: Ill-conditioned matrix. This has been reported in several real FMU use cases.

To build some intuition, here's a draft of a test that uses synthetic data to force this warning:

Link to the test

In this test, there are only two responses, with one response on a very different scale than the other.
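The effect can be reproduced in a few lines. The following is a minimal sketch, not the linked test: the ensemble size, the two scales (1 and 1e6), and all variable names are illustrative assumptions. It compares the condition number of the response covariance before and after standardizing each response.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ens = 100  # assumed ensemble size, for illustration only

# Two synthetic responses: one of order 1, one of order 1e6 (illustrative scales).
Y = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=n_ens),
    rng.normal(loc=0.0, scale=1e6, size=n_ens),
])

# 2x2 covariance of the responses across the ensemble.
C_YY = np.cov(Y)
cond_raw = np.linalg.cond(C_YY)

# Standardize each response (zero mean, unit sample variance) and recompute.
Y_std = (Y - Y.mean(axis=1, keepdims=True)) / Y.std(axis=1, ddof=1, keepdims=True)
cond_std = np.linalg.cond(np.cov(Y_std))

print(f"condition number, raw:          {cond_raw:.3e}")
print(f"condition number, standardized: {cond_std:.3e}")
```

Because the covariance holds variances, the condition number grows roughly with the square of the ratio between the two scales, so a larger gap than the one assumed here degrades it much faster.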

@Oddvar Lia mentioned large differences in this post: Slack discussion.

Possible Solutions

1. Use Standard Scaling

One suggestion by @Blunde1 is to use a standard scaler from sklearn or similar to do the following:

- X_prior_std, Y_prior_std = Scaler_X(X_prior), Scaler_Y(Y_prior)
- Apply the same Scaler_Y to d, giving d_std
- X_posterior_std = update(X_prior_std, Y_prior_std, d_std)
- X_posterior = Scaler_X.inverse_transform(X_posterior_std)

"I think this should all be okay, as the transformations are affine and the update is for (an approximate) Gaussian. But should probably be double checked. I hope it should be theoretically equivalent, but numerically better conditioned."
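The equivalence claimed in the quote can be checked numerically. The sketch below assumes a plain ensemble smoother update with perturbed observations, written by hand in NumPy; `es_update` is a hypothetical helper, not this project's API, and the scaling is done manually rather than with sklearn. It also rescales the observation error covariance `C_D`, which the steps above do not mention but which the algebra appears to require for the two results to match.

```python
import numpy as np

def es_update(X, Y, D, C_D):
    """Ensemble smoother update with perturbed observations (illustrative helper).

    X: (n_params, n_ens) parameters, Y: (n_resp, n_ens) responses,
    D: (n_resp, n_ens) perturbed observations, C_D: (n_resp, n_resp).
    """
    n_ens = X.shape[1]
    X_c = X - X.mean(axis=1, keepdims=True)
    Y_c = Y - Y.mean(axis=1, keepdims=True)
    C_XY = X_c @ Y_c.T / (n_ens - 1)
    C_YY = Y_c @ Y_c.T / (n_ens - 1)
    return X + C_XY @ np.linalg.solve(C_YY + C_D, D - Y)

rng = np.random.default_rng(1)
n_params, n_resp, n_ens = 3, 2, 50

# Synthetic linear problem; the second response is on a larger scale.
X_prior = rng.normal(size=(n_params, n_ens))
A = rng.normal(size=(n_resp, n_params))
resp_scale = np.array([[1.0], [1e3]])
Y_prior = resp_scale * (A @ X_prior + 0.1 * rng.normal(size=(n_resp, n_ens)))

# Observations, their error standard deviations, and perturbed observations.
sigma = np.array([[0.1], [100.0]])
d = resp_scale * np.array([[0.5], [-0.3]])
D = d + sigma * rng.normal(size=(n_resp, n_ens))
C_D = np.diag(sigma.ravel() ** 2)

# Update on the raw scales.
X_post_raw = es_update(X_prior, Y_prior, D, C_D)

# Standard scaling of X and Y; the Y scaler is also applied to D and C_D.
mu_x, s_x = X_prior.mean(axis=1, keepdims=True), X_prior.std(axis=1, ddof=1, keepdims=True)
mu_y, s_y = Y_prior.mean(axis=1, keepdims=True), Y_prior.std(axis=1, ddof=1, keepdims=True)
X_s = (X_prior - mu_x) / s_x
Y_s = (Y_prior - mu_y) / s_y
D_s = (D - mu_y) / s_y
C_D_s = C_D / (s_y @ s_y.T)

# Update in scaled space, then map the posterior back to the original scale.
X_post_scaled = es_update(X_s, Y_s, D_s, C_D_s) * s_x + mu_x

print("max abs difference:", np.abs(X_post_raw - X_post_scaled).max())
```

With these moderate scales the two posteriors should agree to floating-point accuracy. With much larger scale gaps the raw update itself becomes unreliable, so a mismatch there would not by itself indicate a flaw in the scaled version.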

One concern is whether standard scaling is appropriate for responses such as water-breakthrough, which may be mostly zero but can then become large.

Note from Glison:

"Yes! I think I understood Feda's question and I agree with Berent's suggestion. I think it is similar to what we have used in our implementations during the PhD and in Petrobras.

Feda Curic, if we use the user-defined standard deviation of the observation error to standardize the data (and everything that comes from the data) in the computations, we should not have issues related to sim data that are mostly zero and have a breakthrough. The user should define a minimum tolerance for all observations anyway and zero is not valid.

Defining tolerances for water production is actually a common doubt among engineers when building FMU setup for HM. I usually recommend that they set something like MAX(MINTOL, 0.1*dobs), where MINTOL is a reasonable value for the problem. MINTOL = 1% is a common choice when WCUT is the type of data.

I believe that this approach is very similar to the paper that Vinicius Rios has cited. There, Emerick suggests to use the diagonal of CD to standardize."
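The tolerance rule and the standardization by observation error described in the note can be sketched as follows. Everything here is an assumption made for illustration: the water-cut values, expressing MINTOL = 1% as the fraction 0.01, and the helper name `obs_std_from_rule`. The point is that the observation error standard deviation stays strictly positive even where the simulated data are identically zero, whereas the ensemble standard deviation does not.

```python
import numpy as np

def obs_std_from_rule(d_obs, min_tol, rel_tol=0.1):
    """Element-wise MAX(MINTOL, 0.1 * dobs), following the rule quoted above."""
    return np.maximum(min_tol, rel_tol * d_obs)

# Hypothetical water-cut observations as fractions: zero before breakthrough.
d_obs = np.array([0.0, 0.0, 0.05, 0.30, 0.60])
obs_std = obs_std_from_rule(d_obs, min_tol=0.01)  # MINTOL = 1% water cut (assumed units)

rng = np.random.default_rng(2)
n_ens = 40

# Synthetic simulated responses: exactly zero where d_obs is zero.
Y = d_obs[:, None] * (1.0 + 0.2 * rng.normal(size=(d_obs.size, n_ens)))

# The ensemble standard deviation is zero before breakthrough,
# so it cannot be used as a divisor for those responses.
ens_std = Y.std(axis=1, ddof=1)

# Standardizing by the observation error is always well defined.
Y_scaled = Y / obs_std[:, None]
d_scaled = d_obs / obs_std

# A diagonal C_D built from obs_std becomes the identity after this scaling.
C_D_scaled = np.diag(obs_std**2) / np.outer(obs_std, obs_std)

print("obs_std:", obs_std)
print("ens_std:", ens_std)
```

After this scaling every response is expressed in units of its own observation error, which is one way to read the suggestion of standardizing with the diagonal of CD.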

2. Look into the Implementation Used by Emerick

Vinicius suggested reviewing the following paper for a possible solution:

Emerick, A.A., 2016. Analysis of the performance of ensemble-based assimilation of production and seismic data. Journal of Petroleum Science and Engineering 139, 219–239.
DOI: 10.1016/j.petrol.2016.01.029

Next Steps

  • Identify an approach that performs well in general and also handles the kinds of data FMU produces, such as water-breakthrough.

dafeda added this to SCOUT on Sep 3, 2024
eivindjahren added and removed the christmas-review label (Issues and PRs for Christmas review) on Dec 13, 2024