Skip to content

Conversation

@lechevaa
Copy link
Contributor

I'm a Postdoc at NORCE, this is my first PR to OPM.

Added Hybrid Newton Nonlinear Preconditioning Flags and Features for three phase compositional black oil problem.

Reference to the introduction of the method within OPM:
Lechevallier, Antoine, Sandve, Tor Harald, Landa-Marbán, David, Kane, Birane, and Sarah Eileen Gasda. "Incremental Machine Learning for Near-Well Prediction: A Non-Linear Preconditioning Approach to Faster History Matching." Paper presented at the SPE Reservoir Simulation Conference, Galveston, Texas, USA, March 2025. doi: https://doi.org/10.2118/223843-MS

Short description about HybridNewton method:
The goal is simply to update the initial guess of Newton's method with a new value closer to the solution to speed up the convergence. The update is done using a neural network.

What is new compared to the paper:
I finally managed to fix the Pressure prediction ! This is now fully supported

Some general information about the implementation:

  • The user provides a json file with all required information i.e path to neural network, input features, output features, time to apply the model, cell indexes, feature engineering, scalers.

  • Multiple models can be provided within a single config file.

  • The units are assumed to be the same as what is outputted by opm.

  • For now I only implemented for FlowProblemBlackOil.

Input features supported:

  • Dynamic: PRESSURE, SWAT, SGAS, SOIL, RS, RV (same name as what is outputted by opm)
  • Static: TIMESTEP (scalar feature!), PERMX

Output Features supported:

  • absolute dynamic: PRESSURE, SWAT, SGAS, SOIL, RS, RV
  • relative dynamic: DELTA_PRESSURE, DELTA_SWAT, DELTA_SGAS, DELTA_SOIL, DELTA_RS, DELTA_RV

New Flags:
--use-hy-ne=bool : use or not the HybridNewton method
--hy-ne-config-file=path : Path to the configuration file (json) that allows OPM to interpret correctly how to precondition Newton's method.

New files:

opm/simulators/flow/HybridNewtonConfig.hpp

struct HybridNewtonConfig that parses the main information required to run HybridNewton

opm/simulators/flow/HybridNewton.hpp

Contains the class implementation of BlackOilHybridNewton. The main method is runHybridNewton given a HybridNewtonConfig.

What is missing for now:
I created a python code that tests all supported features in all combinations possible and generate neural network models together with config files.

I have done some tests in python, running Drogon and SPE1 and predicting the exact solution to ensure that Newton's method converges in 0 iteration.

However these tests are not yet included within the opm python testing framework. I need to be careful about dependencies (same as the one in opm-common/python/opm/ml/ml_tools/requirements.txt) and testing.

TODO:
adapt my existing tests to the opm-simulator python testing

@akva2 akva2 added the manual:new-feature This is a new feature and should be described in the manual label Aug 26, 2025
Copy link
Member

@akva2 akva2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to the OPM project. Don't be discouraged by the nitpickery, just want to get it out of the way before somebody does a proper review.

using WellModel = GetPropType<TypeTag, Properties::WellModel>;
using AquiferModel = GetPropType<TypeTag, Properties::AquiferModel>;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no


WellModel wellModel_;
AquiferModel aquiferModel_;
AquiferModel aquiferModel_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no

struct RestartWritingInterval { static constexpr int value = 0xffffff; }; // disable

// Path to the config file containing all Hybrid Newton parameters
struct HyNeConfigFile { static constexpr auto value = "foo.json"; };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please choose a better default

#include <string>
#include <iostream>
#include <fstream>
#include <boost/property_tree/json_parser.hpp>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property tree is a a really heavy library and we try to avoid it in simulator units. Is there a reason why you can't use our wrapper class, PropertyTree.hpp ?

: simulator_(simulator)
, configsLoaded_(false)
{}
void tryApplyHybridNewton()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline

return applicable;
}

bool shouldApplyHybridNewton(double current_time, const HybridNewtonConfig& config)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

}


void validateAllConfigs()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smells like a const

* \return A tensor of shape [n_cells x n_input_features], where rows
* correspond to cells and columns correspond to input features.
*/
Opm::ML::Tensor<Evaluation>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are already in the Opm namespace, avoid Opm:: where possible.

* \throws std::runtime_error if an expected output feature is missing
* or if state update fails for any cell.
*/
void updateInitialGuess(Opm::ML::Tensor<Evaluation>& output, const HybridNewtonConfig& config, const std::vector<int>& cell_indices, size_t n_cells)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

break long line

} // namespace Opm


#endif No newline at end of file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eol at eof

@akva2
Copy link
Member

akva2 commented Aug 26, 2025

jenkins build this please

@lechevaa
Copy link
Contributor Author

Thank you very much, I will address these comments !

@daavid00
Copy link
Member

daavid00 commented Sep 1, 2025

jenkins build this please

@lechevaa lechevaa marked this pull request as ready for review September 2, 2025 12:09
@totto82 totto82 requested a review from jakobtorben September 12, 2025 12:23
@daavid00 daavid00 added this to the Release 2025.10 milestone Sep 15, 2025
Copy link
Member

@jakobtorben jakobtorben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to the OPM community!

This is an interesting method. Kudos for getting this ML method to work within the constraints of OPM and C++ (or not Python). In addition to my comments, here are some general comments to consider for improving the maintainability and performance.

  • HybridNewton is currently quite a large class that has a lot of responsibilities. I think it could benefit from separation of concerns. A lot of the parsing and setup could potentially be moved into HybridNewtonConfig.hpp or elsewhere, such that HybridNewton could focus on the algorithm, integration into the simulator and state updates.
  • There is currently a lot of redundant work that is being done every time we call runHybridNewton, such as reading the indices and model weights from file and allocating the model and tensors. I think the method would benefit from doing more analysis and setup up front to make it faster during application. This has to be balanced with not making it too memory heavy and maintaining its minimal impact when the method is not used.
  • I have not had a look at the Python files as I assume this is for the Python tests that are in progress. Getting these in at some point would be great.
  • Consider also adding tests for the different C++ components. All the dependencies in HybridNewton make this challenging but everything in HybridNewtonConfig should be easily testable.
  • Not something that needs to be addressed in this PR, but since the method introduced here is completely dependent on an externally generated model, there need to be an openly available and easy to follow documentation or library for helping users creating the models.

Comment on lines 130 to 137
// Load cell indices from file at runtime
std::vector<int> cell_indices = loadCellIndicesFromFile(config.cell_indices_file);
const std::size_t n_cells = cell_indices.size();

// Pass cells to each stage
auto input = constructInputTensor(config, cell_indices, n_cells);
auto output = constructOutputTensor(input, config, n_cells);
updateInitialGuess(output, config, cell_indices, n_cells);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading all the indices from file and allocating the input and output tensor every time we apply the model sounds like expensive operations that can limit performance when using this for larger models. Is it instead possible to only do this once in an initial setup and store the values for later use?
I think for the cell indices this at least makes sense since reading input from file can be very slow. For the input and output tensors, it would give better performance at the cost of higher peak memory use, but I think it would still be worth considering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current way I have been using the HybridNewton method, I run the model on small domains (few hundreds cells) and use the model in most of cases a single time at a specific time provided by the user. Moreover, I use very small ML models. So the overall idea I tried to follow is to keep everything minimal and apply ML model only at time and location where it is really worth.

If one can develop some ML models that can be used at a lot of time steps, then the repetitive cell load from file shall be removed as you recommend.

The first version I wrote had cell indices hard encoded in the config file but I decided to switch to save the cell indices in a file and only loading them once we are sure the model shall be used to avoid memory use.

I have done some tests on Drogon, trying to apply the method on the whole field and I ran into memory issue for tensor allocation even with few features. I believe the amount of cells has to be kept quite low for now and some future developments shall improve scalability of the whole machine learning part.

I just wanted to give a bit more context around the method. I trust your decision if you confirm that I shall load it initially.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. It makes sense to keep it simple initially and avoid doing premature optimisation. These dense MLP models gets quite resource heavy for lots of input and output cells, so it makes sense to limit to a small number of cells around the wells.

Even if the number of cell indices is quite small i still think it makes sense to read them up front and reuse, then the extra memory allocated is very limited. Also good to get all the setup, file reading and checks done before starting the simulation, to avoid failing after having spent time simulating.

Comment on lines 489 to 490
bool is_delta = name.compare(0, 6, "DELTA_") == 0;
std::string actual_name = is_delta ? name.substr(6) : name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocating a string in an nested loop over cells could be quite expensive. Consider using string_view or avoid working with comparing strings.

@lechevaa
Copy link
Contributor Author

Thank you very much for your review ! I will try to address the comments as soon as possible.

@daavid00
Copy link
Member

jenkins build this please

@lechevaa
Copy link
Contributor Author

I have update the code trying to follow your reviews.

I just wanted to add some thoughts on the tests and usage:

  • C++ tests: this I have to learn how to do in OPM for HybridNewtonConfig at least, I do not think I will have time before release but I will definitely have to do that.

  • Python tests: work in progress to adapt the test I already have, I am not sure that I can fit them before the release.

  • Overall tutorial on how to generate an external model and use it within OPM. I was thinking that if this PR is included in next release, then I can create a Github repo for demonstrating based on the release version, this would be the most practical. Then the repo could either be in OPM project or CSSR tools (at first then moved at some point in OPM Project).

Regarding Python tests, if you believe it is an absolute necessary for the release, I can overwork on it and have a first version probably next week.

Thanks again for reviewing !

@daavid00
Copy link
Member

It depends on OPM/opm-common#4715, running jenkins there.

Copy link
Member

@jakobtorben jakobtorben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes. Several parts were easier for me to understand now.

If you manage to get some tests in before the release that would be good but it is not absolutely necessary, we certainly don't have a perfect code coverage already.

Good that you are thinking of adding a tutorial. I think it would make sense to create a repo directly into the OPM project then, as this feature would depend on that tutorial.

One general comment to consider is that I see you have quite extensively dropped using brackets. This will make the code more concise but can hurt readability and maintainability, and easily lead to introducing bugs when adding new things later. So I would consider to avoid using this format. If you are not convinced you can get some more opinions here https://stackoverflow.com/questions/12193170/whats-the-purpose-of-using-braces-i-e-for-a-single-line-if-or-loop.

@lechevaa
Copy link
Contributor Author

Thanks again for your reviews !

I tried to refactor the two files as you mentioned. I hope the readability is much better now.

I also added some Python tests.
Regarding them, I am using only the opm.io.ecl module for managing data.

I wanted to also use the from opm.simulators.BlackOilSimulator, however I ran into some issues using it.
The first issue is that it is not possible to instantiate two BlackOilSimulator objects, making comparison between a baseline simulation and a Hybrid Newton simulation not directly possible.
To overcome this, I thought that I could have a single instance of BlackOilSimulator, and modify the fluid state to the previous timestep solution so that I can run it again using HybridNewton. This is also not supported for now, or I could not find it.

If one of this functionality becomes available, I can rewrite the code using more opm python parts and avoid using subprocess to call flow.

Finally, I struggled a bit to set up the imports, so I am not convinced these python tests will go through CI due to import errors.

@lechevaa
Copy link
Contributor Author

Thank you for the changes. Several parts were easier for me to understand now.

If you manage to get some tests in before the release that would be good but it is not absolutely necessary, we certainly don't have a perfect code coverage already.

Good that you are thinking of adding a tutorial. I think it would make sense to create a repo directly into the OPM project then, as this feature would depend on that tutorial.

One general comment to consider is that I see you have quite extensively dropped using brackets. This will make the code more concise but can hurt readability and maintainability, and easily lead to introducing bugs when adding new things later. So I would consider to avoid using this format. If you are not convinced you can get some more opinions here https://stackoverflow.com/questions/12193170/whats-the-purpose-of-using-braces-i-e-for-a-single-line-if-or-loop.

Sorry in the previous commit, I refactored the code and forgot some brackets at the specific place you pointed out, it was not on purpose. I updated it back so there shall be brackets everywhere now. If I am still missing some, I apologize and will update it.

@jakobtorben
Copy link
Member

Thanks for all the changes. I think it is close to being ready now. Go over the docs in HybridNewton.hpp and HybridNewtonConfig.hpp to make sure that they are updated given recent changes, and check the formatting in HybridNewtonConfig.hpp. The test pipeline also needs to be fixed before we can merge it in, either with or without tests.

@lechevaa
Copy link
Contributor Author

lechevaa commented Oct 2, 2025

The test pipeline is fixed and I updated the documentation for all the important functions.
Do not hesitate to tell me if there is anything else to improve before it to be ready, and thanks again for the time you spend on reviewing !

Copy link
Member

@jakobtorben jakobtorben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the changes. Except the tiny detail this looks good to me now. Approving.

* output tensor. Scaling and transformations defined in the configuration
* are applied automatically.
*
* \param input The input tensor of shape [n_cells x n_input_features].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not consistent with the documentation for constructInputTensor

    * Thus, the tensor length equals:
    *
    *     (# of scalar features) + (# of per-cell features × n_cells).

@daavid00
Copy link
Member

daavid00 commented Oct 6, 2025

PR approved and green jenkins in OPM/opm-common#4715, merging this.

@daavid00 daavid00 merged commit 2d327a6 into OPM:master Oct 6, 2025
1 check passed
@daavid00 daavid00 removed this from the Release 2025.10 milestone Oct 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

manual:new-feature This is a new feature and should be described in the manual

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants