-
Notifications
You must be signed in to change notification settings - Fork 130
Hybrid newton flag #6424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hybrid newton flag #6424
Conversation
akva2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Welcome to the OPM project. Don't be discouraged by the nitpickery, just want to get it out of the way before somebody does a proper review.
opm/simulators/flow/FlowProblem.hpp
Outdated
| using WellModel = GetPropType<TypeTag, Properties::WellModel>; | ||
| using AquiferModel = GetPropType<TypeTag, Properties::AquiferModel>; | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
opm/simulators/flow/FlowProblem.hpp
Outdated
|
|
||
| WellModel wellModel_; | ||
| AquiferModel aquiferModel_; | ||
| AquiferModel aquiferModel_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
| struct RestartWritingInterval { static constexpr int value = 0xffffff; }; // disable | ||
|
|
||
| // Path to the config file containing all Hybrid Newton parameters | ||
| struct HyNeConfigFile { static constexpr auto value = "foo.json"; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please choose a better default
opm/simulators/flow/HybridNewton.hpp
Outdated
| #include <string> | ||
| #include <iostream> | ||
| #include <fstream> | ||
| #include <boost/property_tree/json_parser.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
property tree is a a really heavy library and we try to avoid it in simulator units. Is there a reason why you can't use our wrapper class, PropertyTree.hpp ?
| : simulator_(simulator) | ||
| , configsLoaded_(false) | ||
| {} | ||
| void tryApplyHybridNewton() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline
opm/simulators/flow/HybridNewton.hpp
Outdated
| return applicable; | ||
| } | ||
|
|
||
| bool shouldApplyHybridNewton(double current_time, const HybridNewtonConfig& config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const?
opm/simulators/flow/HybridNewton.hpp
Outdated
| } | ||
|
|
||
|
|
||
| void validateAllConfigs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smells like a const
opm/simulators/flow/HybridNewton.hpp
Outdated
| * \return A tensor of shape [n_cells x n_input_features], where rows | ||
| * correspond to cells and columns correspond to input features. | ||
| */ | ||
| Opm::ML::Tensor<Evaluation> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are already in the Opm namespace, avoid Opm:: where possible.
opm/simulators/flow/HybridNewton.hpp
Outdated
| * \throws std::runtime_error if an expected output feature is missing | ||
| * or if state update fails for any cell. | ||
| */ | ||
| void updateInitialGuess(Opm::ML::Tensor<Evaluation>& output, const HybridNewtonConfig& config, const std::vector<int>& cell_indices, size_t n_cells) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break long line
| } // namespace Opm | ||
|
|
||
|
|
||
| #endif No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eol at eof
|
jenkins build this please |
|
Thank you very much, I will address these comments ! |
8400e1b to
857ce7e
Compare
4215aa4 to
b7595cc
Compare
|
jenkins build this please |
jakobtorben
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Welcome to the OPM community!
This is an interesting method. Kudos for getting this ML method to work within the constraints of OPM and C++ (or not Python). In addition to my comments, here are some general comments to consider for improving the maintainability and performance.
HybridNewtonis currently quite a large class that has a lot of responsibilities. I think it could benefit from separation of concerns. A lot of the parsing and setup could potentially be moved intoHybridNewtonConfig.hppor elsewhere, such thatHybridNewtoncould focus on the algorithm, integration into the simulator and state updates.- There is currently a lot of redundant work that is being done every time we call
runHybridNewton, such as reading the indices and model weights from file and allocating the model and tensors. I think the method would benefit from doing more analysis and setup up front to make it faster during application. This has to be balanced with not making it too memory heavy and maintaining its minimal impact when the method is not used. - I have not had a look at the Python files as I assume this is for the Python tests that are in progress. Getting these in at some point would be great.
- Consider also adding tests for the different C++ components. All the dependencies in
HybridNewtonmake this challenging but everything inHybridNewtonConfigshould be easily testable. - Not something that needs to be addressed in this PR, but since the method introduced here is completely dependent on an externally generated model, there need to be an openly available and easy to follow documentation or library for helping users creating the models.
opm/simulators/flow/HybridNewton.hpp
Outdated
| // Load cell indices from file at runtime | ||
| std::vector<int> cell_indices = loadCellIndicesFromFile(config.cell_indices_file); | ||
| const std::size_t n_cells = cell_indices.size(); | ||
|
|
||
| // Pass cells to each stage | ||
| auto input = constructInputTensor(config, cell_indices, n_cells); | ||
| auto output = constructOutputTensor(input, config, n_cells); | ||
| updateInitialGuess(output, config, cell_indices, n_cells); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading all the indices from file and allocating the input and output tensor every time we apply the model sounds like expensive operations that can limit performance when using this for larger models. Is it instead possible to only do this once in an initial setup and store the values for later use?
I think for the cell indices this at least makes sense since reading input from file can be very slow. For the input and output tensors, it would give better performance at the cost of higher peak memory use, but I think it would still be worth considering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current way I have been using the HybridNewton method, I run the model on small domains (few hundreds cells) and use the model in most of cases a single time at a specific time provided by the user. Moreover, I use very small ML models. So the overall idea I tried to follow is to keep everything minimal and apply ML model only at time and location where it is really worth.
If one can develop some ML models that can be used at a lot of time steps, then the repetitive cell load from file shall be removed as you recommend.
The first version I wrote had cell indices hard encoded in the config file but I decided to switch to save the cell indices in a file and only loading them once we are sure the model shall be used to avoid memory use.
I have done some tests on Drogon, trying to apply the method on the whole field and I ran into memory issue for tensor allocation even with few features. I believe the amount of cells has to be kept quite low for now and some future developments shall improve scalability of the whole machine learning part.
I just wanted to give a bit more context around the method. I trust your decision if you confirm that I shall load it initially.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. It makes sense to keep it simple initially and avoid doing premature optimisation. These dense MLP models gets quite resource heavy for lots of input and output cells, so it makes sense to limit to a small number of cells around the wells.
Even if the number of cell indices is quite small i still think it makes sense to read them up front and reuse, then the extra memory allocated is very limited. Also good to get all the setup, file reading and checks done before starting the simulation, to avoid failing after having spent time simulating.
opm/simulators/flow/HybridNewton.hpp
Outdated
| bool is_delta = name.compare(0, 6, "DELTA_") == 0; | ||
| std::string actual_name = is_delta ? name.substr(6) : name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allocating a string in an nested loop over cells could be quite expensive. Consider using string_view or avoid working with comparing strings.
|
Thank you very much for your review ! I will try to address the comments as soon as possible. |
|
jenkins build this please |
|
I have update the code trying to follow your reviews. I just wanted to add some thoughts on the tests and usage:
Regarding Python tests, if you believe it is an absolute necessary for the release, I can overwork on it and have a first version probably next week. Thanks again for reviewing ! |
|
It depends on OPM/opm-common#4715, running jenkins there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the changes. Several parts were easier for me to understand now.
If you manage to get some tests in before the release that would be good but it is not absolutely necessary, we certainly don't have a perfect code coverage already.
Good that you are thinking of adding a tutorial. I think it would make sense to create a repo directly into the OPM project then, as this feature would depend on that tutorial.
One general comment to consider is that I see you have quite extensively dropped using brackets. This will make the code more concise but can hurt readability and maintainability, and easily lead to introducing bugs when adding new things later. So I would consider to avoid using this format. If you are not convinced you can get some more opinions here https://stackoverflow.com/questions/12193170/whats-the-purpose-of-using-braces-i-e-for-a-single-line-if-or-loop.
|
Thanks again for your reviews ! I tried to refactor the two files as you mentioned. I hope the readability is much better now. I also added some Python tests. I wanted to also use the from opm.simulators.BlackOilSimulator, however I ran into some issues using it. If one of this functionality becomes available, I can rewrite the code using more opm python parts and avoid using subprocess to call flow. Finally, I struggled a bit to set up the imports, so I am not convinced these python tests will go through CI due to import errors. |
Sorry in the previous commit, I refactored the code and forgot some brackets at the specific place you pointed out, it was not on purpose. I updated it back so there shall be brackets everywhere now. If I am still missing some, I apologize and will update it. |
|
Thanks for all the changes. I think it is close to being ready now. Go over the docs in |
|
The test pipeline is fixed and I updated the documentation for all the important functions. |
jakobtorben
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the changes. Except the tiny detail this looks good to me now. Approving.
opm/simulators/flow/HybridNewton.hpp
Outdated
| * output tensor. Scaling and transformations defined in the configuration | ||
| * are applied automatically. | ||
| * | ||
| * \param input The input tensor of shape [n_cells x n_input_features]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not consistent with the documentation for constructInputTensor
* Thus, the tensor length equals:
*
* (# of scalar features) + (# of per-cell features × n_cells).|
PR approved and green jenkins in OPM/opm-common#4715, merging this. |
I'm a Postdoc at NORCE, this is my first PR to OPM.
Added Hybrid Newton Nonlinear Preconditioning Flags and Features for three phase compositional black oil problem.
Reference to the introduction of the method within OPM:
Lechevallier, Antoine, Sandve, Tor Harald, Landa-Marbán, David, Kane, Birane, and Sarah Eileen Gasda. "Incremental Machine Learning for Near-Well Prediction: A Non-Linear Preconditioning Approach to Faster History Matching." Paper presented at the SPE Reservoir Simulation Conference, Galveston, Texas, USA, March 2025. doi: https://doi.org/10.2118/223843-MS
Short description about HybridNewton method:
The goal is simply to update the initial guess of Newton's method with a new value closer to the solution to speed up the convergence. The update is done using a neural network.
What is new compared to the paper:
I finally managed to fix the Pressure prediction ! This is now fully supported
Some general information about the implementation:
The user provides a json file with all required information i.e path to neural network, input features, output features, time to apply the model, cell indexes, feature engineering, scalers.
Multiple models can be provided within a single config file.
The units are assumed to be the same as what is outputted by opm.
For now I only implemented for FlowProblemBlackOil.
Input features supported:
Output Features supported:
New Flags:
--use-hy-ne=bool : use or not the HybridNewton method
--hy-ne-config-file=path : Path to the configuration file (json) that allows OPM to interpret correctly how to precondition Newton's method.
New files:
opm/simulators/flow/HybridNewtonConfig.hpp
struct HybridNewtonConfig that parses the main information required to run HybridNewton
opm/simulators/flow/HybridNewton.hpp
Contains the class implementation of BlackOilHybridNewton. The main method is runHybridNewton given a HybridNewtonConfig.
What is missing for now:
I created a python code that tests all supported features in all combinations possible and generate neural network models together with config files.
I have done some tests in python, running Drogon and SPE1 and predicting the exact solution to ensure that Newton's method converges in 0 iteration.
However these tests are not yet included within the opm python testing framework. I need to be careful about dependencies (same as the one in opm-common/python/opm/ml/ml_tools/requirements.txt) and testing.
TODO:
adapt my existing tests to the opm-simulator python testing