Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid non-unique saturation #457

Closed
wants to merge 1 commit into from

Conversation

alifbe
Copy link
Collaborator

@alifbe alifbe commented Sep 18, 2024

Avoid non-unique saturation by increasing the s[w/g/o]int resolution by factor 10.

Close #444

@alifbe alifbe marked this pull request as ready for review September 18, 2024 08:54
@alifbe alifbe requested a review from berland September 18, 2024 08:54
@berland
Copy link
Collaborator

berland commented Sep 18, 2024

A little bit skeptical to just reduce the accuracy attainable by pyscal. I think there is a different solution to the problem that can be isolated to the sgof-code. Since the problem is flaky and occuring very seldom, there is even no guarantee that this patch will solve it.

@alifbe
Copy link
Collaborator Author

alifbe commented Sep 18, 2024

@berland The issue came because sgint has less accuracy compared to when the dataframe dumped into string (default 7 digit precision) hence the duplicates in saturation column. It worked when tested using reproduce_failure from failed test last night.

pyscal/src/pyscal/gasoil.py

Lines 156 to 157 in 4162b12

self.table["sgint"] = list(map(round, self.table["SG"] * SWINTEGERS))
self.table = self.table.drop_duplicates("sgint")

@berland
Copy link
Collaborator

berland commented Sep 18, 2024

Since it is a (barely) flaky problem, I do believe it passed the test when running reproduce_failure, since you have perturbed the whole setup by changing SWINTEGERS, but that gives me no confidence that the bug has really gone.

@eivindjahren
Copy link

Could be that you make the test fail more reliably by setting a target

@alifbe
Copy link
Collaborator Author

alifbe commented Sep 21, 2024

Since it is a (barely) flaky problem, I do believe it passed the test when running reproduce_failure, since you have perturbed the whole setup by changing SWINTEGERS, but that gives me no confidence that the bug has really gone.

I thought reproduce_failure only changed with different version of python/pytest.

Also, I see that the parameters that are passed to test test_slgof_hypo are identical after my PR.

    @reproduce_failure('6.112.1', b'AXicY2BABgcUwBQTiKkblxd0rHMLiHkBIgsAY7kGgQ==')
>   @given(
        st.floats(min_value=0.0, max_value=0.3),
        st.floats(min_value=0.0, max_value=0.3),
        st.floats(min_value=0.0, max_value=0.3),
        st.floats(min_value=1.0 / float(1000 * SWINTEGERS), max_value=0.5),
    )
E   hypothesis.errors.DidNotReproduce: Expected the test to raise an error, but it completed successfully.

tests/test_slgof.py:125: DidNotReproduce
----------------------------------------------------------------------------------- Hypothesis -----------------------------------------------------------------------------------
Falsifying example: test_slgof_hypo(
    swl=0.0,
    sorg=0.15625,
    sgcr=0.1472499615763899,
    h=6.103515625e-05,
)
============================================================================ short test summary info =============================================================================
FAILED tests/test_slgof.py::test_slgof_hypo - hypothesis.errors.DidNotReproduce: Expected the test to raise an error, but it completed successfully.
=============================================================================== 1 failed in 0.95s ================================================================================

compare to the error reported here: https://github.com/equinor/pyscal/actions/runs/10913039495/job/30288816903#step:9:28884

@berland
Copy link
Collaborator

berland commented Sep 23, 2024

I have changed my mind, I think you are on the right track with this PR.

Some questions:

  • Is it SWINTEGERS itself that should be multiplied by 10?
  • It is still problematic that we are not able to reproduce the failure without @reproduce_failure, especially when we perturb SWINTEGERS, to rule out that we don't introduce other kinds of similar problems. I have made some attempts with hypothesis.target() but not successful yet.

@eivindjahren
Copy link

I have changed my mind, I think you are on the right track with this PR.

Some questions:

  • Is it SWINTEGERS itself that should be multiplied by 10?
  • It is still problematic that we are not able to reproduce the failure without @reproduce_failure, especially when we perturb SWINTEGERS, to rule out that we don't introduce other kinds of similar problems. I have made some attempts with hypothesis.target() but not successful yet.

Let me have a look at it. It could be that failure is due to non-determinism in the algorithm or differences in the environment (which packages are installed).

@eivindjahren
Copy link

In trying to reproduce you get this warning

WARNING:pyscal.gasoil:Requested saturation step length (6.10352e-05) too small, reset to 0.0001
WARNING  pyscal.gasoil:gasoil.py:94 Requested saturation step length (6.10352e-05) too small, reset to 0.0001

@eivindjahren
Copy link

eivindjahren commented Sep 23, 2024

I just ran the code in the PR and got this reproduce string:

@reproduce_failure('6.112.1', b'AXicY2BABgdu+yj3lpQdZUAVhdIAd+UFBw==')

The actual failure seems simple enough:

>       assert np.isclose(slgof["SL"].to_numpy()[0], gasoil.swl + gasoil.sorg)
E       assert False
E        +  where False = <function isclose at 0x7cad3ef0d4b0>(0.0, (0.0 + 0.0001))
E        +    where <function isclose at 0x7cad3ef0d4b0> = np.isclose
E        +    and   0.0 = <pyscal.gasoil.GasOil object at 0x7cacf1d9f410>.swl
E        +    and   0.0001 = <pyscal.gasoil.GasOil object at 0x7cacf1d9f410>.sorg
E       Falsifying example: test_slgof_hypo(
E           # The test always failed when commented parts were varied together.
E           swl=0.0,  # or any other generated value
E           sorg=0.0001,
E           sgcr=0.0,  # or any other generated value
E           h=0.5,  # or any other generated value
E       )

probably due to the following warning:

WARNING:pyscal.gasoil:Requested saturation step length (1e-07) too small, reset to 0.0001

Note that I needed to use either pytest --hypothesis-profile=ci to avoid slow failures or

from hypothesis import given, settings, HealthCheck

@settings(
    max_examples=10000,
    suppress_health_check=[HealthCheck.too_slow],
    deadline=None,
    print_blob=True,
)

We should probably change to the setup used in ert:

from hypothesis import HealthCheck, settings

# Timeout settings are unreliable both on CI and
# when running pytest with xdist so we disable it
settings.register_profile(
    "no_timeouts",
    deadline=None,
    suppress_health_check=[HealthCheck.too_slow],
    print_blob=True,
)
settings.load_profile("no_timeouts")

@alifbe
Copy link
Collaborator Author

alifbe commented Sep 23, 2024

I have changed my mind, I think you are on the right track with this PR.

Some questions:

  • Is it SWINTEGERS itself that should be multiplied by 10?

I thought about it, but doing that will trigger changes in several other places. Some of the test assume a certain SWINTEGERS e.g.

low_sgcrvalue = 0.00001 # 1/10 of 1/SWINTEGERS

  • It is still problematic that we are not able to reproduce the failure without @reproduce_failure, especially when we perturb SWINTEGERS, to rule out that we don't introduce other kinds of similar problems. I have made some attempts with hypothesis.target() but not successful yet.

That's true. I don't have experience with hypothesis.target() but will also have a look.

@alifbe
Copy link
Collaborator Author

alifbe commented Sep 23, 2024

In trying to reproduce you get this warning

WARNING:pyscal.gasoil:Requested saturation step length (6.10352e-05) too small, reset to 0.0001
WARNING  pyscal.gasoil:gasoil.py:94 Requested saturation step length (6.10352e-05) too small, reset to 0.0001

Yes, that warning can be ignored. I think typically end-user will not specify such a low saturation step length.

I think the bug will occur for low h and due to rounding issue as pointed out by @berland

@berland
Copy link
Collaborator

berland commented Oct 2, 2024

@alifbe , it seems to work in #458

@alifbe alifbe closed this Oct 2, 2024
@alifbe alifbe deleted the fix-non-unique-saturation branch October 2, 2024 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hypothesis test failure - test_slgof.py::test_slgof_hypo
3 participants