Add tests to document current behavior of post-instr in branches #435
Conversation
> Where there's a raise or a failwith guarded by a conditional, the branch itself is instrumented, but the raising expression is not post-instrumented...
What is the incremental value of having this test over the tests in `apply/special.t`, including

> `let _ = failwith (print_endline "foo"; "bar")`

(`bisect_ppx/test/instrument/apply/special.t`, line 293 in e302656), in `if`-expressions?
I don't believe there's a need for this test from the standpoint of the current implementation. I do see its value in terms of future-proofing and documenting. The `if cond then raise` pattern is quite common. As it stands, one must be cautious about how the raising expression is written. Notably, even seemingly innocuous refactoring that shouldn't affect coverage can introduce non-visitable coverage points. This can be quite surprising.
Consider the following example:

```diff
-| if cond then raise Not_found
+| if cond then raise_s [%sexp "Key not found", { key : Key.t}]
```

This results in the insertion of non-visitable point 1:

```diff
-| if cond then raise Not_found
+| if cond then (__bisect_post__visit 1 (raise_s [%sexp "Key not found", { key : Key.t}]))
```
> For example, if you use `invalid_arg`, or a raising helper, its application will be post-instrumented (unless all of its arguments are labeled !?), resulting in a code location that can never be visited, by design. These are currently acknowledged limitations.
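To make the quoted limitation concrete, here is a hypothetical sketch; the helper, the point numbers, and the instrumented form shown in the comment are illustrative, not Bisect's exact output:

```ocaml
(* A raising helper: applying it never evaluates to a value. *)
let fail_invalid_key key = invalid_arg ("invalid key: " ^ key)

let check key =
  if key = "" then fail_invalid_key key
  (* Instrumented roughly as (names and numbering approximate):
       if key = "" then begin
         ___bisect_visit___ 0;                             (* branch point *)
         ___bisect_post_visit___ 1 (fail_invalid_key key)  (* out-edge point *)
       end
     Point 1 can never be visited, because [fail_invalid_key] always raises. *)
```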
This is an acknowledged and inevitable limitation, because there is no way for Bisect to know that a specific identifier refers to a function that always or never raises. In fact, it doesn't know that for the identifiers `failwith` or `invalid_arg` either, as those can be redefined and shadowed by the user, including by another module or a third-party library that is not even processed by Bisect during compilation of the user's particular project. I don't see that we need a test for this -- the actual behavior is covered by the normal `apply` tests.
(Raises or diverges)
> there is no way for Bisect to know that a specific identifier refers to a function that always or never raises

I agree. It's fair to say that "a strategy to limit non-visitable points, which relies partly on identifiers, will inevitably encounter these limitations." However, if we were to discover an alternative approach (assuming one exists), we might still see improvements in this test. In other words, the insertion of post-instrumentation might not be inevitable when considering the range of possible instrumentation strategies.
> (Raises or diverges)

Btw, I didn't understand the part where you said "or diverges".
> > (Raises or diverges)
>
> Btw, I didn't understand the part where you said "or diverges".

Post-instrumentation checks that a function application evaluates to a value. The two ways that it doesn't in OCaml are when it raises an exception or when it diverges.
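A rough, self-contained illustration of that mechanic (a toy stand-in; the real generated code and runtime helpers differ):

```ocaml
(* Toy stand-in for the hook that marks an out-edge point: the point is
   marked only if the wrapped application actually returns a value. *)
let visited = Array.make 2 false

let post_visit point value =
  visited.(point) <- true;
  value

let f x = if x < 0 then failwith "negative" else x + 1

let () =
  let _ = post_visit 0 (f 1) in
  (try ignore (post_visit 1 (f (-1))) with Failure _ -> ());
  (* Point 0 was visited; point 1 was not, because [f (-1)] raised before
     [post_visit 1] could record it. A diverging call would behave the same. *)
  assert (visited.(0) && not visited.(1))
```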
Those are the two ways that the out-edge of a `Pexp_apply` may not be followed.
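For readers less familiar with the AST-level view, here is a deliberately naive sketch (assuming ppxlib with its metaquot extension; this is not Bisect's implementation and it ignores all of Bisect's heuristics, and `___hypothetical_post_visit___` is a made-up runtime hook) of wrapping the out-edge of every `Pexp_apply`:

```ocaml
open Ppxlib

(* Naive mapper: wrap every application so that a hypothetical runtime hook
   runs only if the application returns, i.e. it neither raises nor diverges. *)
let wrap_out_edges =
  object
    inherit Ast_traverse.map as super

    method! expression e =
      let e = super#expression e in
      match e.pexp_desc with
      | Pexp_apply (_, _) ->
          let loc = e.pexp_loc in
          [%expr ___hypothetical_post_visit___ [%e e]]
      | _ -> e
  end
```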
Does "diverging" encompass non-termination and potentially the effects in OCaml 5? Something else?
Regarding the effects, I'm optimistic that their influence on the strategy will
be similar to that of exceptions. I hope this won't necessitate too big of an
exploration of the OCaml 5 runtime 🤯. Thank you again for your explanation!
Yes, "diverging" is non-termination. OCaml 5 effects are supposed to always be returned from exactly once, so they should not affect control flow as visible to Bisect, although this is not enforced, and for effects that correspond to async computation the "exactly once" might be later than program termination, so in that sense they behave like exceptions.
> If you try and disable post-instrumentation with a `coverage off` directive, the branch pre-instrumentation is removed as well, resulting in losing the ability to check that the branch itself is visited.
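To make the scenario concrete, a hypothetical sketch of the kind of source under discussion (the attribute placement here is illustrative; the test in this PR records what bisect_ppx actually does with it):

```ocaml
(* Intent: suppress only the never-visitable out-edge point on [raise],
   while keeping the visit point that shows the branch itself was taken. *)
let get tbl key =
  match Hashtbl.find_opt tbl key with
  | Some v -> v
  | None -> (raise Not_found) [@coverage off]
(* Observation recorded by the test: the branch pre-instrumentation is
   removed along with the post-instrumentation, so we can no longer check
   that the [None] branch runs. *)
```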
Could you move this to `attribute.t`? That is the place for testing `[@coverage off]`. It already includes cases for the interaction of `[@coverage off]` with multiple language constructs in case.
I appreciate your suggestion to add tests to `attribute.t`, which I had overlooked. I will proceed to incorporate new tests there to demonstrate how to disable coverage in branches (same PR OK?).

Despite the presence of branch coverage testing in `attribute.t`, I advocate for retaining this specific part of the control test here. It serves to illustrate that this method does not provide a viable workaround for the issue at hand.
This was in response to your comment in #433:

> I believe in your case you should suppress instrumentation of the out-edge with `[@coverage off]`, though I don't immediately recall whether the way that is implemented will also cause the branch instrumentation to be suppressed.
This might come across more as literate testing (akin to a tutorial, perhaps?) than as a pure behavior test. I've attempted to construct a small narrative around the added tests. However, if you have strong reservations, I'm open to removing this section entirely.
```
@@ -65,3 +65,136 @@ cases are in tail position iff the match expression is in tail position.
  | () ->
    ___bisect_visit___ 2;
    print_endline "bar"
```
Analogous questions for the `match.t` changes as for the `if.t` changes.
Thank you for taking the time to review the PR and for your comments! I really

Overall, my intuition centers around the hypothesis that the

In other words, the post-instrumentation would never be produced for each case

I have the hunch that doing something along these lines would avoid, in more

In this context, yes I indeed believe in the need for dedicated tests to ensure

In #434, you said:
I agree. I think this is part of what I would like to challenge, for the

I acknowledge that I haven't fully realized the details of the concept yet, and

As a compelling argument in favor of these specific tests, examining the

Perhaps you find the cost of adding such tests outweighs the potential benefits,
I would suggest keeping the tests that are redundant right now locally, and including them as part of the PR if you do suggest an alternative instrumentation strategy for
It may be easier to address this by extending the
Understood. I am more accustomed to first publishing tests and then reviewing
Another way to do this is to open the PR with the tests in the first commit and the other changes afterwards, with the changes in test output included in the later commits. I suppose you are potentially turning this PR into that right now :) Thank you.
Hi @aantron. I hope all is well with you! I was doing a little pass over my opened PRs and decided it was time to break radio silence on this one 😃 sorry for not being more active on it.

By the way, I wanted to correct something in my previous comments: a friend of mine pointed out that my use of the term

I have been using it for a while now, and I am still confused about the intention behind the current instrumentation strategy in most cases of out-edge instrumentation. In my experience, the instrumentation frequently either adds unvisitable-by-design coverage points, or forces you to write the code in such a way that the raising calls are in tail position, which oftentimes feels arbitrary to a reviewer who is not aware that this is motivated by really non-trivial characteristics of the coverage tool. Do you know if

At any rate, I don't feel I understand the current behavior well enough to be proposing actual changes. I first need some way to monitor the current behavior. That being said, I understand from our exchange that this may not be something that has its place directly within this repository. So I thought I would try and develop a little collection of characterization tests, in my own time, in an external repo. Maybe then I can revisit with a better understanding, or simply adjust my uses by following how my characterization tests are affected by upgrades to new versions of bisect_ppx.

The other thing that doesn't make me very comfortable is that I don't understand why no other user is reporting these limitations. I am guessing that the tool isn't really used a lot, or perhaps is used in such a way that people don't pay close attention to these unvisitable points. I'm happy to wait and see which direction the community is going to go and how the coverage part of the roadmap https://ocaml.org/tools/platform-roadmap#w21-run-tests will actually be implemented. I can revisit when more people are on board. Thanks a lot!
This PR adds new test cases to monitor the current behavior of branch
instrumentation in Bisect_ppx. The test cases focus on complex scenarios
involving if-then-else and match branches discussed in #434.
The aim is not to modify the existing behavior directly, but to establish a
baseline for future changes.
By adding these tests, we can monitor how future changes to Bisect_ppx may alter
these behaviors (TDD).