Should decoy raise exceptions? #204

andreaevooq · 2023-06-15T12:32:48Z

Hello again!

While using the library, we noticed that Decoy doesn't raise an exception when:

We try calling a method that is not part of the mocked spec
We try calling a method that is part of the mocked spec which is not mocked

What is the design decision behind it? It is true that the errors should be catched by the linter and are highlighted while coding in an IDE, but it would be great to have explicit exceptions instead of None returned.

For reference, create_autospec raises an AttributeError.

The text was updated successfully, but these errors were encountered:

mcous · 2023-06-15T16:01:34Z

Hello! That's a really great question, and it is an intentional design decision to, in general, minimize exceptions thrown by Decoy. The reasoning behind that decision is to minimize unnecessary coupling between the tests and the implementation of the code under test. Such coupling is unavoidable when mocking, but an assertion creates stronger coupling forces than a no-op. The testdouble wiki's article on "mocks" gives some interesting history and details on how over-asserting messes with the design feedback from your tests. I highly recommend that entire wiki, in fact.

We try calling a method that is not part of the mocked spec

I think there's a convincing argument that an assertion is appropriate in this case. You've given the mock guardrails in the form of a spec and then violated them. I'll see about adding an assertion and if it messes with anything. I think I'd have to consider it a breaking change, but version numbers are cheap. For the v2 line, I suppose a Warning would work without introducing a breaking change (see below).

We try calling a method that is part of the mocked spec which is not mocked

I feel strongly that this should not be an assertion. If it's important that the test subject not call a certain method, then you can add a verify call with times=0 to enforce that behavior. Otherwise, I don't think the test subject should be "punished" if it's still able to produce the output expected by the test.

For the sake of developer convenience, it might be appropriate to trigger a Warning in this case, similar to MiscalledStubWarning.

There is definitely a reason this is not already a warning, though I don't remember if it was technical or philosophical. I'll have to check my Git history, because I'm fairly sure there are no (longer) technical blockers to implementing such a warning.

Very much appreciate the feedback and questions!

In the next major version of Decoy, this warning will become an error. Closes #204

SyntaxColoring · 2024-10-07T22:41:31Z

@mcous What about this third case:

We try calling a method that is part of the mocked spec, and is mocked, but not in a way that matches the actual call

For example, suppose the test does this:

decoy.when(foo.bar(123)).then_return(True)

But the subject does this:

x = foo.bar(456)

So there's a straightforward bug in the subject, and the test is trying to do its job to catch it, but the None return obscures what's going on.

I think the current rehearsal syntax makes it difficult to change this. In decoy.when(foo.bar(123)), foo.bar(123) needs to not raise anything. I can imagine a few breaking-change ways to solve that, but before getting into those, how do you feel about it abstractly?

mcous · 2024-10-07T23:05:07Z

Hey @SyntaxColoring! Is that case covered by MiscalledStubWarning?

(As for the rehearsal syntax, if I was to issue a new major version, I would ditch it now that ParamSpec exists)

SyntaxColoring · 2024-10-08T01:45:20Z

👋

Yeah, in the sense that MiscalledStubWarning functions as it was intended to cover that case. I guess I'm suggesting, or resuggesting, that MiscalledStubWarning should be upgraded to an exception.

My reasoning is basically: the current return of None causes us to only ever see downstream effects of our mistakes, instead of the mistakes themselves. This is frequently frustrating to me, and I think it's surprising to Decoy newcomers.

Here are some simplified examples of how I've seen MiscalledStubWarning play out:

Chained mocks:

result_1 = dependency_1(incorrect_args)
result_2 = dependency_2(result_1)
result_3 = dependency_3(result_2)
return result_3

This prints 3 MiscalledStubWarnings that you need to work through to discern the root problem.

Logic based on the mock's return:
```
some_bool = dep_a(incorrect_args)
if some_bool:
    return dep_b(dep_c())
else:
    return dep_d(dep_e())
```
This makes the subject execute a completely different codepath (None is falsey), causing more downstream mismatches to sort through.

But even in simpler cases, like the one above, MiscalledStubWarning doesn't quite work for me. If I have a dozen tests failing across the suite, and Pytest is intermingling all of their warnings together, and those are getting intermingled with unrelated warnings elsewhere in the suite....

I see your rationale above:

I don't think the test subject should be "punished" if it's still able to produce the output expected by the test.

And a similar line of thinking in #219 (which is about case 2, and is about going nothing->warning instead of warning->exception):

Adding a warning in this case forces every test using given subject to decoy.when or decoy.verify every single dependency the subject uses, even if a particular test isn't concerned with that behavior. In Decoy's own test suite this would trigger 8 warnings, and "fixing" those warnings would bloat the tests in question without providing meaningful design feedback

In other words, I think you're saying if a subject calls dep_a() and dep_b(), but a test can get away with omitting dep_a() and having it return None implicitly, that's good, because it can make the test shorter and more focused.

I see where you're coming from there, and I agree with the goal, but I think I come down on the other side of the tradeoff? I'd rather have the bloat and punishment that you're describing than the MiscalledStubWarning stuff. It would still be a chore, but a chore that seems more mindless and more self-explanatory.

How do you feel about that tradeoff these days?

mcous · 2024-10-08T15:15:49Z

Before getting into the philosophy: Pytest should allow one to set the warning to an error, since we're going through Python's warning system. It was my intention that this could be configured in cases where a team's philosophies disagreed with my own:

pytestmark = pytest.mark.filterwarnings("error::decoy.warnings.MiscalledStubWarning")

(Read "not recommended" in the docs there as "not recommended philosophically" rather than "not recommended because this API is not stable")

Philosophy time: I haven't shifted on the core belief that a stub should not throw unless explicitly configured for a given interaction (or the Python runtime itself would throw). I think outside of Decoy's primary use case - driving the design of APIs using TDD - pain that one encounters is more a signal that the code's implementation should change or be rewritten fresh rather than a signal that the stubbing library isn't being strict enough.

This is especially true of using Decoy for "Logic based on the mock's return". I don't think this holds up as well with the pain you mentioned with "chaining stub calls", but I suspect this is more pain when refactoring code and tests rather than the initial implementation. So, based on recent experience with vitest-when, I think it is Decoy's job to:

Make it easy to debug the state of spies and stubs when (re)writing tests
Raise warnings and errors at the time they actually happen
Make it possible to layer on behaviors to Decoy that may disagree with my philosophies in order to serve one's own preferences
- e.g. write your own Decoy wrapper that layers a fallback raise Miscalled case to every stub (not currently possible)

I think that Decoy's current architecture, with a lot of error/warning checking happening in decoy.reset and with Decoy's own inability to distinguish between an actual call and a rehearsal, get in the way of these goals in a way that I'm interested in solving. Started #256 if you'd like to continue this discussion over there!

This was referenced Aug 13, 2023

feat(spy): warn if mock used with a missing attribute #218

Merged

refactor(warnings): rework MiscalledStubWarning checker for clarity #219

Merged

mcous closed this as completed in #218 Aug 13, 2023

mcous added a commit that referenced this issue Aug 13, 2023

feat(spy): warn if mock used with a missing attribute (#218)

72cefe4

In the next major version of Decoy, this warning will become an error. Closes #204

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should decoy raise exceptions? #204

Should decoy raise exceptions? #204

andreaevooq commented Jun 15, 2023 •

edited

Loading

mcous commented Jun 15, 2023 •

edited

Loading

SyntaxColoring commented Oct 7, 2024

mcous commented Oct 7, 2024

SyntaxColoring commented Oct 8, 2024 •

edited

Loading

mcous commented Oct 8, 2024

Should decoy raise exceptions? #204

Should decoy raise exceptions? #204

Comments

andreaevooq commented Jun 15, 2023 • edited Loading

mcous commented Jun 15, 2023 • edited Loading

SyntaxColoring commented Oct 7, 2024

mcous commented Oct 7, 2024

SyntaxColoring commented Oct 8, 2024 • edited Loading

mcous commented Oct 8, 2024

andreaevooq commented Jun 15, 2023 •

edited

Loading

mcous commented Jun 15, 2023 •

edited

Loading

SyntaxColoring commented Oct 8, 2024 •

edited

Loading