Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should decoy raise exceptions? #204

Closed
andreaevooq opened this issue Jun 15, 2023 · 5 comments · Fixed by #218
Closed

Should decoy raise exceptions? #204

andreaevooq opened this issue Jun 15, 2023 · 5 comments · Fixed by #218

Comments

@andreaevooq
Copy link

andreaevooq commented Jun 15, 2023

Hello again!

While using the library, we noticed that Decoy doesn't raise an exception when:

  • We try calling a method that is not part of the mocked spec
  • We try calling a method that is part of the mocked spec which is not mocked

What is the design decision behind it? It is true that the errors should be catched by the linter and are highlighted while coding in an IDE, but it would be great to have explicit exceptions instead of None returned.

For reference, create_autospec raises an AttributeError.

@mcous
Copy link
Owner

mcous commented Jun 15, 2023

Hello! That's a really great question, and it is an intentional design decision to, in general, minimize exceptions thrown by Decoy. The reasoning behind that decision is to minimize unnecessary coupling between the tests and the implementation of the code under test. Such coupling is unavoidable when mocking, but an assertion creates stronger coupling forces than a no-op. The testdouble wiki's article on "mocks" gives some interesting history and details on how over-asserting messes with the design feedback from your tests. I highly recommend that entire wiki, in fact.

We try calling a method that is not part of the mocked spec

I think there's a convincing argument that an assertion is appropriate in this case. You've given the mock guardrails in the form of a spec and then violated them. I'll see about adding an assertion and if it messes with anything. I think I'd have to consider it a breaking change, but version numbers are cheap. For the v2 line, I suppose a Warning would work without introducing a breaking change (see below).

We try calling a method that is part of the mocked spec which is not mocked

I feel strongly that this should not be an assertion. If it's important that the test subject not call a certain method, then you can add a verify call with times=0 to enforce that behavior. Otherwise, I don't think the test subject should be "punished" if it's still able to produce the output expected by the test.

For the sake of developer convenience, it might be appropriate to trigger a Warning in this case, similar to MiscalledStubWarning.

There is definitely a reason this is not already a warning, though I don't remember if it was technical or philosophical. I'll have to check my Git history, because I'm fairly sure there are no (longer) technical blockers to implementing such a warning.

Very much appreciate the feedback and questions!

mcous added a commit that referenced this issue Aug 13, 2023
In the next major version of Decoy, this warning will become an error.

Closes #204
@SyntaxColoring
Copy link
Contributor

@mcous What about this third case:

  • We try calling a method that is part of the mocked spec, and is mocked, but not in a way that matches the actual call

For example, suppose the test does this:

decoy.when(foo.bar(123)).then_return(True)

But the subject does this:

x = foo.bar(456)

So there's a straightforward bug in the subject, and the test is trying to do its job to catch it, but the None return obscures what's going on.

I think the current rehearsal syntax makes it difficult to change this. In decoy.when(foo.bar(123)), foo.bar(123) needs to not raise anything. I can imagine a few breaking-change ways to solve that, but before getting into those, how do you feel about it abstractly?

@mcous
Copy link
Owner

mcous commented Oct 7, 2024

Hey @SyntaxColoring! Is that case covered by MiscalledStubWarning?

(As for the rehearsal syntax, if I was to issue a new major version, I would ditch it now that ParamSpec exists)

@SyntaxColoring
Copy link
Contributor

SyntaxColoring commented Oct 8, 2024

👋

Yeah, in the sense that MiscalledStubWarning functions as it was intended to cover that case. I guess I'm suggesting, or resuggesting, that MiscalledStubWarning should be upgraded to an exception.

My reasoning is basically: the current return of None causes us to only ever see downstream effects of our mistakes, instead of the mistakes themselves. This is frequently frustrating to me, and I think it's surprising to Decoy newcomers.

Here are some simplified examples of how I've seen MiscalledStubWarning play out:

  • Chained mocks:

    result_1 = dependency_1(incorrect_args)
    result_2 = dependency_2(result_1)
    result_3 = dependency_3(result_2)
    return result_3

    This prints 3 MiscalledStubWarnings that you need to work through to discern the root problem.

  • Logic based on the mock's return:

    some_bool = dep_a(incorrect_args)
    if some_bool:
        return dep_b(dep_c())
    else:
        return dep_d(dep_e())

    This makes the subject execute a completely different codepath (None is falsey), causing more downstream mismatches to sort through.

But even in simpler cases, like the one above, MiscalledStubWarning doesn't quite work for me. If I have a dozen tests failing across the suite, and Pytest is intermingling all of their warnings together, and those are getting intermingled with unrelated warnings elsewhere in the suite....


I see your rationale above:

I don't think the test subject should be "punished" if it's still able to produce the output expected by the test.

And a similar line of thinking in #219 (which is about case 2, and is about going nothing->warning instead of warning->exception):

Adding a warning in this case forces every test using given subject to decoy.when or decoy.verify every single dependency the subject uses, even if a particular test isn't concerned with that behavior. In Decoy's own test suite this would trigger 8 warnings, and "fixing" those warnings would bloat the tests in question without providing meaningful design feedback

In other words, I think you're saying if a subject calls dep_a() and dep_b(), but a test can get away with omitting dep_a() and having it return None implicitly, that's good, because it can make the test shorter and more focused.

I see where you're coming from there, and I agree with the goal, but I think I come down on the other side of the tradeoff? I'd rather have the bloat and punishment that you're describing than the MiscalledStubWarning stuff. It would still be a chore, but a chore that seems more mindless and more self-explanatory.

How do you feel about that tradeoff these days?

@mcous
Copy link
Owner

mcous commented Oct 8, 2024

Before getting into the philosophy: Pytest should allow one to set the warning to an error, since we're going through Python's warning system. It was my intention that this could be configured in cases where a team's philosophies disagreed with my own:

pytestmark = pytest.mark.filterwarnings("error::decoy.warnings.MiscalledStubWarning")

(Read "not recommended" in the docs there as "not recommended philosophically" rather than "not recommended because this API is not stable")

Philosophy time: I haven't shifted on the core belief that a stub should not throw unless explicitly configured for a given interaction (or the Python runtime itself would throw). I think outside of Decoy's primary use case - driving the design of APIs using TDD - pain that one encounters is more a signal that the code's implementation should change or be rewritten fresh rather than a signal that the stubbing library isn't being strict enough.

This is especially true of using Decoy for "Logic based on the mock's return". I don't think this holds up as well with the pain you mentioned with "chaining stub calls", but I suspect this is more pain when refactoring code and tests rather than the initial implementation. So, based on recent experience with vitest-when, I think it is Decoy's job to:

  • Make it easy to debug the state of spies and stubs when (re)writing tests
  • Raise warnings and errors at the time they actually happen
  • Make it possible to layer on behaviors to Decoy that may disagree with my philosophies in order to serve one's own preferences
    • e.g. write your own Decoy wrapper that layers a fallback raise Miscalled case to every stub (not currently possible)

I think that Decoy's current architecture, with a lot of error/warning checking happening in decoy.reset and with Decoy's own inability to distinguish between an actual call and a rehearsal, get in the way of these goals in a way that I'm interested in solving. Started #256 if you'd like to continue this discussion over there!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants