Proposal: introducing a nerdctl error package #3393

apostasie · 2024-09-01T23:39:50Z

IMHO, we do have a problem with errors management in the codebase.

Quite often, they are confusing, hard to decipher.
See for example #3302 or #3197.

This is a hindrance for users, as some of these messages are frankly misleading, or generally not informative enough on what is the actual problem by flashing a symptom instead (example: "tcp connection failure" is a symptom, the actual problem may be that "your DNS server is unreachable").

This is also a PITA for us, as we are quite often left reading the tea leaves in bug reports, squinting at some cryptic error which origin is hard to pinpoint.

And finally, this prevents us from (integration) testing error reporting to the user, as the only thing we could do is duplicating the string in tests if we want to validate the output.

This is not just circumstantial - it is a systemic issue:

most of the time we fmt.Errorf, which makes it impossible for consuming code to test the actual condition with errors.Is.
we do not wrap errors in a very mindful way - "invalid argument" is fine - but where is it coming from? a filestore? cobra argument validation?

Ability to test errors (errors.Is(X)) would allow consuming code:

to specialize behavior as certain errors (or errors categories) may be recoverable, or retry-able
decide what to tell the user - which is not necessarily the same thing as what the error text says
provide localization

And wrapping errors into meaningful categories would still keep things simple enough for consuming code.

While some of my recent PRs have introduced better erroring in places, I think it is time to propose the introduction of a nerdctl "errors (categories)" package, and encourage the use of var errors and errors joining.

This is this PR, which defines a handful of high-level errors that code can leverage to wrap errors.
Maybe there is more of these categories, or maybe we want to massage the ones proposed here. Feedback welcome as usual, and this may of course evolve as we strengthen the codebase.

Note that I have been wondering if we should just leverage containerd/errdefs instead of creating a new subpackage.
I am now leaning towards "no", for a few reasons:

it is confusing to return containerd errors, when what we want to express is a nerdctl error condition - we should handle containerd errors
containerd has its own set of concepts and expectations, which likely does not match what we need
containerd/errdefs is likely missing some of the error categories we need, so, we might end-up with two packages, leading to more confusion
we will not have flexibility to change containerd/errdefs as we see fit

This new nerdctl subpackage is NOT meant to collect and hold every and all errors we are going to define. Each package should still define their own errors. These here are meant to provide categories that will wrap the errors introduced by independent packages.

Sorry for the long explanation here, but I thought this required some extended explanation.

Let me know your thoughts.

Cheers.

pkg/errs/errors.go

AkihiroSuda

We have to keep errors.Is(err, containerderrdefs.ErrNotFound) functional (including cases where a function in nerdctl/pkg is called by a thirdparty project)

Signed-off-by: apostasie <[email protected]>

apostasie · 2024-09-02T17:29:53Z

We have to keep errors.Is(err, containerderrdefs.ErrNotFound) functional (including cases where a function in nerdctl/pkg is called by a thirdparty project)

Done.

djdongjin · 2024-09-02T18:22:32Z

pkg/errs/errors.go

+	"github.com/containerd/errdefs"
+)
+
+var (


Many of these are already defined in https://github.com/containerd/errdefs/blob/02b65bcc1b50358f7c1cca419c76f538349af784/errors.go#L37

ErrInvalidArgument, ErrFailedPrecondition.

And ErrSystemIsBroken, ErrNetworkCondition, ErrServerIsMisbehaving are similar to ErrInternal and ErrUnavailable.

Can we reuse the definition in containerd/errdefs?

I can alias them if you think it is helpful, but I think it is a bad idea to just use them directly as outlined above.

If you think it's a bad idea, maybe just wrap it with %w so it works for both nerdctl's and containerd's definition.

If you think it's a bad idea, maybe just wrap it with %w so it works for both nerdctl's and containerd's definition.

Just to clarify: containerd returning errors from errdefs are fine - and may of course be handled and wrapped, and should NOT be removed, or ignored - and none of this proposal will "break" existing code testing for errdefs.ErrNotFound.

example:

if errors.Is(containerdErr, errdefs.ErrNotFound) { return errors.Join(ErrVolumeRemovalFail, containerdErr) }

Same applies to any other error - from any third-party lib, or from golang.

apostasie · 2024-09-02T19:13:11Z

Clarifying intent with a clear example (untested, pseudo-code):

Current situation - imagine some nerdctl code package doing:

if err := os.Stat("somefile"); err != nil {
	return fmt.Errorf("something not right with filesystem: %w", err)
}

Returned error can only be tested by consuming code by testing against low-level golang errors (eg: os.ErrNotExists) and said consuming code cannot assess whether it is fatal (broken system) or something survivable without actual knowledge of what is going on inside the module.

Suggestion:

var ErrNotFound = errors.New("resource cannot be found")
var ErrSystemBroken = errors.New("something is wrong with your system")

if err := os.Stat("somefile"); err != nil {
	if errors.Is(err, os.ErrNotExists){
		return errors.Join(errs.ErrNotFound, err)
	}
	return errors.Join(errs.ErrSystemBroken, err)
}

Now, consuming code can test against a small set of meaningful nerdctl errors that do explain what is the issue instead of flashing a low-level symptom.

As to why we should not use the existing errdefs IMHO - see reasons above, but chiefly, errdefs does not have all the errors we need (as obviously some of them do not make sense), so, consuming code would have to be instructed to tests against errdefs in some conditions, and ALSO to tests against our own errs package in other conditions.
This is confusing and not a really good API...

As I said, aliasing would be fine of course (I do not quite care, as long as consuming code can just test against nerdclt/errs/ErrXXX) consistently.

dmcgowan · 2024-09-02T23:07:46Z

containerd/errdefs is likely missing some of the error categories we need, so, we might end-up with two packages, leading to more confusion

The categories defined by errdefs are intentionally constrained and based on grpc error handling, best summarized in https://cloud.google.com/apis/design/errors. We choose this for consistency across packages which may have implementations on either side of grpc. While the error handling does not need to exactly map to containerd or be what gets presented to end users, there is value in having a consistent way to represent errors in the code, whether those errors come from internal packages or from containerd. I would say what you are describing is not missing categories, but the ability to attach error useful details around those categories. In errdefs we are also trying to figure out ways to make sure we can attach more useful details to errors and make sure those get transferred to the client, such as ensuring that parse errors and network errors are more informative and actionable to end users. This will likely be done more by defining rich error types or wrapping existing types rather than creating new base types with errors.New.

apostasie · 2024-09-02T23:57:06Z

Thanks @dmcgowan .

So, coming back to the original problem then:

What we have right now is code that just rewrap system level or third party errors withfmt.Errorf, making it impossible for embedders to actually "test" error conditions, or provide meaningful feedback to users.

What is the suggested, concrete solution for that problem, here in nerdctl, that we could implement now?

(I appreciate that containerd itself may come up with something generic in the future, but it is not there right now I guess?)

apostasie · 2024-10-05T21:53:49Z

Bump

apostasie force-pushed the dev-error-package branch from 134a576 to 71ab88f Compare September 1, 2024 23:41

AkihiroSuda reviewed Sep 2, 2024

View reviewed changes

pkg/errs/errors.go Outdated Show resolved Hide resolved

AkihiroSuda requested changes Sep 2, 2024

View reviewed changes

apostasie force-pushed the dev-error-package branch from 71ab88f to e78e92d Compare September 2, 2024 15:26

apostasie requested a review from AkihiroSuda September 2, 2024 15:27

Introducing error package

c0514ee

Signed-off-by: apostasie <[email protected]>

apostasie force-pushed the dev-error-package branch from e78e92d to c0514ee Compare September 2, 2024 15:29

AkihiroSuda requested review from dmcgowan and a team September 2, 2024 16:02

djdongjin reviewed Sep 2, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: introducing a nerdctl error package #3393

Proposal: introducing a nerdctl error package #3393

apostasie commented Sep 1, 2024

AkihiroSuda left a comment

apostasie commented Sep 2, 2024

djdongjin Sep 2, 2024

apostasie Sep 2, 2024

djdongjin Sep 2, 2024

apostasie Sep 2, 2024

apostasie commented Sep 2, 2024

dmcgowan commented Sep 2, 2024

apostasie commented Sep 2, 2024 •

edited

Loading

apostasie commented Oct 5, 2024

Proposal: introducing a nerdctl error package #3393

Are you sure you want to change the base?

Proposal: introducing a nerdctl error package #3393

Conversation

apostasie commented Sep 1, 2024

AkihiroSuda left a comment

Choose a reason for hiding this comment

apostasie commented Sep 2, 2024

djdongjin Sep 2, 2024

Choose a reason for hiding this comment

apostasie Sep 2, 2024

Choose a reason for hiding this comment

djdongjin Sep 2, 2024

Choose a reason for hiding this comment

apostasie Sep 2, 2024

Choose a reason for hiding this comment

apostasie commented Sep 2, 2024

dmcgowan commented Sep 2, 2024

apostasie commented Sep 2, 2024 • edited Loading

apostasie commented Oct 5, 2024

apostasie commented Sep 2, 2024 •

edited

Loading