-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Cleaning up orphaned directories and files when containers creat… #3457
fix: Cleaning up orphaned directories and files when containers creat… #3457
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
Is this testable? E.g., can we add a test in https://github.com/containerd/nerdctl/blob/772feb40ee9d0d51db6da8ac3a8f737bbb48475f/cmd/nerdctl/container/container_create_linux_test.go?
} | ||
} | ||
|
||
func generateRemoveOrphanedDirsFunc(ctx context.Context, id, dataStore string, internalLabels internalLabels) func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: are we deleting multiple dirs, or only 1. If only 1, we should name it generateRemoveOrphanedDirFunc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it deletes one or more of the following directories, the function name is Dirs.
/var/lib/nerdctl/1935db59/containers/<namespace>/<container id>
/var/lib/nerdctl/1935db59/etchosts/<namespace>/<container id>
Hi, @djdongjin Thanks for comments !!!
I think it is difficult to implement tests based on the number of files in the directory under Is there any E2E test that can perform tests based on the number of local files ... ???? |
Here is an e2e test for you that does it (the tricky part is to access the right location): func TestIssue2993(t *testing.T) {
nerdtest.Setup()
testCase := &test.Case{
Description: "Issue #2993 - nerdctl is no longer leaking etchosts dirs & files on failure",
// Make sure we are in a private env so that other tests do not pollute this
Require: nerdtest.Private,
// Run a first container successfully
Setup: func(data test.Data, helpers test.Helpers) {
helpers.Ensure("run", "--name", data.Identifier(), testutil.CommonImage)
},
Cleanup: func(data test.Data, helpers test.Helpers) {
helpers.Anyhow("rm", "-f", data.Identifier())
},
// Try to run a second container with the same name
Command: func(data test.Data, helpers test.Helpers) test.Command {
return helpers.Command("run", "--name", data.Identifier(), testutil.CommonImage)
},
Expected: func(data test.Data, helpers test.Helpers) *test.Expected {
return &test.Expected{
// We expect the command to fail
ExitCode: 1,
// As the name is already used
Errors: []error{errors.New("is already used by")},
Output: func(stdout string, info string, t *testing.T) {
// Get the test DataRoot, the containerd address, and get the dataStore location
base := testutil.NewBase(t)
dataStore, err := clientutil.DataStore(string(data.ReadConfig(nerdtest.DataRoot)), base.ContainerdAddress())
assert.NilError(t, err, "data store failed")
// Now, we are interested in "etchosts", and in the Namespace (data.Identifier() with `nerdtest.Private`)
etchostsdir := filepath.Join(dataStore, "etchosts", data.Identifier())
// Read the dir
files, err := os.ReadDir(etchostsdir)
assert.NilError(t, err, fmt.Sprintf("failed reading %s", etchostsdir))
// We are expecting only one directory there, as the other container should have deleted its directory on failure
assert.Equal(t, len(files), 1, "there should be only 1 subdirectory in the etchosts store")
},
}
},
}
testCase.Run(t)
} Obviously, the important part is the |
30416e3
to
9baf881
Compare
Thanks !!!!!!! |
Pending addition of a test validating the fix, LGTM. CI failures are unrelated (windows regression #3437). |
Added tests work well in my local env, but they fail in GitHub Actions. Details
There seems to be a problem with the creation of the If there are any changes that need to be made about the test, I would like your advise Updated I forgot to run When running the command, I can check the failure of tests occured in GitHub Actions. Details
/Users/haytok/workspace/nerdctl/cmd/nerdctl/container [fix-to-clean-up-orphaned-files]
> go test -p 1 ../../../cmd/nerdctl/... -run "TestIssue2993" -args -test.target=docker
ok github.com/containerd/nerdctl/v2/cmd/nerdctl 0.005s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/apparmor 0.004s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/builder 0.004s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/completion 0.004s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/compose 0.005s [no tests to run]
test target: "docker"
[rm --data-root /tmp -f nerdctl-testissue2993]
[run --name nerdctl-testissue2993 --data-root /tmp -d ghcr.io/stargz-containers/alpine:3.13-org sleep infinity]
--- FAIL: TestIssue2993 (0.04s)
--- FAIL: TestIssue2993/Issue_#2993_-_nerdctl_no_longer_leaks_containers_and_etchosts_directories_and_files_when_container_creation_fails_or_a_container_is_removed. (0.02s)
container_create_linux_test.go:216: assertion failed: error is not nil: open /tmp/1935db59/containers/nerdctl-test: no such file or directory
--- FAIL: TestIssue2993/Issue_#2993_-_nerdctl_no_longer_leaks_containers_and_etchosts_directories_and_files_when_container_creation_fails_or_a_container_is_removed.#01 (0.00s)
container_create_linux_test.go:235: assertion failed: error is not nil: open /tmp/1935db59/containers/nerdctl-test: no such file or directory
FAIL
FAIL github.com/containerd/nerdctl/v2/cmd/nerdctl/container 0.042s
? github.com/containerd/nerdctl/v2/cmd/nerdctl/helpers [no test files]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/image 0.005s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/inspect 0.004s [no tests to run]
? github.com/containerd/nerdctl/v2/cmd/nerdctl/internal [no test files]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/ipfs 0.004s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/login 0.004s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/namespace 0.004s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/network 0.005s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/system 0.007s [no tests to run]
ok github.com/containerd/nerdctl/v2/cmd/nerdctl/volume 0.005s [no tests to run]
FAIL Therefore, I'll investigate how to retrieve dataStore in test code. |
@@ -174,3 +180,72 @@ func TestCreateWithTty(t *testing.T) { | |||
withTtyContainer := base.InspectContainer(withTtyContainerName) | |||
assert.Equal(base.T, 0, withTtyContainer.State.ExitCode) | |||
} | |||
|
|||
// TestIssue2993 tests https://github.com/containerd/nerdctl/issues/2993 | |||
func TestIssue2993(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to mark this test as DockerIncompatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !
Added testutil.DockerIncompatible(t)
to tests.
|
||
// TestIssue2993 tests https://github.com/containerd/nerdctl/issues/2993 | ||
func TestIssue2993(t *testing.T) { | ||
base := testutil.NewBase(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggesting you use a dedicated namespace for that test, so that we can parallelize the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed to specify independent namespaces for each test case.
Hi, @apostasie Thanks for the review. I have added tests for this issue, so could you re-review when you have time ? CI failure is related to this comment (#3457 (comment)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @haytok !
Left a few comments on tests.
If you want, there is more info about tests principles here: https://github.com/containerd/nerdctl/blob/main/docs/testing/README.md#principles
And details about a new tooling framework here:
https://github.com/containerd/nerdctl/blob/main/docs/testing/tools.md
containerName: "failure-of-container-creation", | ||
namespace: "failure-of-container-creation", | ||
command: func(t *testing.T, containerName, namespace, containersPath, etchostsPath string) { | ||
base := testutil.NewBaseWithNamespace(t, namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use testutil.Identifier(t)
for the namespace name.
Same thing for container name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !!! Fixed.
assert.NilError(t, err) | ||
etchostsDirCounts := len(etchostsDirs) | ||
|
||
base.Cmd("rm", "-f", containerName).Run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also want to make sure you run this ^ before anything else, so that leftovers from previous tests do not make you fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed based on the testing principles.
command: func(t *testing.T, containerName, namespace, containersPath, etchostsPath string) { | ||
base := testutil.NewBaseWithNamespace(t, namespace) | ||
|
||
defer base.Cmd("rm", "-f", containerName).Run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use t.Cleanup instead of defer.
Also make sure to call your cleanup routine ALSO before you start doing anything to take care of leftovers from previous tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !!! fixed based on the testing principles.
The local branch was out of date and I was unaware that there was documentation for the test. |
5e6bfc3
to
833cb72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
Reviewers, about the CI failures:
|
…ion fails When trying to create new containers, the following directories are created based on the new container ID: - `/var/lib/nerdctl/1935db59/containers/<namespace>/<container id>` - `/var/lib/nerdctl/1935db59/etchosts/<namespace>/<container id>` When containers with existing names are attempted to be created, the processes fail. However, in the events of failures, these directories are not cleaned up. This issue is reported in the following: - containerd#2993 This commit resolves the issue by cleaning up the mentioned directories when containers creation fails. Also, It has also been modified so that the following directory is also cleaned up when the container is removed. - `/var/lib/nerdctl/1935db59/etchosts/<namespace>/<container id>` Note that tests of logic to fix bug in Issue containerd#2993 are added based on the following testing principles. - https://github.com/containerd/nerdctl/blob/main/docs/testing/tools.md Signed-off-by: Hayato Kiwata <[email protected]>
4cec857
to
2c2745e
Compare
Hi, @apostasie I have squashed all commits. |
@haytok appreciate all your efforts here! |
…ion fails
When trying to create new containers, the following directories are created based on the new container ID:
/var/lib/nerdctl/1935db59/containers/<namespace>/<container id>
/var/lib/nerdctl/1935db59/etchosts/<namespace>/<container id>
If containers with existing names are attempted to be created, the processes fail.
However, in the events of failures, these directories are not cleaned up.
This issue is reported in the following:
This commit resolves the issue by cleaning up the mentioned directories when containers creation fails.