-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When checking IsDangling make sure image is not in manifest list #2360
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@nalind @Luap99 @mheon @mtrmac PTAL @ericcurtin This is needed for OCI images in RamaLama. Currently if you convert a model to an OCI Image it creates an manifestlist pointing to the untagged built image, but if the user does a |
libimage/runtime.go
Outdated
return false, err | ||
} | ||
|
||
for _, image := range allImages { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little concerned by how expensive this is going to make the dangling check. It feels like there should be a better way of checking manifest list membership, but maybe that's a c/storage RFE for when we refactor there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the libimage/manifests
code has, sadly?, been implemented here in c/common, in a way that makes it ~invisible to c/storage.
But, yes, this is not good at all. isDangling
is called from ListImages
. So:
- With
SetListData
, this is an infinite recursion. - and, anyway, this is O(images^2) at least.
This needs to be implemented in a way where “all images” are only processed once (well, ideally it should be an O(1) query, but I don’t think we have that data), to build a data structure which allows isDangling
to be executed in, at worst O(1) I/O if not O(1) CPU.
And then I suspect that removeRecursive
might want to call this internal version, with the data structure pre-built, instead of the public IsDangling
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree this is O(n^n) which should not be done, basically the code does a list images then filters (which calls isDangling() on each image which then should not in turn call list again all images for each image)
It should properly build a map of all images used in manifests and then pass that map around so the dangling filter can do a quick check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… Oops, not an infinite recursion, because the recursive calls don’t have SetListData
set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right n^2 not n^n, my bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to comment the same above:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably faster to first compute a (reverse) mapping from manifest lists to images and just do look ups as Paul suggested.
libimage/runtime.go
Outdated
return false, err | ||
} | ||
|
||
for _, image := range allImages { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the libimage/manifests
code has, sadly?, been implemented here in c/common, in a way that makes it ~invisible to c/storage.
But, yes, this is not good at all. isDangling
is called from ListImages
. So:
- With
SetListData
, this is an infinite recursion. - and, anyway, this is O(images^2) at least.
This needs to be implemented in a way where “all images” are only processed once (well, ideally it should be an O(1) query, but I don’t think we have that data), to build a data structure which allows isDangling
to be executed in, at worst O(1) I/O if not O(1) CPU.
And then I suspect that removeRecursive
might want to call this internal version, with the data structure pre-built, instead of the public IsDangling
.
BTW I don’t understand the motivating situation (is there, you know, a reproducer?), I don’t much understand the libimage/manifests code (something to fix on my side, sure) — so the above is only pointing out specific issues, it’s not an endorsement of the approach in general. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see a more verbose commit message explaining the reasoning. I think the current behavior follows the man pages and docs. I'd love to see tests here and for Podman as well (along with man page changes) if we go forward.
I planned on adding the tests to podman. The easiest test is to do pseudo code.
Last line should fail. |
It might well be the case that we want the “dangling” filter (exposed in the UI) unchanged but |
189032d
to
00b7490
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick skim.
libimage/layer_tree.go
Outdated
@@ -22,6 +23,8 @@ type layerTree struct { | |||
// emptyImages do not have any top-layer so we cannot create a | |||
// *layerNode for them. | |||
emptyImages []*Image | |||
|
|||
manifestDigests map[digest.Digest]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document keys/values. And if it is effectively a set, make the values struct{}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you are after here. This is just a mapping of digests that are used in the manifest lists in containers/storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mapping of what, to what?
Yes, you know that. I think I know that, now. Please write that down, to save future readers time so that they can decide just reading the data structure whether this data is relevant to them.
E.g. the keys are not the digests of all manifests of all images in the layer tree. Neither are the keys the digests of all manifest lists that exist in the storage (they are not digests of manifest lists at all ; also, for an ordinary podman pull
, we typically store one per-platform manifest, and the parent manifest list, in c/storage, along with the pulled image, and those usually (?) have a top layer).
(And it turns out “to what” has the answer “always exactly true
”, so it doesn’t need to be a boolean; a struct{}
works just as well, when we are only after the presence of the key in the map.)
libimage/layer_tree.go
Outdated
@@ -124,6 +128,15 @@ func (r *Runtime) newLayerTreeFromData(images []*Image, layers []storage.Layer) | |||
topLayer := img.TopLayer() | |||
if topLayer == "" { | |||
tree.emptyImages = append(tree.emptyImages, img) | |||
if manifestList, _ := img.IsManifestList(context.TODO()); manifestList { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to do this for every layer tree? Some (e.g. for Image.Parent
) are one-time and one-purpose only.
(I don’t know whether this would be an option to layerTree
, or a separate data structure parallel to layerTree
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want @vrothberg to comment on this. We can develop a different structure and process it separately or rename this structure to a different name. I read the tree as a place to cache processed data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment below:
Let's make sure to benchmark on a system with 100,200,400,800 images and if there is a noticeable difference.
I think it makes sense to only compute this data when needed and cache it. Using the layerTree for that purpose looks attractive as it's plumbed to cache/reset data. But the manifest digests could be queried via a function that either computes it fresh or returns the cached map?
d534dbd
to
4cdd485
Compare
Vendor in latest containers/common Currently if you create an un tagged image and add it to a manifest list, podman image prune will remove the image, and leave you with a broken manifest list. This PR removes the image from dangling in this situation and prevents the pruning. We have seen this trigger issues with RamaLama which is creating manifest lists in this manner, and then users pruning the images. Verifing: containers/common#2360 Signed-off-by: Daniel J Walsh <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit worried about the performance impacts. The tango for computing and operating on the layer tree addressed some serious bottlenecks for image operations.
The manifest-list checks may have a negative impact but I am unable to guess how much.
Let's make sure to benchmark on a system with 100,200,400,800 images and if there is a noticeable difference.
@@ -22,6 +23,8 @@ type layerTree struct { | |||
// emptyImages do not have any top-layer so we cannot create a | |||
// *layerNode for them. | |||
emptyImages []*Image | |||
|
|||
manifestListDigests map[digest.Digest]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rhatdan, can you add a comment here describing map and how it should be used?
@@ -677,3 +678,12 @@ func (l *list) Instances() []digest.Digest { | |||
} | |||
return instances | |||
} | |||
|
|||
func (l *list) Digests() []digest.Digest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment here as well to make sure callers know which digests are returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok Adding comments, but I notice:
// Instances returns the list of image instances mentioned in this list.
func (l *list) Instances() []digest.Digest {
instances := make([]digest.Digest, 0, len(l.oci.Manifests))
for _, instance := range l.oci.Manifests {
instances = append(instances, instance.Digest)
}
return instances
}// Digests returns list of all image digests referenced in the manifestList
func (l *list) Digests() []digest.Digest {
digests := make([]digest.Digest, 0) for i := range l.docker.Manifests {
digests = append(digests, l.docker.Manifests[i].Digest)
}
return digests
}
I added the second function. Is there a difference between these two? IE a l.oci.Manifests and a l.docker.Manifests. Should my Digests function return both? Or should I just remove my DIgests and use Instances?
@@ -124,6 +128,15 @@ func (r *Runtime) newLayerTreeFromData(images []*Image, layers []storage.Layer) | |||
topLayer := img.TopLayer() | |||
if topLayer == "" { | |||
tree.emptyImages = append(tree.emptyImages, img) | |||
if manifestList, _ := img.IsManifestList(ctx); manifestList { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment why we're doing this here.
Currently when we run podman image prune or podman images --filter dangling It is pruning images that are in a local manifest. These images are not dangling because they are currently in use by a named manifest list. You can create this situation simply by doing echo "from scratch" > /tmp/Containerfile id=$(podman build /tmp) podman manifest create test $id podman image prune --force podman image exists $id Will return an error since the image was pruned. Now the local manifest test is broken. Signed-off-by: Daniel J Walsh <[email protected]>
@vrothberg added comments |
Currently when we run podman image prune or podman images --filter
dangling
It is pruning images that are in a local manifest. These images are
not dangling because they are currently in use by a named manifest list.
You can create this situation simply by doing
echo "from scratch" > /tmp/Containerfile
id=$(podman build /tmp)
podman manifest create test $id
podman image prune --force
podman image exists $id
Will return an error since the image was pruned. Now the local manifest
test is broken.
Signed-off-by: Daniel J Walsh [email protected]<!--- Please read the contributing guidelines before proceeding --->