Layer digests are not preserved after foreign layer removal #519

Open
mthalman opened this issue Jul 28, 2024 · 12 comments

@mthalman
Member

Ever since the removal of foreign layers from Windows images, there's unexpected behavior with respect to the digest of the base Windows layer. When an image that is based on a Windows image is pushed to a registry, the digest of its base layer (which corresponds to the base Windows image) does not match the digest of that layer in the original image. This is unexpected (for example, Linux images do not behave this way).

Repro Steps

docker pull mcr.microsoft.com/windows/nanoserver:ltsc2022

docker tag mcr.microsoft.com/windows/nanoserver:ltsc2022 myacr.azurecr.io/nanoserver:latest

docker push myacr.azurecr.io/nanoserver:latest

docker manifest inspect mcr.microsoft.com/windows/nanoserver:ltsc2022-amd64
{
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
                "mediaType": "application/vnd.docker.container.image.v1+json",
                "size": 638,
                "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"
        },
        "layers": [
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 116769712,
                        "digest": "sha256:4f08abc68a60a91547c026d50bc32384338152a95155ab082f6ea8aaf874d94b"
                }
        ]
}

docker manifest inspect myacr.azurecr.io/nanoserver:latest
{
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
                "mediaType": "application/vnd.docker.container.image.v1+json",
                "size": 638,
                "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"
        },
        "layers": [
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 120490378,
                        "digest": "sha256:652774a5d82a114642848f8b0b8d486ec1b4995f9dda56e36fe4ac7563429990"
                }
        ]
}

Notice that the layer digests, 4f08abc68a60a91547c026d50bc32384338152a95155ab082f6ea8aaf874d94b vs 652774a5d82a114642848f8b0b8d486ec1b4995f9dda56e36fe4ac7563429990, are different. They even have different sizes.
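
The comparison above can also be done mechanically. The following is a minimal sketch, assuming the two manifests have already been saved as JSON text (here trimmed copies of the output above); the function name `compare_manifests` and the result keys are hypothetical, not part of any Docker tooling.

```python
import json

# Trimmed copies of the two manifests shown above.
ORIGINAL_MANIFEST = """
{
  "config": {"size": 638,
             "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"},
  "layers": [{"size": 116769712,
              "digest": "sha256:4f08abc68a60a91547c026d50bc32384338152a95155ab082f6ea8aaf874d94b"}]
}
"""

PUSHED_MANIFEST = """
{
  "config": {"size": 638,
             "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"},
  "layers": [{"size": 120490378,
              "digest": "sha256:652774a5d82a114642848f8b0b8d486ec1b4995f9dda56e36fe4ac7563429990"}]
}
"""


def compare_manifests(original_json, pushed_json):
    """Report whether the config digests match and which layers differ.

    Layers are compared position by position; this sketch assumes both
    manifests list the same number of layers.
    """
    original = json.loads(original_json)
    pushed = json.loads(pushed_json)
    mismatched = [
        (index, a["digest"], b["digest"])
        for index, (a, b) in enumerate(zip(original["layers"], pushed["layers"]))
        if a["digest"] != b["digest"]
    ]
    return {
        "config_matches": original["config"]["digest"] == pushed["config"]["digest"],
        "mismatched_layers": mismatched,
    }


result = compare_manifests(ORIGINAL_MANIFEST, PUSHED_MANIFEST)
print("config digest matches:", result["config_matches"])
for index, before, after in result["mismatched_layers"]:
    print(f"layer {index}: {before} -> {after}")
```

For the two manifests above, this reports a matching config digest and one mismatched layer (index 0).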

mthalman added the bug and triage labels on Jul 28, 2024
ntrappe-msft self-assigned this on Jul 31, 2024
ntrappe-msft added the 🔖 ADO label and removed the triage label on Jul 31, 2024

@ntrappe-msft
Contributor

ntrappe-msft commented Aug 6, 2024

Hi, I tried to repro your scenario with the following steps on a Windows 11 host.

  1. Pull most recent Nano Server image: docker pull mcr.microsoft.com/windows/nanoserver:ltsc2022.
  2. Get information on that image: docker inspect mcr.microsoft.com/windows/nanoserver:ltsc2022.
  3. Tag that image: docker tag mcr.microsoft.com/windows/nanoserver:ltsc2022 <ACR>.azurecr.io/nanoserver:latest
  4. Get information on tagged image: docker inspect <ACR>.azurecr.io/nanoserver:latest
  5. Push tagged image to ACR: docker push <ACR>.azurecr.io/nanoserver:latest
  6. Pull tagged image from ACR: docker pull <ACR>.azurecr.io/nanoserver:latest
  7. Get information on pulled image: docker inspect <ACR>.azurecr.io/nanoserver:latest

I saw the same information (including sizes) across all the images in steps 2, 4, and 7. Are you doing anything else? We don't expect to see the image changing at all when pushing/pulling to the Azure Container Registry (ACR).

@mthalman
Member Author

mthalman commented Aug 7, 2024

Please follow the steps as I've defined them. Your steps are different and don't demonstrate the issue. First, I'm using docker manifest inspect, not docker inspect. Using docker inspect doesn't show the layer digests. Second, your step 6 isn't really doing anything because the image already exists locally. Pulling the pushed image isn't necessary anyway as you can query the pushed image directly with docker manifest inspect.

@ntrappe-msft
Contributor

@mthalman Thank you for your patience as I've been diving more into this issue.

I tested docker manifest inspect across three images, all derived from the same nanoserver:ltsc2022 image:

  1. Local: docker pull mcr.microsoft.com/windows/nanoserver:ltsc2022
  2. Tagged for Docker Hub: docker tag <above image ID> <dockerhub>/nanoserver:latest
  3. Tagged for Azure Container Registry: docker tag <above image ID> <acr>/nanoserver:latest

Here are the sizes for the layer of each:

  • Local: 116,816,937 bytes
  • Docker Hub: 120,554,921 bytes
  • Azure Container Registry: 120,490,378 bytes

We can see slight variations in size, with the Docker Hub and ACR layers being about 3% larger than the local one. This is normal and can be attributed to the following factors:

  • Compression: When Docker images are pushed to a registry (like Docker Hub or ACR), they are compressed, which can vary the sizes. Different compression algorithms result in different compression effects.
  • Metadata: Additional metadata may be added by the registry, which can alter the size.
  • Layer Repackaging: The image may be repackaged when pushing to the registry, causing changes in size.
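
The "about 3%" figure follows directly from the three sizes listed above. A minimal sketch of the arithmetic, using only those numbers (the helper name `percent_larger` is hypothetical):

```python
def percent_larger(baseline, other):
    """Return how much larger `other` is than `baseline`, as a percentage."""
    return (other - baseline) / baseline * 100


# Layer sizes in bytes, as reported in this comment.
LOCAL = 116_816_937
DOCKER_HUB = 120_554_921
ACR = 120_490_378

hub_overhead = percent_larger(LOCAL, DOCKER_HUB)
acr_overhead = percent_larger(LOCAL, ACR)

print(f"Docker Hub: {hub_overhead:.2f}% larger than local")
print(f"ACR:        {acr_overhead:.2f}% larger than local")
```

Both registries come out a little over 3% larger than the local layer, which matches the estimate above.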

@ntrappe-msft
Contributor

ntrappe-msft commented Aug 28, 2024

I'm still trying to investigate why we're seeing differences in behavior between Windows and Linux. For Windows, the image digests change when tagged, but not for Linux. This may be due to how the reparse points are handled when we're getting ready to export the image to a repository.

Example

  1. Local (Untagged) Nano Server
"config": {
  "size": 638,
  "digest": "sha256:e82f76d7080851ee8fead794e4eb957de1150f1a84270fa5d660e14582782ce2"
},
  2. Tagged Nano Server for ACR
"config": {
  "size": 638,
  "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"
},
  3. Local (Untagged) Alpine
"config": {
  "size": 1486,
  "digest": "sha256:0b4426ad4bf25e13fb09112b9dcb5d5b09b3c5684599654583913b2714a705a2"
},
  4. Tagged Alpine for Docker Hub
"config": {
  "size": 1486,
  "digest": "sha256:0b4426ad4bf25e13fb09112b9dcb5d5b09b3c5684599654583913b2714a705a2"
},

@doctorpangloss

Is this related to goharbor/harbor#20143, goharbor/harbor#20133?

@ntrappe-msft
Contributor

@mthalman Also taking a look at your original output, do you consistently see that the config digest field stays the same? Or does it change between the originally pulled image and tagged version?

Original Nano Server

docker manifest inspect mcr.microsoft.com/windows/nanoserver:ltsc2022-amd64
{
  "config": {
    "size": 638,
    "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"
  },
  "layers": [
    {
      "size": 116769712,
      "digest": "sha256:4f08abc68a60a91547c026d50bc32384338152a95155ab082f6ea8aaf874d94b"
    }
  ]
}

Config Digest: f0ca2964...
Base Layer Digest: 4f08abc...

Tagged Nano Server

docker manifest inspect myacr.azurecr.io/nanoserver:latest
{
  "config": {
    "size": 638,
    "digest": "sha256:f0ca296450062003226b7b71be17afe4ccb62ff1b50995b07c20772c29359551"
  },
  "layers": [
    {
      "size": 120490378,
      "digest": "sha256:652774a5d82a114642848f8b0b8d486ec1b4995f9dda56e36fe4ac7563429990"
    }
  ]
}

Config Digest: f0ca2964... 🔔[SAME]🔔
Base Layer Digest: 652774a... 🔔[DIFF]🔔
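
One way an unchanged config digest and a changed layer digest can coexist: in the OCI/Docker image format, the config records each layer's diff ID (the SHA-256 of the uncompressed tar), while the manifest records the digest of the compressed blob as stored in the registry. The sketch below illustrates that distinction only; it is an assumption that recompression is what happens to the Windows base layer on push, and nothing in this thread confirms it. The tar contents and helper names are made up for the illustration.

```python
import gzip
import hashlib
import io
import tarfile


def sha256_digest(data):
    """Format a SHA-256 digest the way manifests do."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_layer_tar():
    """Build a small, deterministic tar archive standing in for a layer."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for i in range(200):
            payload = f"file {i} contents\n".encode() * 50
            info = tarfile.TarInfo(name=f"Files/file{i}.txt")
            info.size = len(payload)
            info.mtime = 0  # fixed timestamp keeps the archive reproducible
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


layer_tar = build_layer_tar()

# Diff ID: digest of the uncompressed tar (what the image config refers to).
diff_id = sha256_digest(layer_tar)

# Blob digests: digest of the compressed bytes (what the manifest refers to).
# The same tar is compressed twice with different gzip settings.
blob_fast = gzip.compress(layer_tar, compresslevel=1, mtime=0)
blob_best = gzip.compress(layer_tar, compresslevel=9, mtime=0)

print("diff ID (uncompressed):", diff_id)
print("blob digest, level 1:  ", sha256_digest(blob_fast), len(blob_fast), "bytes")
print("blob digest, level 9:  ", sha256_digest(blob_best), len(blob_best), "bytes")
```

The two compressed blobs have different digests even though they decompress to identical bytes, so the diff ID, and therefore a config that references it, would be unaffected by recompression alone.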

@mthalman
Member Author

> @mthalman Also taking a look at your original output, do you consistently see that the config digest field stays the same? Or does it change between the originally pulled image and tagged version?

It stays the same. My original post includes the config digest output and indicates they are the same. This behavior is consistent.

@ntrappe-msft
Contributor

Alright, that's not great. It looks like the configuration object isn't consistently indicating whether the image is valid. I'll have to find out (1) why layers are changed and (2) why some config objects reflect that change while others do not. Thanks for the confirmation.

Contributor

This issue has been open for 30 days with no updates.
@ntrappe-msft, please provide an update or close this issue.

