
The result of a HealthCheck when the container is stopped during a HealthCheck in progress #25276

Open
Honny1 opened this issue Feb 10, 2025 · 0 comments
Labels: kind/bug (Categorizes issue or PR as related to a bug.)

Honny1 (Member) commented Feb 10, 2025

Issue Description

When you stop the container while a HealthCheck is in progress, the podman healthcheck run command exits with exit code 1 and prints unhealthy. But when you then run podman inspect, the HealthCheck status is reported as healthy with a FailingStreak of 1. I think this behavior is incorrect from some point of view, and the correct solution should be discussed.

Steps to reproduce the issue

  1. podman run -dt --replace --name hc1 --health-cmd 'sleep 25s;echo "hc-done"' quay.io/libpod/alpine:latest sleep 300
  2. podman healthcheck run hc1
  3. From another command line: podman stop hc1
  4. podman inspect hc1
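
A minimal sketch that scripts the steps above, assuming podman is installed and quay.io/libpod/alpine:latest can be pulled (the container name hc1 and the health command are taken verbatim from the steps):

  #!/usr/bin/env bash
  set -x

  # Start a container whose health check runs long enough to be interrupted.
  podman run -dt --replace --name hc1 \
    --health-cmd 'sleep 25s;echo "hc-done"' \
    quay.io/libpod/alpine:latest sleep 300

  # Run the health check in the background so the container can be stopped
  # while the check is still in progress.
  podman healthcheck run hc1 &
  hc_pid=$!

  # Give the check a moment to start, then stop the container
  # (in the manual reproduction this happens from another terminal).
  sleep 2
  podman stop hc1

  # Collect the health check's exit code, then inspect the container state.
  wait "$hc_pid"
  echo "healthcheck exit code: $?"
  podman inspect hc1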

Describe the results you received

The output of podman healthcheck run hc1:
unhealthy
Exit code: 1

The output of podman inspect hc1:

"State": {
               "OciVersion": "1.2.0",
               "Status": "exited",
               "Running": false,
               "Paused": false,
               "Restarting": false,
               "OOMKilled": false,
               "Dead": false,
               "Pid": 0,
               "ExitCode": 137,
               "Error": "",
               "StartedAt": "2025-02-10T17:36:21.390797905+01:00",
               "FinishedAt": "2025-02-10T17:36:59.915870988+01:00",
               "Health": {
                    "Status": "healthy",
                    "FailingStreak": 1,
                    "Log": [
                         {
                              "Start": "2025-02-10T17:36:21.449047969+01:00",
                              "End": "2025-02-10T17:36:46.493166672+01:00",
                              "ExitCode": 0,
                              "Output": "hc-done\n"
                         },
                         {
                              "Start": "2025-02-10T17:36:35.498119684+01:00",
                              "End": "2025-02-10T17:36:59.953785901+01:00",
                              "ExitCode": 1,
                              "Output": ""
                         }
                    ]
               },
               "CheckpointedAt": "0001-01-01T00:00:00Z",
               "RestoredAt": "0001-01-01T00:00:00Z",
               "StoppedByUser": true
          },
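
The inconsistency is easier to see when only the health fields are pulled out of the inspect output. A small sketch using podman inspect --format (Go template syntax over the same JSON shown above; the container name hc1 is from the reproduction steps):

  podman inspect --format \
    'Health.Status={{.State.Health.Status}} FailingStreak={{.State.Health.FailingStreak}}' hc1
  # With the state shown above this would print:
  #   Health.Status=healthy FailingStreak=1
  # even though the last entry in the health log has ExitCode 1.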

Describe the results you expected

On the one hand, the result of podman healthcheck run is correct: the HealthCheck command running in the container was terminated. On the other hand, the container was stopped by the user, so the result should arguably be healthy, because it was the user who terminated the container rather than the health command failing. I think it would be a good idea to add a new state (e.g. "killed") to describe this situation, or to unify the results of podman inspect and podman healthcheck run.
I think we should discuss the right solution.
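
Until the semantics are settled, a caller could in principle distinguish this case by combining the health status with the StoppedByUser field from the same inspect output. This is only a hedged sketch of a possible workaround, not a proposed fix; the field names are taken from the JSON above and the behavior is not guaranteed across Podman versions:

  status=$(podman inspect --format '{{.State.Health.Status}}' hc1)
  stopped_by_user=$(podman inspect --format '{{.State.StoppedByUser}}' hc1)

  if [ "$stopped_by_user" = "true" ]; then
    # The failing check was caused by the user stopping the container, not by
    # the health command itself; treat it as "killed" rather than unhealthy.
    echo "container was stopped by the user; last health result is not meaningful"
  else
    echo "health status: $status"
  fi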

podman info output

host:
  arch: arm64
  buildahVersion: 1.39.0-dev
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-2.fc40.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 98.7
    systemPercent: 0.32
    userPercent: 0.97
  cpus: 6
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: workstation
    version: "40"
  eventLogger: journald
  freeLocks: 2047
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.12.11-100.fc40.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 12456427520
  memTotal: 16722104320
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.13.1-1.fc40.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.13.1
    package: netavark-1.13.1-1.fc40.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.13.1
  ociRuntime:
    name: crun
    package: crun-1.19.1-1.fc40.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.19.1
      commit: 3e32a70c93f5aa5fea69b50256cca7fd4aa23c80
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20250121.g4f2c8e7-2.fc40.aarch64
    version: |
      pasta 0^20250121.g4f2c8e7-2.fc40.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 0h 26m 50.00s
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/jrodak/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/jrodak/.local/share/containers/storage
  graphRootAllocated: 67014492160
  graphRootUsed: 17006985216
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/jrodak/.local/share/containers/storage/volumes
version:
  APIVersion: 5.4.0-dev
  Built: 1739193749
  BuiltTime: Mon Feb 10 14:22:29 2025
  GitCommit: a61e378bac80d1ca2e1642522688adcb78b10aa0
  GoVersion: go1.22.11
  Os: linux
  OsArch: linux/arm64
  Version: 5.4.0-dev

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details


Additional information


Honny1 added the kind/bug label on Feb 10, 2025