[BUG] 'docker compose up --exit-code-from' hangs indefinitely while stopping dependent services #12345

Open
edraze opened this issue Dec 3, 2024 · 7 comments

edraze commented Dec 3, 2024

Description

When running docker compose up --exit-code-from <service1>, the command hangs indefinitely while stopping a dependent service (e.g., service2). The issue occurs even after service1 exits, preventing the entire docker compose up process from completing.

The behavior suggests that Docker Compose struggles to gracefully terminate the dependent service (service2), which may lead to a prolonged or infinite shutdown.

Expected behavior: the whole Compose project (all services) is stopped after the main service exits. A dependent service that does not respond in time (after 10s) is forcefully killed.
Actual behavior: Compose stays in "Stopping" indefinitely, waiting for the service to terminate gracefully.

I tried several solutions, but none of them worked:

  1. Adding stop_signal: SIGKILL to the service2 configuration
  2. Adding stop_grace_period: 10s to the service2 configuration

Logs:
[+] Stopping 4/5 0.0s

  | ✔ Container e2e-67-test-suite-1 Stopped 0.0s
  | ✔ Container e2e-67-exchange-1 Stopped 10.2s
  | ✔ Container e2e-67-stub-collector-1 Stopped 10.2s
  | ✔ Container e2e-67-stub-executor-1 Stopped 10.2s
  | ⠏ Container e2e-67-agent-1 Stopping 414.9s

Steps To Reproduce

  1. Create a Compose file with two services. The first should exit right after it starts. The second should hang and not respond to any signal.
  2. docker compose up --exit-code-from service1

Compose Version

Docker Compose version v2.30.3-desktop.1

Docker Environment

Client: Docker Engine - Community
 Version:    27.3.1
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  ai: Ask Gordon - Docker Agent (Docker Inc.)
    Version:  v0.1.0
    Path:     /usr/lib/docker/cli-plugins/docker-ai
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.18.0-desktop.2
    Path:     /usr/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.30.3-desktop.1
    Path:     /usr/lib/docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.37
    Path:     /usr/lib/docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Alpha) (Docker Inc.)
    Version:  v0.0.15
    Path:     /usr/lib/docker/cli-plugins/docker-desktop
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.2
    Path:     /usr/lib/docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.27
    Path:     /usr/lib/docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.5
    Path:     /usr/lib/docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.4.0
    Path:     /usr/lib/docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /usr/lib/docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.15.0
    Path:     /usr/lib/docker/cli-plugins/docker-scout

Server:
 Containers: 149
  Running: 73
  Paused: 0
  Stopped: 76
 Images: 129
 Server Version: 27.3.1
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 472731909fa34bd7bc9c087e4c27943f9835f111
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.10.14-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 6
 Total Memory: 4.308GiB
 Name: docker-desktop
 ID: 97485626-9c7e-4662-ad8d-c9832bf428af
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Labels:
  com.docker.desktop.address=unix:///home/server/.docker/desktop/docker-cli.sock
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

Anything else?

No response

Collaborator

idsulik commented Dec 3, 2024

Couldn't reproduce it
docker compose up --exit-code-from service1

Container tmp-service2-1  Created
 Container tmp-service1-1  Recreate
 Container tmp-service1-1  Recreated
Attaching to service1-1, service2-1
service1-1  | Exiting immediately
service1-1 exited with code 2
Aborting on container exit...
 Container tmp-service1-1  Stopping
 Container tmp-service2-1  Stopping
 Container tmp-service1-1  Stopped
 Container tmp-service2-1  Stopped

compose.yaml

services:
  service1:
    image: alpine:latest
    command: /bin/sh -c "echo 'Exiting immediately' && exit 2"

  service2:
    image: alpine:latest
    command: /bin/sh -c "trap '' TERM INT && tail -f /dev/null"

Contributor

ndeloof commented Dec 3, 2024

When Compose gets stuck this way, could you inspect the stuck container e2e-67-agent-1?
Docker Compose relies on the engine to stop the container, and the engine should kill it after the stop timeout (10s by default). Compose should not have much to do but wait for the engine to report the container as stopped.

It would also be interesting to capture docker events as you reproduce this error, so we get more details on the container state and how the engine reacts to API calls from Compose.

Author

edraze commented Dec 4, 2024

@ndeloof sending you more information:
docker-events.log
docker-inspect.json

[+] Stopping 4/5 0.0s

  | ✔ Container e2e-89-test-suite-1 Stopped 0.0s
  | ✔ Container e2e-89-exchange-1 Stopped 10.3s
  | ✔ Container e2e-89-stub-collector-1 Stopped 10.2s
  | ✔ Container e2e-89-stub-executor-1 Stopped 10.2s
  | ⠙ Container e2e-89-agent-1 Stopping 806.2s

Interestingly, the docker inspect e2e-89-agent-1 command also hangs until the previously started command (docker compose up --exit-code-from test-suite) is terminated. The service (e2e-89-agent-1) eventually stops once the docker compose up command is terminated manually.
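
Because docker inspect itself blocks in this state, a diagnostic script may want to bound each call with a timeout instead of waiting forever. The following is a minimal Python sketch of that idea, assuming Python 3.7+; run_with_timeout is a hypothetical helper name and is not part of Docker or Compose.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (returncode, stdout), or None if it timed out.

    subprocess.run kills the child process itself when the timeout expires,
    so a hung diagnostic command does not block the calling script.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None
    return done.returncode, done.stdout


# Hypothetical usage against the stuck container from this report:
#   result = run_with_timeout(["docker", "inspect", "e2e-89-agent-1"], 15)
#   if result is None:
#       print("docker inspect did not answer within 15s", file=sys.stderr)
```

A None result only says that the CLI call did not return in time; it does not say why the engine is not answering.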

Contributor

ndeloof commented Dec 6, 2024

OK, so the container exited as expected, and we can see a container kill dfbce29245f7315bca25e5c8f89b499a775576af68aef791a60d5e9b532bed9d event first, then another after the 10s timeout. But there is no following stop or die event to report that the container actually terminated, so Docker Compose is not able to detect the status update.
Sounds like a Docker Engine issue with a missing event. cc @thaJeztah
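
The check described above (a kill with signal 9 that is never followed by a die event) can be automated over a captured events log. This is a rough Python sketch, assuming the default text output of docker events as pasted elsewhere in this thread (timestamp, "container", action, 64-character ID, attributes in parentheses); the function name containers_killed_without_die is made up for illustration.

```python
import re

# One default-format `docker events` line for a container event.
EVENT_RE = re.compile(
    r"^(?P<ts>\S+) container (?P<action>\w+) "
    r"(?P<cid>[0-9a-f]{64})(?: \((?P<attrs>.*)\))?$"
)
# Matches signal=9 as a whole attribute, not as part of another value.
SIGKILL_RE = re.compile(r"(?:^|, )signal=9(?:,|$)")


def containers_killed_without_die(lines):
    """Return {container_id: timestamp of SIGKILL} for containers that
    received signal 9 but never reported a later `die` event."""
    pending = {}
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if not m:
            continue  # network events, blank lines, other formats
        cid, action, attrs = m["cid"], m["action"], m["attrs"] or ""
        if action == "kill" and SIGKILL_RE.search(attrs):
            pending[cid] = m["ts"]
        elif action == "die":
            pending.pop(cid, None)
    return pending
```

Feeding it the lines of a file such as docker-events.log would list the containers that match the pattern in this report; an empty result means every SIGKILL was followed by a die event.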

@thaJeztah
Member

We had some issues in containerd where early exits didn't always propagate; wondering if this is related 🤔 cc @laurazard

Let me also see whether I can reproduce this using v27.4.0-rc.4

@thaJeztah
Member

I made a few attempts to reproduce this with a nightly build of Docker Desktop (with Docker Engine 27.4.0-rc.4), but so far didn't manage to.

docker compose up --exit-code-from service1
[+] Running 3/0
 ✔ Synchronized File Shares                                            0.0s
 ✔ Container early_exit-service2-1  Created                            0.0s
 ✔ Container early_exit-service1-1  Created                            0.0s
Attaching to service1-1, service2-1
service1-1  | Exiting immediately
service1-1 exited with code 2
Aborting on container exit...
[+] Stopping 2/2 Desktop   o View Config   w Enable Watch
 ✔ Container early_exit-service1-1  Stopped                            0.0s
 ✔ Container early_exit-service2-1  Stopped                           10.1s

There is a 10-second delay, which is quite likely the stop timeout: the container does not handle SIGTERM and is therefore forcibly killed after 10 seconds:

time="2024-12-06T11:09:44.670107555Z" level=debug msg="Calling POST /v1.47/containers/3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7/stop" spanID=6ee74b43a6b0c3bf traceID=28de5260372a4b380ec54be139f4bd07
time="2024-12-06T11:09:44.670156430Z" level=debug msg="Sending kill signal 15 to container 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7"
time="2024-12-06T11:09:44.671140722Z" level=debug msg="Calling POST /v1.47/containers/68898b03ce5caf948ae83e479e28fe5ca7c7dc9029e406aaf9b3ea1ec69a6176/stop" spanID=edf610696796a493 traceID=28de5260372a4b380ec54be139f4bd07
time="2024-12-06T11:09:44.676181472Z" level=debug msg="Calling GET /v1.47/containers/3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7/json" spanID=d446c2b4f9104d6e traceID=28de5260372a4b380ec54be139f4bd07
...
time="2024-12-06T11:09:54.681904629Z" level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 spanID=6ee74b43a6b0c3bf traceID=28de5260372a4b380ec54be139f4bd07
time="2024-12-06T11:09:54.682092212Z" level=debug msg="Sending kill signal 9 to container 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7"
time="2024-12-06T11:09:54.698862504Z" level=debug msg=event module=libcontainerd namespace=moby topic=/tasks/exit
time="2024-12-06T11:09:54.699930671Z" level=debug msg="Calling GET /v1.47/containers/3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7/json" spanID=3df7154bc6076d97 traceID=28de5260372a4b380ec54be139f4bd07
time="2024-12-06T11:09:54.705952796Z" level=debug msg=event module=libcontainerd namespace=moby topic=/tasks/delete
time="2024-12-06T11:09:54.705990379Z" level=info msg="ignoring event" container=3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
...
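
The escalation visible in the daemon log above (signal 15, a grace period, then signal 9) can be mimicked outside Docker. The sketch below is only an analogy in plain Python on a POSIX host, not the engine's actual implementation: the child process ignores SIGTERM much like the trap '' TERM INT command in the compose.yaml above, and stop_with_grace and demo are hypothetical names.

```python
import signal
import subprocess
import sys

# Child program that ignores SIGTERM, then signals readiness and sleeps.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_with_grace(proc, grace_s):
    """Send SIGTERM, wait up to grace_s seconds, then fall back to SIGKILL.

    Returns the exit status as reported by Popen.wait().
    """
    proc.terminate()  # SIGTERM (signal 15)
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL (signal 9), cannot be ignored
        return proc.wait()


def demo(grace_s=0.5):
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # block until the SIGTERM handler is installed
    status = stop_with_grace(proc, grace_s)
    proc.stdout.close()
    return status
```

On POSIX, Popen reports a signal death as a negative number, so demo() yields the negated SIGKILL value, while Docker reports the same outcome as 128 + 9 = 137, the exitCode seen in the container die event in this thread.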

Docker Events:

2024-12-06T11:57:00.117973715+01:00 container attach 68898b03ce5caf948ae83e479e28fe5ca7c7dc9029e406aaf9b3ea1ec69a6176 (com.docker.compose.config-hash=022d877be7fc30bb03761e3ff0d4785adfd56b7f5992fde687fd45f6020d5ab1, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service1, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service1-1)
2024-12-06T11:57:00.120627923+01:00 container attach 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 (com.docker.compose.config-hash=6a73f30355f9ed48f7c60f9115f28eda27b6192fa7638d04122f3df93ffe3059, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service2, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service2-1)
2024-12-06T11:57:00.147858798+01:00 network connect 25e2c694d417ddd48eee2c30ac1b82208078c8d8eae4c9fe5efeae8e9d2462b5 (container=3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7, name=early_exit_default, type=bridge)
2024-12-06T11:57:00.149427465+01:00 network connect 25e2c694d417ddd48eee2c30ac1b82208078c8d8eae4c9fe5efeae8e9d2462b5 (container=68898b03ce5caf948ae83e479e28fe5ca7c7dc9029e406aaf9b3ea1ec69a6176, name=early_exit_default, type=bridge)
2024-12-06T11:57:00.286235757+01:00 container start 68898b03ce5caf948ae83e479e28fe5ca7c7dc9029e406aaf9b3ea1ec69a6176 (com.docker.compose.config-hash=022d877be7fc30bb03761e3ff0d4785adfd56b7f5992fde687fd45f6020d5ab1, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service1, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service1-1)
2024-12-06T11:57:00.286654674+01:00 container start 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 (com.docker.compose.config-hash=6a73f30355f9ed48f7c60f9115f28eda27b6192fa7638d04122f3df93ffe3059, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service2, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service2-1)
2024-12-06T11:57:00.361082174+01:00 network disconnect 25e2c694d417ddd48eee2c30ac1b82208078c8d8eae4c9fe5efeae8e9d2462b5 (container=68898b03ce5caf948ae83e479e28fe5ca7c7dc9029e406aaf9b3ea1ec69a6176, name=early_exit_default, type=bridge)
2024-12-06T11:57:00.365547007+01:00 container die 68898b03ce5caf948ae83e479e28fe5ca7c7dc9029e406aaf9b3ea1ec69a6176 (com.docker.compose.config-hash=022d877be7fc30bb03761e3ff0d4785adfd56b7f5992fde687fd45f6020d5ab1, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service1, com.docker.compose.version=2.31.0, execDuration=0, exitCode=2, image=alpine:latest, name=early_exit-service1-1)
2024-12-06T11:57:00.398387090+01:00 container kill 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 (com.docker.compose.config-hash=6a73f30355f9ed48f7c60f9115f28eda27b6192fa7638d04122f3df93ffe3059, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service2, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service2-1, signal=15)
2024-12-06T11:57:10.404813720+01:00 container kill 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 (com.docker.compose.config-hash=6a73f30355f9ed48f7c60f9115f28eda27b6192fa7638d04122f3df93ffe3059, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service2, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service2-1, signal=9)
2024-12-06T11:57:10.481029137+01:00 network disconnect 25e2c694d417ddd48eee2c30ac1b82208078c8d8eae4c9fe5efeae8e9d2462b5 (container=3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7, name=early_exit_default, type=bridge)
2024-12-06T11:57:10.483563553+01:00 container stop 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 (com.docker.compose.config-hash=6a73f30355f9ed48f7c60f9115f28eda27b6192fa7638d04122f3df93ffe3059, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service2, com.docker.compose.version=2.31.0, image=alpine:latest, name=early_exit-service2-1)
2024-12-06T11:57:10.486359803+01:00 container die 3d9fdc5e372b271b9e52704d8b9692003afac8290d397deec7748b59d6d40ed7 (com.docker.compose.config-hash=6a73f30355f9ed48f7c60f9115f28eda27b6192fa7638d04122f3df93ffe3059, com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d, com.docker.compose.oneoff=False, com.docker.compose.project=early_exit, com.docker.compose.project.config_files=/Users/thajeztah/Projects/test/early_exit/compose.yaml, com.docker.compose.project.working_dir=/Users/thajeztah/Projects/test/early_exit, com.docker.compose.service=service2, com.docker.compose.version=2.31.0, execDuration=10, exitCode=137, image=alpine:latest, name=early_exit-service2-1)

Contributor

ndeloof commented Dec 6, 2024

Same: was not able to reproduce
