Commands on Docker host fail with "returned empty string" #644
Comments
Do you have more precise information about which versions of community.docker and ansible-core this used to work with? |
I'm trying to reproduce this locally:
---
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Create temporary container
      community.docker.docker_container:
        name: test
        image: python:3.10-bullseye
        state: started
        auto_remove: true
        command: /bin/bash
        interactive: true
        tty: true
    - name: Add container to inventory
      add_host:
        name: "test"
        ansible_connection: community.docker.docker
        ansible_host: test
        ansible_user: root
        ansible_become: true
        ansible_remote_tmp: /tmp/.ansible
        ansible_python_interpreter: python
    - name: ping
      delegate_to: "test"
      ansible.builtin.ping:
But it works fine for me (I'm running it with
This could also be related to using |
Sorry. No. I wish I did. I'll do some more tests. |
For what it's worth, the Docker versions are 24.0.1 locally and 24.0.2 on the host. Confusingly, running that command directly at the command prompt works fine. I'm pretty sure it didn't yesterday. But the playbook still fails for me.
/usr/bin/docker -H=tcp://192.168.1.168:2375 exec -u root -i tmp_data /bin/sh -c '/bin/sh -c "( umask 77 && mkdir -p \"` echo /tmp/.ansible `\" && mkdir \"` echo /tmp/.ansible/ansible-tmp-1686177373.2244887-17139-101563472711239 `\" && echo ansible-tmp-1686177373.2244887-17139-101563472711239=\"` echo /tmp/.ansible/ansible-tmp-1686177373.2244887-17139-101563472711239 `\" ) && sleep 0"' |
The playbook you posted does work on the machine hosting the docker container, but doesn't work from another machine. Even when adding However, as I said, running that command directly from the other machine does work. Also running Also, I'm open to a better way of connecting to the remote Docker containers than
|
What are the differences between these two machines with regard to Python version, ansible-core version, community.docker version, Docker version, ...? |
I never figured this out, but I worked around it. Thanks for your help. I changed it to connect to the Docker host via SSH instead:
In fact, I think this is an all-round better solution anyway, since it's both more secure and easier. You're using the existing SSH connection, rather than configuring Docker to open a new port, and it uses SSH for security rather than allowing open access. I don't know why the Docker documentation doesn't recommend this approach more prominently for remote access (e.g. https://docs.docker.com/config/daemon/remote-access/). I don't know if it's worth adding something to the docs here. |
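For readers comparing the two approaches, here is a minimal sketch of how the `docker exec` argument list differs between a TCP endpoint and an SSH endpoint. It is not the commenter's actual configuration, which is not shown in this thread. The helper name `docker_exec_argv` and the SSH user `deploy` are hypothetical; the `-H ssh://user@host` form is the Docker CLI's SSH endpoint syntax.

```python
import shlex


def docker_exec_argv(container, command, docker_host=None, user="root"):
    """Build the argv for `docker exec`, optionally targeting a remote daemon.

    Hypothetical helper for illustration only; it is not part of
    community.docker.
    """
    argv = ["docker"]
    if docker_host is not None:
        # -H selects the daemon endpoint: tcp://host:port or ssh://user@host
        argv += ["-H", docker_host]
    argv += ["exec", "-u", user, "-i", container, "/bin/sh", "-c", command]
    return argv


# TCP endpoint as used earlier in this thread (requires an open daemon port).
tcp_argv = docker_exec_argv("tmp_data", "echo test", "tcp://192.168.1.168:2375")

# SSH endpoint; "deploy" is an assumed user name, not taken from the thread.
ssh_argv = docker_exec_argv("tmp_data", "echo test", "ssh://deploy@192.168.1.168")

print(shlex.join(tcp_argv))
print(shlex.join(ssh_argv))
```

With the SSH form, the Docker daemon does not need to listen on a TCP port; access is governed by the existing SSH credentials.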
I have the same problem when connecting to a remote Docker engine. It happens because the exec call checks the command's status too soon, before the command has finished executing. I analysed the packets with Wireshark, and the response from the Docker engine says that the command is still running. Possibly it fails because the exit code is null rather than zero. It would be enough to introduce a delay, or to repeat the call while the command is still running.
{"User":"","Privileged":false,"Tty":false,"AttachStdin":true,"AttachStderr":true,"AttachStdout":true,"Detach":false,"DetachKeys":"","Env":null,"WorkingDir":"","Cmd":["/bin/sh","-c","/bin/sh -c '( umask 77 \u0026\u0026 mkdir -p "
{"Id":"37620a287a45af02f0481d8ef439eee6b3d93483e07c208f2db841400545a8b5"}
HTTP/1.1 200 OK
{"ID":"37620a287a45af02f0481d8ef439eee6b3d93483e07c208f2db841400545a8b5","Running":true,"ExitCode":null,"ProcessConfig":{"tty":false,"entrypoint":"/bin/sh","arguments":["-c","/bin/sh -c '( umask 77 && mkdir -p " |
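The "repeat the call while the command is still running" idea can be sketched as a small polling loop. This only illustrates the suggestion and is not a confirmed fix. `inspect_exec` is a hypothetical callable standing in for whatever returns the exec status; the `Running` and `ExitCode` field names are taken from the response captured above.

```python
import time


def wait_for_exec(inspect_exec, exec_id, timeout=10.0, interval=0.1):
    """Poll an exec's status until it stops running, then return its exit code.

    inspect_exec is a hypothetical callable: it takes an exec ID and returns
    a dict with "Running" and "ExitCode" keys, like the response above.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = inspect_exec(exec_id)
        if not state["Running"]:
            # Only trust ExitCode once the engine reports the exec as finished.
            return state["ExitCode"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"exec {exec_id} still running after {timeout}s")
        time.sleep(interval)


def make_fake_inspect(responses):
    """Return a stand-in inspect function that replays canned responses."""
    iterator = iter(responses)
    return lambda exec_id: next(iterator)


# Simulate an engine that reports "still running" twice before finishing.
fake_inspect = make_fake_inspect([
    {"Running": True, "ExitCode": None},
    {"Running": True, "ExitCode": None},
    {"Running": False, "ExitCode": 0},
])
exit_code = wait_for_exec(fake_inspect, "37620a28", interval=0.01)
print(exit_code)
```

The loop treats a null exit code as "not finished yet" rather than as a failure, which is the distinction the comment above is pointing at.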
SUMMARY
When executing a task delegated to a host using the community.docker.docker connection, the task fails with the
ISSUE TYPE
COMPONENT NAME
ANSIBLE VERSION
DOCKER VERSION
Docker Host
Ansible Container
I have tried to upgrade the binary in the Ansible container to version 24, but the issue still persists.
COLLECTION VERSION
ansible-galaxy collection list community.docker
CONFIGURATION
OS / ENVIRONMENT
Docker Desktop
Ansible Container
Target Container
STEPS TO REPRODUCE
Create the following 3 files in WSL.
/Dockerfile
FROM python:3.12-slim AS ansible
RUN set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        docker.io \
    ; \
    pip install --upgrade pip; \
    pip install --no-cache-dir \
        boto \
        boto3 \
        ansible \
    ;
/docker-compose.yaml
version: "3.8"
services:
  ansible:
    stdin_open: true
    tty: true
    build:
      target: ansible
    stop_signal: SIGKILL
    container_name: "ansible"
    volumes:
      - ./task.yaml:/task.yaml
  alpine:
    stdin_open: true
    tty: true
    image: "alpine:latest"
    container_name: "alpine"
    stop_signal: SIGKILL
/task.yaml
---
- hosts: localhost
  connection: local
  vars:
    docker_extra_args: "-H=tcp://host.docker.internal:2375"
    user: root
  tasks:
    - name: Add container to inventory
      ansible.builtin.add_host:
        name: alpine
        ansible_connection: community.docker.docker
        ansible_docker_extra_args: "{{docker_extra_args}}"
    - name: "Delegating command to docker container"
      delegate_to: alpine
      ansible.builtin.command: "echo 'test'"
Start the docker containers, then execute the task in the ansible container:
$ docker compose up --detach
$ docker exec -u root -i ansible /bin/sh -c "ansible-playbook task.yaml"
EXPECTED RESULTS
The expected result would be the task completing successfully. Docker Desktop version 4.18.0 is the last version in which this would work. Since the bug appeared after a Docker Desktop update, you might think the issue is related to Docker, but that's not the case; please see my analysis.
ACTUAL RESULTS
Debugging
@calillo had the idea of analysing the traffic exchanged between the docker binary and the Docker engine. I did the same to try to figure out what is happening and discovered some additional key information.
Docker Binary Behavior
According to my findings, when using the
First Connection
Second Connection
Comparing against a functioning version
To find out what is happening, I sniffed packets from 3 different scenarios:
We'll concentrate on the second connection, where we are expecting some output.
Docker Engine v1.43 (Docker Desktop 4.24.1) via Ansible
When inspecting the packets of the second connection, something is odd: the FIN flag is initiated by the binary. How does it know that the connection needs to be closed?
After the second connection closes, the first connection queries the API to get the summary of the exec command. We can see that it was prematurely closed since it's still
Docker Engine v1.43 (Docker Desktop 4.24.1) via Bash
If we run the same command that fails in Ansible in bash within the ansible container (so that it runs from the same environment), it executes successfully with the expected output.
# From WSL
$ docker exec -i ansible bash
# Within Ansible container
$ docker -H=tcp://host.docker.internal:2375 exec -u root -i alpine /bin/sh -c "/bin/sh -c '( umask 77 && mkdir -p "` echo ~/.ansible/tmp `"&& mkdir "` echo ~/.ansible/tmp/ansible-tmp-1697223552.034312-223-168432577105606 `" && echo ansible-tmp-1697223552.034312-223-168432577105606="` echo ~/.ansible/tmp/ansible-tmp-1697223552.034312-223-168432577105606 `" ) && sleep 0'"
ansible-tmp-1697223552.034312-223-168432577105606=/root/.ansible/tmp/ansible-tmp-1697223552.034312-223-168432577105606
$
When inspecting the packets, there is a big difference: the FIN flag is initiated by the Engine once it finishes streaming the command. This is the behaviour I would expect.
Docker Engine v1.41 (Docker Desktop 4.18.0) via Ansible
Why did it work with the previous Docker engine? Let's inspect the packet flow. We get the same behaviour as with the newest engine, but luckily we get the output in between the FIN request and the FIN confirmation.
Since we managed to get the expected response from the exec command, when the first connection queries the API about the summary of the command,
Conclusion
From my understanding, this is not a new bug. It has been dormant, and the changes to the engine caused it to surface. The biggest question is: what causes Ansible to close the second connection prematurely? Maybe the handle where the socket is piped is closed, thus cascading upward? |
I cannot reproduce this. I guess the main difference is that I use Linux, not WSL. My Docker client in the container is exactly the same as yours, and my Docker daemon is also 24.0.6, though a slightly different build:
(I'm using Arch Linux with Docker from the system packages.) |
In my case the Docker engine is on a remote VM (not on localhost). In this scenario it practically always happens. |
I did now try this with a Docker daemon on a remote machine reached via TCP. Same result: it works flawlessly for me. The connection plugin also isn't doing any magic (assuming you don't use |
I have also tested it on Arch Linux and, as for @felixfontein, it works. I took the opportunity to packet sniff the exchange, and we are getting the same weird behavior of Ansible or the binary sending the FIN flag as soon as the connection is upgraded.
As @felixfontein mentioned, there's not a lot happening in the connection plugin, but the binary behaves differently than it does in a shell. When I have some time later this week, I'll try to isolate running the docker binary in a simple python script using |
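An isolation test along these lines might look like the sketch below. It assumes the relevant difference from an interactive shell is that the child process gets pipes for stdin, stdout, and stderr; that is an assumption about what matters here, not a confirmed description of the connection plugin. The helper name `run_with_piped_stdin` is hypothetical, and a Python one-liner stands in for the docker binary so the snippet runs without a Docker daemon.

```python
import subprocess
import sys


def run_with_piped_stdin(argv, in_data=b""):
    """Run argv with stdin/stdout/stderr attached to pipes.

    communicate() writes in_data to the child's stdin, closes stdin, and then
    reads stdout and stderr until the child exits.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    stdout, stderr = proc.communicate(in_data)
    return proc.returncode, stdout, stderr


# Stand-in command. For a real test, replace this argv with the failing
# docker invocation from this thread, e.g.
# ["docker", "-H=tcp://host.docker.internal:2375", "exec", "-u", "root",
#  "-i", "alpine", "/bin/sh", "-c", "echo test"].
rc, out, err = run_with_piped_stdin([sys.executable, "-c", "print('test')"])
print(rc, out, err)
```

If the docker invocation returns empty output here but prints `test` in a terminal, that would point at the early close of stdin rather than at Ansible itself.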
@jfjauvin did you have a chance to try this out? |
SUMMARY
I'm unable to run commands on a docker host. This used to work.
ISSUE TYPE
COMPONENT NAME
community.docker
ANSIBLE VERSION
COLLECTION VERSION
CONFIGURATION
OS / ENVIRONMENT
Running Ansible in vscode dev container (mcr.microsoft.com/vscode/devcontainers/python:0-3.10). Docker version 24.0.1, build 6802122.
STEPS TO REPRODUCE
I have a role that includes the following tasks:
EXPECTED RESULTS
I expect this to work.
ACTUAL RESULTS
Running with -vvv, everything works up to the last task, which fails:
Seems odd that the command has '/bin/sh', '-c' twice. Running the same docker command at the command line also doesn't work (it just hangs with no output), but running the command again without the duplicate /bin/sh -c works as I'd expect.
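On the doubled `/bin/sh -c`: a nested shell invocation is not wrong in itself, as the small local check below suggests. It assumes a POSIX `/bin/sh` is available and does not involve Docker at all, so it says nothing about why the real command hung; the command string is a simplified stand-in for the one Ansible generates.

```python
import shlex
import subprocess

# Inner layer: a shell command line that itself invokes /bin/sh -c.
inner = "/bin/sh -c " + shlex.quote("echo test && sleep 0")

# Outer layer: run the inner command line through another /bin/sh -c.
argv = ["/bin/sh", "-c", inner]

result = subprocess.run(argv, capture_output=True, text=True, check=False)
print(result.returncode, repr(result.stdout))
```

The outer shell parses the quoted string, starts the inner shell, and the inner shell runs the actual command, so the output is the same as with a single layer.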