| 1 | +# Shell TSG |
| 2 | + |
| 3 | +**EXPERIMENTAL: `retina shell` is an experimental feature, so the flags and behavior may change in future versions.** |
| 4 | + |
| 5 | +The `retina shell` command allows you to start an interactive shell on a Kubernetes node or pod. This runs a container image with many common networking tools installed (`ping`, `curl`, etc.). |
| 6 | + |
| 7 | +## Testing connectivity |
| 8 | + |
| 9 | +Start a shell on a node or inside a pod: |
| 10 | + |
| 11 | +```bash |
| 12 | +# To start a shell in a node (root network namespace): |
| 13 | +kubectl retina shell aks-nodepool1-15232018-vmss000001 |
| 14 | + |
| 15 | +# To start a shell inside a pod (pod network namespace): |
| 16 | +kubectl retina shell -n kube-system pods/coredns-d459997b4-7cpzx |
| 17 | +``` |
| 18 | + |
| 19 | +Check connectivity using `ping`: |
| 20 | + |
| 21 | +```text |
| 22 | +root [ / ]# ping 10.224.0.4 |
| 23 | +PING 10.224.0.4 (10.224.0.4) 56(84) bytes of data. |
| 24 | +64 bytes from 10.224.0.4: icmp_seq=1 ttl=64 time=0.964 ms |
| 25 | +64 bytes from 10.224.0.4: icmp_seq=2 ttl=64 time=1.13 ms |
| 26 | +64 bytes from 10.224.0.4: icmp_seq=3 ttl=64 time=0.908 ms |
| 27 | +64 bytes from 10.224.0.4: icmp_seq=4 ttl=64 time=1.07 ms |
| 28 | +64 bytes from 10.224.0.4: icmp_seq=5 ttl=64 time=1.01 ms |
| 29 | +
|
| 30 | +--- 10.224.0.4 ping statistics --- |
| 31 | +5 packets transmitted, 5 received, 0% packet loss, time 4022ms |
| 32 | +rtt min/avg/max/mdev = 0.908/1.015/1.128/0.077 ms |
| 33 | +``` |
| 34 | + |
| 35 | +Check DNS resolution using `dig`: |
| 36 | + |
| 37 | +```text |
| 38 | +root [ / ]# dig example.com +short |
| 39 | +93.184.215.14 |
| 40 | +``` |
| 41 | + |
| 42 | +The tools `nslookup` and `drill` are also available if you prefer those. |
| 43 | + |
| 44 | +Check connectivity to apiserver using `nc` and `curl`: |
| 45 | + |
| 46 | +```text |
| 47 | +root [ / ]# nc -zv 10.0.0.1 443 |
| 48 | +Ncat: Version 7.95 ( https://nmap.org/ncat ) |
| 49 | +Ncat: Connected to 10.0.0.1:443. |
| 50 | +Ncat: 0 bytes sent, 0 bytes received in 0.06 seconds. |
| 51 | +
|
| 52 | +root [ / ]# curl -k https://10.0.0.1 |
| 53 | +{ |
| 54 | + "kind": "Status", |
| 55 | + "apiVersion": "v1", |
| 56 | + "metadata": {}, |
| 57 | + "status": "Failure", |
| 58 | + "message": "Unauthorized", |
| 59 | + "reason": "Unauthorized", |
| 60 | + "code": 401 |
| 61 | +} |
| 62 | +``` |
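The DNS and TCP checks above can be chained so that a port is only probed once the name resolves. This is a minimal sketch rather than anything shipped with Retina: `check_endpoint` is a hypothetical helper name, and it assumes `dig` and `nc` behave as in the examples above.

```bash
# Hypothetical helper (not part of Retina): resolve a name, then test a TCP port.
# Usage: check_endpoint <hostname> <port>
check_endpoint() {
  local name="$1" port="$2" addr

  # `dig +short` prints one record per line (CNAMEs first); keep the last line.
  addr="$(dig +short "$name" | tail -n 1)"
  if [ -z "$addr" ]; then
    echo "DNS: no answer for $name" >&2
    return 1
  fi
  echo "DNS: $name -> $addr"

  # Same flags as the apiserver example: -z only connects, -v reports the result.
  nc -zv "$addr" "$port"
}
```

Inside the shell, `check_endpoint example.com 443` would run both steps. A failure at the DNS step points at resolution, while a failure at the `nc` step points at routing or filtering.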
| 63 | + |
| 64 | +### nftables and iptables |
| 65 | + |
| 66 | +Accessing nftables and iptables rules requires `NET_RAW` and `NET_ADMIN` capabilities. |
| 67 | + |
| 68 | +```bash |
| 69 | +kubectl retina shell aks-nodepool1-15232018-vmss000002 --capabilities NET_ADMIN,NET_RAW |
| 70 | +``` |
| 71 | + |
| 72 | +Then you can run `iptables` and `nft`: |
| 73 | + |
| 74 | +```text |
| 75 | +root [ / ]# iptables -nvL | head -n 2 |
| 76 | +Chain INPUT (policy ACCEPT 1191K packets, 346M bytes) |
| 77 | + pkts bytes target prot opt in out source destination |
| 78 | +root [ / ]# nft list ruleset | head -n 2 |
| 79 | +# Warning: table ip filter is managed by iptables-nft, do not touch! |
| 80 | +table ip filter { |
| 81 | +``` |
| 82 | + |
| 83 | +**If you see the error "Operation not permitted (you must be root)", check that your `kubectl retina shell` command sets `--capabilities NET_RAW,NET_ADMIN`.** |
| 84 | + |
| 85 | +`iptables` in the shell image uses `iptables-legacy`, which may or may not match the configuration on the node. For example, Ubuntu maps `iptables` to `iptables-nft`. To use the exact same `iptables` binary as installed on the node, you will need to `chroot` into the host filesystem (see below). |
| 86 | + |
| 87 | +## Accessing the host filesystem |
| 88 | + |
| 89 | +On nodes, you can mount the host filesystem to `/host`: |
| 90 | + |
| 91 | +```bash |
| 92 | +kubectl retina shell aks-nodepool1-15232018-vmss000002 --mount-host-filesystem |
| 93 | +``` |
| 94 | + |
| 95 | +This mounts the host filesystem (`/`) to `/host` in the debug pod: |
| 96 | + |
| 97 | +```text |
| 98 | +root [ / ]# ls /host |
| 99 | +NOTICE.txt bin boot dev etc home lib lib64 libx32 lost+found media mnt opt proc root run sbin srv sys tmp usr var |
| 100 | +``` |
| 101 | + |
| 102 | +The host filesystem is mounted read-only by default. If you need write access, use the `--allow-host-filesystem-write` flag. |
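To keep write access an explicit choice, the two flags can be wrapped in a small function. This is only a sketch: `retina_host_shell` is a hypothetical name, and it assumes that passing `--mount-host-filesystem` together with `--allow-host-filesystem-write` is accepted by the CLI.

```bash
# Hypothetical wrapper (not part of Retina): open a node shell with /host mounted.
# Usage: retina_host_shell <node> [rw]
retina_host_shell() {
  local node="$1" mode="${2:-ro}"
  local args=(--mount-host-filesystem)

  # Write access is opt-in; without "rw" the mount stays read-only.
  if [ "$mode" = "rw" ]; then
    args+=(--allow-host-filesystem-write)
  fi

  kubectl retina shell "$node" "${args[@]}"
}
```

For example, `retina_host_shell aks-nodepool1-15232018-vmss000002` opens a read-only session, and adding `rw` as a second argument requests write access.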
| 103 | + |
| 104 | +Symlinks between files on the host filesystem may not resolve correctly. If you see "No such file or directory" errors for symlinks, try following the instructions below to `chroot` to the host filesystem. |
| 105 | + |
| 106 | +## Chroot to the host filesystem |
| 107 | + |
| 108 | +`chroot` requires the `SYS_CHROOT` capability: |
| 109 | + |
| 110 | +```bash |
| 111 | +kubectl retina shell aks-nodepool1-15232018-vmss000002 --mount-host-filesystem --capabilities SYS_CHROOT |
| 112 | +``` |
| 113 | + |
| 114 | +Then you can use `chroot` to start a shell inside the host filesystem: |
| 115 | + |
| 116 | +```text |
| 117 | +root [ / ]# chroot /host bash |
| 118 | +root@aks-nodepool1-15232018-vmss000002:/# cat /etc/resolv.conf | tail -n 2 |
| 119 | +nameserver 168.63.129.16 |
| 120 | +search shncgv2kgepuhm1ls1dwgholsd.cx.internal.cloudapp.net |
| 121 | +``` |
| 122 | + |
| 123 | +`chroot` allows you to: |
| 124 | + |
| 125 | +* Execute binaries installed on the node. |
| 126 | +* Resolve symlinks that point to files in the host filesystem (such as `/etc/resolv.conf` -> `/run/systemd/resolve/resolv.conf`). |
| 127 | +* Use `sysctl` to view or modify kernel parameters. |
| 128 | +* Use `journalctl` to view systemd unit and kernel logs. |
| 129 | +* Use `ip netns` to view network namespaces. (However, `ip netns exec` does not work.) |
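The capabilities listed above can also be used without an interactive `chroot` session by prefixing each command with `chroot /host`. The sketch below assumes the shell was started with `--mount-host-filesystem --capabilities SYS_CHROOT`; `host_snapshot` is a hypothetical helper name, and the kernel parameter shown is only an example.

```bash
# Hypothetical helper (not part of Retina); run it inside the retina shell.
host_snapshot() {
  # Read one kernel parameter using the node's own sysctl binary.
  chroot /host sysctl net.ipv4.ip_forward
  # Show the last 20 kernel log lines from the node's journal.
  chroot /host journalctl -k -n 20 --no-pager
  # List the named network namespaces known to the node.
  chroot /host ip netns list
}
```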
| 130 | + |
| 131 | +## Systemctl |
| 132 | + |
| 133 | +`systemctl` commands require both `chroot` to the host filesystem and host PID: |
| 134 | + |
| 135 | +```bash |
| 136 | +kubectl retina shell aks-nodepool1-15232018-vmss000002 --mount-host-filesystem --capabilities SYS_CHROOT --host-pid |
| 137 | +``` |
| 138 | + |
| 139 | +Then `chroot` to the host filesystem and run `systemctl status`: |
| 140 | + |
| 141 | +```text |
| 142 | +root [ / ]# chroot /host systemctl status | head -n 2 |
| 143 | +● aks-nodepool1-15232018-vmss000002 |
| 144 | + State: running |
| 145 | +``` |
| 146 | + |
| 147 | +**If `systemctl` shows an error "Failed to connect to bus: No data available", check that the `retina shell` command has `--host-pid` set and that you have chroot'd to `/host`.** |
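Beyond `systemctl status`, the same pattern works for checking a single unit. The following is a sketch under the same requirements (`--mount-host-filesystem --capabilities SYS_CHROOT --host-pid`); `host_unit_status` is a hypothetical helper name, and the unit name is whatever exists on your node.

```bash
# Hypothetical helper (not part of Retina); run it inside the retina shell.
# Usage: host_unit_status <unit>
host_unit_status() {
  local unit="$1"
  # Print the state of one unit (for example: active, inactive, failed).
  chroot /host systemctl is-active "$unit"
  # List every unit currently in the failed state.
  chroot /host systemctl list-units --state=failed --no-pager
}
```

For example, `host_unit_status kubelet` would report on the kubelet, assuming the node runs it as a systemd unit with that name.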
| 148 | + |
| 149 | +## Troubleshooting |
| 150 | + |
| 151 | +### Timeouts |
| 152 | + |
| 153 | +If `kubectl retina shell` fails with a timeout error, then: |
| 154 | + |
| 155 | +1. Increase the timeout by setting the `--timeout` flag. |
| 156 | +2. Check the pod using `kubectl describe pod` to determine why retina shell is failing to start. |
| 157 | + |
| 158 | +Example: |
| 159 | + |
| 160 | +```bash |
| 161 | +kubectl retina shell --timeout 10m node001 # increase timeout to 10 minutes |
| 162 | +``` |
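For the second step, the pod can be inspected as sketched below. The pod name and namespace are not stated in this guide, so they are passed as arguments here; `inspect_shell_pod` is a hypothetical helper name, and `kubectl get pods --all-namespaces` can help locate the pod first.

```bash
# Hypothetical helper (not part of Retina): inspect a pod that failed to start.
# Usage: inspect_shell_pod <namespace> <pod>
inspect_shell_pod() {
  local namespace="$1" pod="$2"
  # Phase, restart count, and node placement at a glance.
  kubectl get pod -n "$namespace" "$pod" -o wide
  # Full details; the Events section at the end often names the cause
  # (for example, an image pull or scheduling failure).
  kubectl describe pod -n "$namespace" "$pod"
}
```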
| 163 | + |
| 164 | +### Firewalls and ImagePullBackoff |
| 165 | + |
| 166 | +Some clusters are behind a firewall that blocks pulling the retina-shell image. To work around this: |
| 167 | + |
| 168 | +1. Replicate the retina-shell images to a container registry accessible from within the cluster. |
| 169 | +2. Override the image used by Retina CLI with the environment variable `RETINA_SHELL_IMAGE_REPO`. |
| 170 | + |
| 171 | +Example: |
| 172 | + |
| 173 | +```bash |
| 174 | +export RETINA_SHELL_IMAGE_REPO="example.azurecr.io/retina/retina-shell" |
| 175 | +export RETINA_SHELL_IMAGE_VERSION=v0.0.1 # optional; defaults to the Retina CLI version if unset. |
| 176 | +kubectl retina shell node0001 # this will use the image "example.azurecr.io/retina/retina-shell:v0.0.1" |
| 177 | +``` |
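The replication step can be done with any registry tooling. As one possible sketch, the function below copies a single tag with Docker from a machine that can reach both registries. `mirror_image` is a hypothetical helper name, and the upstream image reference is not given in this guide, so it is an argument rather than a hard-coded value.

```bash
# Hypothetical helper (not part of Retina): copy one image tag between registries.
# Usage: mirror_image <source-image:tag> <target-image:tag>
mirror_image() {
  local src="$1" dst="$2"
  # Stop at the first failing step so a failed pull is never pushed.
  docker pull "$src" &&
    docker tag "$src" "$dst" &&
    docker push "$dst"
}
```

The target passed to `mirror_image` should match `RETINA_SHELL_IMAGE_REPO` and `RETINA_SHELL_IMAGE_VERSION`, for example `example.azurecr.io/retina/retina-shell:v0.0.1`.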
| 178 | + |
| 179 | +## Limitations |
| 180 | + |
| 181 | +* Windows nodes and pods are not yet supported. |
| 182 | +* `bpftool` and `bpftrace` are not supported. |
| 183 | +* The shell image links `iptables` commands to `iptables-legacy`, even if the node itself links to `iptables-nft`. |
| 184 | +* `nsenter` is not supported. |
| 185 | +* `ip netns` will not work without `chroot` to the host filesystem. |
| 186 | + |