# Installing the recovery SSHD pod
This page explains how to install a temporary SSH server pod for break-glass recovery.
Use this when normal Kubernetes access is degraded, for example after the API server certificate expires or rotates and you need to retrieve updated host-side credentials.
The SSHD pod is intended for recovery and debugging only. Remove it when you are done.
## What this does
The recovery pod starts an SSH server on the selected node and authorizes your local SSH public key.
The pod also mounts selected host paths under `/host`, so you can inspect the host filesystem and run some host-side recovery commands through `chroot`.
For example:
```sh
chroot /host /bin/sh -lc 'rc-status'
chroot /host /bin/sh -lc 'rc-service crio status'
chroot /host /bin/sh -lc 'rc-service kubelet status'
```
## Requirements
You need:
- A working `kubectl` connection to the cluster.
- Access to the `node-agent` DaemonSet in the `mono-system` namespace.
- A local SSH public key, usually `~/.ssh/id_rsa.pub` or `~/.ssh/id_ed25519.pub`.
Use a public key file only. Do not pass your private key.
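A quick format check can catch the wrong file before it is piped into the cluster. The following is a minimal sketch, not part of monok8s: `is_public_key` is a hypothetical helper name, and it only looks at the text format of the file.
```bash
# Hypothetical helper: succeed only if the file looks like an OpenSSH public key.
is_public_key() {
  key_file=$1
  # Private key files contain a "PRIVATE KEY" marker line; reject them.
  if grep -q 'PRIVATE KEY' "$key_file"; then
    return 1
  fi
  # Public key lines start with a key type (ssh-*, ecdsa-*, sk-*),
  # followed by a space and base64 data.
  grep -Eq '^(ssh-|ecdsa-|sk-)[^ ]+ [A-Za-z0-9+/=]+' "$key_file"
}
```
For example, `is_public_key ~/.ssh/id_ed25519.pub || echo "not a public key"` stays silent for a well-formed public key file.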
## Generate the SSHD manifest
To print the recovery SSHD manifest to stdout without applying it:
```bash
kubectl exec -i -n mono-system ds/node-agent -- \
ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub
```
This reads your local public key and places it into the generated pod's `authorized_keys`.
If you use Ed25519 keys, use:
```bash
kubectl exec -i -n mono-system ds/node-agent -- \
ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_ed25519.pub
```
## Generate and apply the manifest
To create the recovery SSHD resources in one step:
```bash
kubectl exec -i -n mono-system ds/node-agent -- \
ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub \
| kubectl apply -f -
```
For Ed25519:
```bash
kubectl exec -i -n mono-system ds/node-agent -- \
ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_ed25519.pub \
| kubectl apply -f -
```
## Why `-i` is used instead of `-it`
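Use `-i`, not `-it`, when piping the SSH public key.
The `-t` option allocates a pseudo-TTY, which is meant for interactive sessions. A TTY can translate line endings and interpret control characters, so it may alter the key passed through stdin or the manifest written to stdout. When stdin is redirected from a file, `kubectl` typically also warns that it cannot use a TTY.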
Correct:
```bash
kubectl exec -i -n mono-system ds/node-agent -- \
ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub
```
Avoid:
```bash
kubectl exec -it -n mono-system ds/node-agent -- \
ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub
```
## Check that the pod is running
After applying the manifest, check the pod:
```bash
kubectl get pods -n mono-system -l app.kubernetes.io/name=sshd
```
Check the service:
```bash
kubectl get svc -n mono-system -l app.kubernetes.io/name=sshd
```
If the pod does not start, inspect it:
```bash
kubectl describe pod -n mono-system -l app.kubernetes.io/name=sshd
```
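If you would rather block until the pod is ready than poll `kubectl get pods`, `kubectl wait` can do that. This is a minimal sketch, assuming the pod carries the same `app.kubernetes.io/name=sshd` label used above; `wait_for_sshd` is a hypothetical wrapper name and the 120-second default is an arbitrary choice.
```bash
# Hypothetical wrapper: block until the recovery pod reports Ready, or time out.
# Optional first argument overrides the timeout (for example "30s").
wait_for_sshd() {
  kubectl wait pod -n mono-system \
    -l app.kubernetes.io/name=sshd \
    --for=condition=Ready \
    --timeout="${1:-120s}"
}
```
Note that `kubectl wait` exits with an error if no pod matches the label yet, so run it after the manifest has been applied.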
## Connect through SSH
The exact SSH command depends on how the generated service exposes the pod.
If the service uses a NodePort such as `30022`, connect with:
```bash
ssh -p 30022 root@<node-ip>
```
Replace `<node-ip>` with the node's reachable IP address.
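If you are unsure which port the service exposes, you can read it from the service object. This is a minimal sketch, assuming the generated service is of type NodePort, carries the `app.kubernetes.io/name=sshd` label, and lists the SSH port first; `sshd_node_port` is a hypothetical helper name.
```bash
# Hypothetical helper: print the nodePort of the first port on the first
# matching service. Prints nothing if that port has no nodePort set.
sshd_node_port() {
  kubectl get svc -n mono-system \
    -l app.kubernetes.io/name=sshd \
    -o jsonpath='{.items[0].spec.ports[0].nodePort}'
}
```
You could then connect with `ssh -p "$(sshd_node_port)" root@<node-ip>`. `kubectl get nodes -o wide` lists node addresses in its `INTERNAL-IP` and `EXTERNAL-IP` columns.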
## Access the host environment
Inside the SSH session, the host filesystem is available under `/host`.
Useful checks:
```sh
ls -la /host
chroot /host /bin/sh -lc 'rc-status'
chroot /host /bin/sh -lc 'rc-service crio status'
chroot /host /bin/sh -lc 'rc-service kubelet status'
```
Restart CRI-O:
```sh
chroot /host /bin/sh -lc 'rc-service crio restart'
```
Restart kubelet:
```sh
chroot /host /bin/sh -lc 'rc-service kubelet restart'
```
You can also inspect host processes from the pod because the recovery pod uses the host PID namespace:
```sh
# The [k] and [c] brackets keep grep from matching its own process.
ps aux | grep -E '[k]ubelet|[c]rio'
```
## Notes for monok8s host mounts
The recovery pod does not mount host `/` directly.
On monok8s, `/` and `/var` may be private mounts. Mounting them directly as host paths can fail with errors such as:
```text
path "/" is mounted on "/" but it is not a shared or slave mount
```
or:
```text
path "/var" is mounted on "/var" but it is not a shared or slave mount
```
Instead, the recovery pod assembles a minimal host root under `/host` from individual host paths.
For `/var`, it uses the backing path:
```text
/data/var -> /host/var
```
This avoids the mount propagation error shown above.
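To see which propagation mode a mount has, you can read the optional fields in a `mountinfo` file: a `shared:N` or `master:N` tag marks a shared or slave mount, and no tag means the mount is private. This is a minimal sketch; `mount_propagation` is a hypothetical helper name, and it expects `mountinfo`-formatted lines on stdin.
```sh
# Hypothetical helper: print the propagation tags for a mount point,
# or "private" if it has none. Reads mountinfo-formatted lines from stdin.
mount_propagation() {
  awk -v target="$1" '
    $5 == target {
      tags = ""
      # Optional fields start at field 7 and end at the "-" separator.
      for (i = 7; i <= NF && $i != "-"; i++) tags = tags (tags ? " " : "") $i
      # If several mounts share the mount point, the last one listed wins.
      result = tags ? tags : "private"
    }
    END { if (result != "") print result }
  '
}
```
On the host, `mount_propagation /var < /proc/self/mountinfo` reports the mode for `/var`. From the recovery pod, `/proc/1/mountinfo` should reflect the host's view, since the pod shares the host PID namespace.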
## Remove the recovery pod
When recovery is complete, remove the generated resources.
If the resources use the default SSHD labels:
```bash
kubectl delete deployment -n mono-system -l app.kubernetes.io/name=sshd
kubectl delete service -n mono-system -l app.kubernetes.io/name=sshd
kubectl delete configmap -n mono-system -l app.kubernetes.io/name=sshd
```
If your generated manifest uses fixed resource names, you can also remove the resources by name:
```bash
kubectl delete deployment -n mono-system sshd
kubectl delete service -n mono-system sshd
kubectl delete configmap -n mono-system sshd-authorized-keys
```
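After deleting, it is worth confirming that nothing was left behind. This is a minimal sketch, assuming the resources carry the default `app.kubernetes.io/name=sshd` label; `sshd_leftovers` is a hypothetical helper name.
```bash
# Hypothetical helper: list any recovery SSHD resources that still exist.
# Prints one "kind/name" line per remaining resource, or nothing if all are gone.
sshd_leftovers() {
  kubectl get deployment,service,configmap -n mono-system \
    -l app.kubernetes.io/name=sshd \
    -o name
}
```
Empty output from `sshd_leftovers` means no labeled resources remain. Resources created without the label will not show up here, so check those by name.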
## Security warning
This pod is powerful.
It runs with root-level recovery access and can inspect or modify host files through `/host`. Treat it as a temporary break-glass tool, not a normal service.
Do not leave it running after recovery.