Installing the recovery SSHD pod

This page explains how to install a temporary SSH server pod for break-glass recovery.

Use this when normal Kubernetes access is degraded, for example after the API server certificate expires or rotates and you need to retrieve updated host-side credentials.

The SSHD pod is intended for recovery and debugging only. Remove it when you are done.

What this does

The recovery pod starts an SSH server on the selected node and authorizes your local SSH public key.

The pod also mounts selected host paths under /host, so you can inspect the host filesystem and run some host-side recovery commands through chroot.

For example:

chroot /host /bin/sh -lc 'rc-status'
chroot /host /bin/sh -lc 'rc-service crio status'
chroot /host /bin/sh -lc 'rc-service kubelet status'

Requirements

You need:

  • A working kubectl connection to the cluster.
  • Access to the node-agent DaemonSet in the mono-system namespace.
  • A local SSH public key, usually ~/.ssh/id_rsa.pub or ~/.ssh/id_ed25519.pub.

Use a public key file only. Do not pass your private key.
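
If you do not have a key pair yet, generate one locally first. For example, to create an Ed25519 key pair at the default path:

ssh-keygen -t ed25519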

Generate the SSHD manifest

To print the recovery SSHD manifest:

kubectl exec -i -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub

This reads your local public key and places it into the generated pod's authorized_keys.

If you use Ed25519 keys, use:

kubectl exec -i -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_ed25519.pub
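
Because the manifest is written to stdout, you can also redirect it to a local file and review it before applying. The file name here is only an example:

kubectl exec -i -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub > sshd-recovery.yaml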

Generate and apply the manifest

To create the recovery SSHD resources in one step:

kubectl exec -i -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub \
  | kubectl apply -f -

For Ed25519:

kubectl exec -i -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_ed25519.pub \
  | kubectl apply -f -

Why -i is used instead of -it

Use -i, not -it, when piping the SSH public key.

The -t option allocates a pseudo-TTY. A pseudo-TTY can alter data passed through stdin, for example by translating line endings or echoing input, which is not what you want when piping an SSH public key.

Correct:

kubectl exec -i -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub

Avoid:

kubectl exec -it -n mono-system ds/node-agent -- \
  ctl create sshd --authkeys /dev/stdin < ~/.ssh/id_rsa.pub

Check that the pod is running

After applying the manifest, check the pod:

kubectl get pods -n mono-system -l app.kubernetes.io/name=sshd

Check the service:

kubectl get svc -n mono-system -l app.kubernetes.io/name=sshd

If the pod does not start, inspect it:

kubectl describe pod -n mono-system -l app.kubernetes.io/name=sshd
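
The pod's logs can also help (assuming the default SSHD labels shown above):

kubectl logs -n mono-system -l app.kubernetes.io/name=sshd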

Connect through SSH

The exact SSH command depends on how the generated service exposes the pod.

If the service uses a NodePort such as 30022, connect with:

ssh -p 30022 root@<node-ip>

Replace <node-ip> with the node's reachable IP address.
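
If you are unsure of the node IP or the assigned NodePort, you can look them up with kubectl. This is a sketch that assumes the default SSHD labels and a single port on the generated service:

kubectl get nodes -o wide
kubectl get svc -n mono-system -l app.kubernetes.io/name=sshd \
  -o jsonpath='{.items[0].spec.ports[0].nodePort}{"\n"}'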

Access the host environment

Inside the SSH session, the host filesystem is available under /host.

Useful checks:

ls -la /host
chroot /host /bin/sh -lc 'rc-status'
chroot /host /bin/sh -lc 'rc-service crio status'
chroot /host /bin/sh -lc 'rc-service kubelet status'

Restart CRI-O:

chroot /host /bin/sh -lc 'rc-service crio restart'

Restart kubelet:

chroot /host /bin/sh -lc 'rc-service kubelet restart'

You can also inspect host processes from the pod because the recovery pod uses the host PID namespace:

ps aux | grep -E 'kubelet|crio'
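
If your goal is to retrieve updated host-side credentials, as in the certificate-rotation scenario above, you can copy files out over the same connection with scp. The kubeconfig path below is only an example; the actual location depends on your monok8s installation:

scp -P 30022 root@<node-ip>:/host/etc/kubernetes/admin.conf ./admin.conf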

Notes for monok8s host mounts

The recovery pod does not mount host / directly.

On monok8s, / and /var may be private mounts. Mounting them directly as host paths can fail with errors such as:

path "/" is mounted on "/" but it is not a shared or slave mount

or:

path "/var" is mounted on "/var" but it is not a shared or slave mount

Instead, the recovery pod assembles a minimal host root under /host from individual host paths.

For /var, it uses the backing path:

/data/var -> /host/var

This avoids the private bind-mount issue.
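
You can confirm the mount propagation from the SSH session. This is a sketch that assumes findmnt is available in the host image:

chroot /host /bin/sh -lc 'findmnt -o TARGET,PROPAGATION / /var'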

Remove the recovery pod

When recovery is complete, remove the generated resources.

If the resources use the default SSHD labels:

kubectl delete deployment -n mono-system -l app.kubernetes.io/name=sshd
kubectl delete service -n mono-system -l app.kubernetes.io/name=sshd
kubectl delete configmap -n mono-system -l app.kubernetes.io/name=sshd

If your generated manifest uses a fixed resource name, you can also remove them by name:

kubectl delete deployment -n mono-system sshd
kubectl delete service -n mono-system sshd
kubectl delete configmap -n mono-system sshd-authorized-keys
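
To confirm that nothing was left behind, list the resources again (assuming the default SSHD labels):

kubectl get deployment,service,configmap -n mono-system -l app.kubernetes.io/name=sshd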

Security warning

This pod is powerful.

It runs with root-level recovery access and can inspect or modify host files through /host. Treat it as a temporary break-glass tool, not a normal service.

Do not leave it running after recovery.