# CMM integration for monok8s

This document describes how monok8s runs the vendor Connection Manager daemon (`cmm`) from [ASK](https://github.com/we-are-mono/ASK/) on Kubernetes nodes.

`cmm` is part of the NXP/ASK hardware-offload stack. In the vendor layout it is normally started as a boot-time service, together with the `cdx` kernel module and `dpa_app`. monok8s intentionally does **not** follow that model. Kubernetes has priority: the node should boot, kubelet should come up, CNI should configure networking, and only then should the CMM stack start from a DaemonSet.

## Startup model

The intended startup order is:

1. The node boots.
2. `kubelet` starts.
3. CNI is configured.
4. The `cmm` DaemonSet starts on the node.
5. The DaemonSet prepares the DPA/CDX runtime and starts `cmm` in the foreground.

This is different from the vendor flow, where CMM-related components are treated as host services started early during boot. That flow is a poor fit for monok8s because CNI and Kubernetes-owned networking must win any ordering conflict.

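The gate between steps 3 and 4 can be sketched as a small wait loop. This is a hedged illustration, not the monok8s implementation; `/etc/cni/net.d` is the usual CNI config directory by convention, and the function name is made up here:

```shell
#!/bin/sh
# Sketch: block until a CNI config appears before letting the CMM stack start.
# The directory /etc/cni/net.d is the usual CNI convention; an assumption,
# not something taken from the monok8s manifests.
wait_for_cni() {
  dir="${1:-/etc/cni/net.d}"
  until [ -n "$(ls -A "$dir" 2>/dev/null)" ]; do
    echo "waiting for CNI config in $dir..."
    sleep 2
  done
  echo "CNI config present in $dir; continuing"
}

# Example: wait_for_cni /etc/cni/net.d
```
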
## Local changes from vendor ASK

monok8s carries a small set of patches so the ASK CMM stack behaves correctly inside a Kubernetes pod.

### `cmm`

The `cmm` daemon is patched to:

- run in the foreground, so it can be supervised directly by Kubernetes;
- log to stdout/stderr, so logs are visible through `kubectl logs`;
- avoid exiting when it sees CNI-managed conntrack entries it does not understand.

The last item is important. On a Kubernetes node, conntrack is not exclusively owned by CMM. CNI, kubelet, host networking, and ordinary pods can all create conntrack entries. CMM must tolerate that environment.

### `cdx` kernel module

The `cdx` module is patched so loading the module does **not** automatically start `dpa_app`.

In monok8s, module loading and DPA configuration are separate steps. This avoids doing device configuration too early, before Kubernetes networking is ready.

### `dpa_app`

`dpa_app` is patched so the XML config paths can be supplied through environment variables:

- `CDX_CFG_FILE`
- `CDX_PCD_FILE`
- `CDX_PDL_FILE`
- `CDX_SP_FILE`

This lets the DaemonSet select different XML files per node, board, or port layout without rebuilding the image.

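The override pattern these variables enable can be sketched in shell: each path falls back to a baked-in default when the DaemonSet does not set it. The default paths below are illustrative assumptions, not the image's real layout:

```shell
#!/bin/sh
# Sketch: environment override with baked-in fallback defaults.
# The /etc/cdx/* paths are placeholders, not the actual image layout.
CDX_CFG_FILE="${CDX_CFG_FILE:-/etc/cdx/cdx_cfg.xml}"
CDX_PCD_FILE="${CDX_PCD_FILE:-/etc/cdx/cdx_pcd.xml}"
CDX_PDL_FILE="${CDX_PDL_FILE:-/etc/cdx/hxs_pdl_v3.xml}"
CDX_SP_FILE="${CDX_SP_FILE:-/etc/cdx/cdx_sp.xml}"
echo "cfg=${CDX_CFG_FILE} pcd=${CDX_PCD_FILE}"
```
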
## Patch locations

The relevant patch sets are under:

```text
patches/ask/upstream/libnetfilter-conntrack
patches/ask/cmm
patches/ask/cdx
patches/ask/dpa
```

Other ASK patches in the tree are mostly kernel-porting work for the target NXP LSDK kernel, including the 6.18-based kernel used by monok8s.

## Installation

CMM is **not installed by default**. Install it explicitly after the node-control components are available.

With `MKS_ENABLE_NODE_CONTROL` enabled, generate and apply the CMM manifests with:

```sh
kubectl -n mono-system exec -it ds/node-agent -- ctl create cmm | kubectl apply -f -
```

This creates the CMM DaemonSet and the supporting objects required to run it on each matching node.

Check that the pod is running:

```sh
kubectl -n mono-system get pods -l app.kubernetes.io/name=cmm -owide
```

View logs with:

```sh
kubectl -n mono-system logs ds/cmm -f
```

If the DaemonSet name or labels change, inspect the generated YAML from `ctl create cmm` and use the actual object names.

## Accessing the CMM CLI

The CMM CLI is exposed from the pod for debugging and manual inspection. The DaemonSet uses `hostNetwork: true`, but the safest access method is still `kubectl port-forward`; it avoids exposing the CLI beyond your local machine.

First find a CMM pod:

```sh
kubectl -n mono-system get pods -l app.kubernetes.io/name=cmm
```

Then forward a local port to the CMM CLI port inside the pod. Kubernetes port-forward syntax is `LOCAL_PORT:REMOTE_PORT`.

For example, if CMM listens on port `2103` inside the pod, this forwards local port `12345` to it:

```sh
kubectl -n mono-system port-forward pod/cmm-xxxxx 12345:2103
```

In another terminal, connect to the local forwarded port:

```sh
telnet 127.0.0.1 12345
```

Use `telnet` for this CLI. Plain `ncat` can show leading garbage characters or mishandle the login prompt, because the CMM CLI behaves like a telnet-style interactive console rather than a clean raw TCP protocol.

Default login, if unchanged by the generated config, is usually:

```text
Username: admin
Password: admin
```

Do not expose this port through a Service or LoadBalancer unless you have added proper access control. Treat the CMM CLI as an operator/debug interface.

## Configuration

`ctl create cmm` emits a default configuration suitable for the expected monok8s hardware layout. You can override the generated YAML before applying it.

The vendor's original `fastforward` config is preserved in the image as a reference file, but monok8s uses its own runtime config. Keep those roles separate:

- vendor reference config: useful for comparison and debugging;
- monok8s runtime config: the config actually consumed by the DaemonSet.

A clear filename for the preserved vendor file is:

```text
fastforward.vendor.orig
```

That name is less project-specific than `fastforward.ask.orig` and makes the intent obvious: it is the original vendor-provided config, not the active config.

## Multi-node configuration

If all nodes have the same board and port layout, one shared CMM/DPA config is enough.

If nodes have different port layouts, use node-specific XML config. The recommended pattern is:

1. Mount all supported configs into the CMM pod.
2. Pass the Kubernetes node name into the pod.
3. Run a small wrapper script before `dpa_app`.
4. The wrapper selects the XML files for the current node and exports the corresponding `CDX_*` environment variables.
5. The wrapper then execs the normal DPA initialization script.

Example DaemonSet fragment:

```yaml
spec:
  template:
    spec:
      initContainers:
      - name: dpa-app
        image: localhost/monok8s/cmm:dev
        imagePullPolicy: Never
        command:
        - /node-config/select-dpa-config.sh
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: node-config
          mountPath: /node-config
          readOnly: true
        - name: dpa-configs
          mountPath: /etc/monok8s/dpa-configs
          readOnly: true
      volumes:
      - name: node-config
        configMap:
          name: cmm-node-config-wrapper
          defaultMode: 0755
      - name: dpa-configs
        configMap:
          name: cmm-dpa-configs
```

Example wrapper:

```sh
#!/bin/sh
set -eu

CONFIG_DIR="/etc/monok8s/dpa-configs/${NODE_NAME}"

if [ ! -d "${CONFIG_DIR}" ]; then
  echo "missing DPA config directory for node ${NODE_NAME}: ${CONFIG_DIR}" >&2
  exit 1
fi

export CDX_CFG_FILE="${CONFIG_DIR}/cdx_cfg.xml"
export CDX_PCD_FILE="${CONFIG_DIR}/cdx_pcd.xml"
export CDX_PDL_FILE="${CONFIG_DIR}/hxs_pdl_v3.xml"
export CDX_SP_FILE="${CONFIG_DIR}/cdx_sp.xml"

exec /bin/init_dpa.sh
```

The ConfigMap layout should then look like this conceptually:

```text
/etc/monok8s/dpa-configs/
  node-a/
    cdx_cfg.xml
    cdx_pcd.xml
    hxs_pdl_v3.xml
    cdx_sp.xml
  node-b/
    cdx_cfg.xml
    cdx_pcd.xml
    hxs_pdl_v3.xml
    cdx_sp.xml
```

For production, prefer a naming scheme based on stable node labels or hardware profiles rather than raw node names if multiple nodes share the same layout. Raw node names are fine for early bring-up, but they do not scale well.

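One way to key on a hardware profile instead of node names is one DaemonSet (or overlay) per profile, selected via `nodeSelector`. This is a sketch: the label key `monok8s.io/hw-profile` is a made-up example, not an existing monok8s convention, and would have to be maintained by your provisioning flow:

```yaml
spec:
  template:
    spec:
      nodeSelector:
        monok8s.io/hw-profile: board-rev-b
```
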
## Operational notes

### CMM and Kubernetes conntrack

Do not assume CMM owns the full conntrack table. Kubernetes nodes contain conntrack entries from:

- CNI traffic;
- kubelet;
- host-network pods;
- service routing;
- node-local traffic;
- ordinary workloads.

CMM must tolerate unknown entries. If it exits because it encountered a CNI or Kubernetes conntrack entry, that is a bug in the integration layer, not an operator error.

### `hostNetwork: true`

The CMM pod uses `hostNetwork: true` because it needs to interact with host networking and hardware-offload state. This also means any port bound by the pod may be bound in the host network namespace.

For the CLI, prefer `kubectl port-forward` anyway. It gives you a controlled local tunnel and avoids accidentally publishing the CLI on the node network.

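As a fragment, the relevant pod settings look roughly like this. `dnsPolicy: ClusterFirstWithHostNet` is the usual Kubernetes companion to `hostNetwork: true` so in-cluster DNS keeps working; verify against the generated manifests rather than treating this as the monok8s spec:

```yaml
spec:
  template:
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
```
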
### `CMM_MAX_CONNECTIONS`

The default value:

```sh
CMM_MAX_CONNECTIONS="${CMM_MAX_CONNECTIONS:-131072}"
```

uses `131072`, which is `128 * 1024`. It is a power-of-two-sized default commonly used for connection-table limits. Treat it as a capacity default, not a magic correctness value.

Lower it if memory pressure is a concern. Raise it only if the hardware, memory budget, and expected traffic justify it.

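A quick sanity check of the arithmetic, plus the kernel-side ceiling worth comparing against when tuning (the sysctl name is standard netfilter, not monok8s-specific):

```shell
#!/bin/sh
# 131072 is 128 * 1024; confirm the arithmetic and print the effective limit.
CMM_MAX_CONNECTIONS="${CMM_MAX_CONNECTIONS:-131072}"
echo "CMM limit: ${CMM_MAX_CONNECTIONS}"
echo "128 * 1024 = $((128 * 1024))"
# On the node, compare against the kernel conntrack ceiling:
#   sysctl -n net.netfilter.nf_conntrack_max
```
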
## Troubleshooting

### The CMM pod is not running

Check the DaemonSet and pod events:

```sh
kubectl -n mono-system get ds cmm -oyaml
kubectl -n mono-system describe pod -l app.kubernetes.io/name=cmm
```

Then check logs:

```sh
kubectl -n mono-system logs ds/cmm --all-containers=true --tail=200
```

### The CLI shows strange characters with `ncat`

Use `telnet` instead. This example connects directly on the node (the pod uses `hostNetwork: true`); through a port-forward, connect to your local forwarded port instead:

```sh
telnet 127.0.0.1 2103
```

The CMM CLI behaves like a telnet-style console. `ncat --telnet` may still not behave exactly like traditional telnet for this CLI.

### Port-forward connects to the wrong port

Remember the syntax:

```text
LOCAL_PORT:REMOTE_PORT
```

So this command:

```sh
kubectl -n mono-system port-forward pod/cmm-xxxxx 12345:2103
```

means:

```text
127.0.0.1:12345 on your workstation -> port 2103 inside the pod
```

Connect to `127.0.0.1:12345`, not `127.0.0.1:2103`.

### `dpa_app` uses the wrong XML files

Confirm the environment seen by the init container or wrapper:

```sh
kubectl -n mono-system logs pod/cmm-xxxxx -c dpa-app
```

The wrapper should print enough information to identify the selected config directory and XML paths. If it does not, add explicit logging before `exec /bin/init_dpa.sh`.

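A hedged sketch of such logging, kept as a function so it can sit just before the handoff (the function name is made up; the real wrapper execs `/bin/init_dpa.sh` directly after it):

```shell
#!/bin/sh
# Sketch: print the selected configuration before handing off to the DPA init
# script, so the chosen paths appear in the init container's logs.
log_selected_config() {
  echo "node=${NODE_NAME:-unset} config_dir=${CONFIG_DIR:-unset}"
  echo "CDX_CFG_FILE=${CDX_CFG_FILE:-unset}"
  echo "CDX_PCD_FILE=${CDX_PCD_FILE:-unset}"
  echo "CDX_PDL_FILE=${CDX_PDL_FILE:-unset}"
  echo "CDX_SP_FILE=${CDX_SP_FILE:-unset}"
}

# In the real wrapper, call this just before the handoff:
#   log_selected_config
#   exec /bin/init_dpa.sh
```
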
## Recommended policy

Keep CMM optional. It is hardware-specific, operationally sharp, and not required for a generic Kubernetes node. The base monok8s node should boot and join the cluster without CMM. Enable CMM only on hardware profiles where the ASK offload stack is expected and tested.