feat(k8s): enforce kind extraMount compatibility on cluster reuse
Kind applies extraMounts only at cluster creation. When a deployment joins an
existing shared cluster, any extraMount its kind-config declares that isn't
already active on the running control-plane is silently ignored — PVs backed
by those mounts fall through to the node's overlay filesystem and lose data on
cluster destroy.

Validate this up front in create_cluster():

- On cluster reuse, compare the new deployment's extraMounts against the live
  bind mounts on the control-plane container (via docker inspect). Fail with a
  DeployerException listing every mismatched mount and pointing at
  docs/deployment_patterns.md.
- On first-time cluster creation without a /mnt umbrella mount
  (kind-mount-root unset), print a warning that future stacks may require a
  full recreate to add new host-path mounts.

Document the umbrella-mount convention (kind-mount-root) and the migration
path for existing clusters in docs/deployment_patterns.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent eb4704b563
commit 782c71ae36
@@ -164,6 +164,9 @@ To stop a single deployment without affecting the cluster:
laconic-so deployment --dir my-deployment stop --skip-cluster-management
```

Stacks sharing a cluster must agree on mount topology. See
[Volume Persistence in k8s-kind](#volume-persistence-in-k8s-kind).

## Volume Persistence in k8s-kind

k8s-kind has 3 storage layers:
@@ -172,7 +175,9 @@ k8s-kind has 3 storage layers:
- **Kind Node**: A Docker container simulating a k8s node
- **Pod Container**: Your workload

Volumes with paths are mounted from Docker Host → Kind Node → Pod via kind
`extraMounts`. Kind applies `extraMounts` only at cluster creation — they
cannot be added to a running cluster.

| spec.yml volume | Storage Location | Survives Pod Restart | Survives Cluster Restart |
|-----------------|------------------|---------------------|-------------------------|
@@ -200,3 +205,51 @@ Empty-path volumes appear persistent because they survive pod restarts (data lives
in Kind Node container). However, this data is lost when the kind cluster is
recreated. This "false persistence" has caused data loss when operators assumed
their data was safe.

### Shared Clusters: Use `kind-mount-root`

Because kind `extraMounts` can only be set at cluster creation, the first
deployment to start locks in the mount topology. Later deployments that
declare new `extraMounts` have them silently ignored — their PVs fall
through to the kind node's overlay filesystem and lose data on cluster
destroy.

The fix is an umbrella mount. Set `kind-mount-root` in the spec, pointing
at a host directory all stacks will share:

```yaml
# spec.yml
kind-mount-root: /srv/kind

volumes:
  my-data: /srv/kind/my-stack/data  # visible at /mnt/my-stack/data in-node
```

SO emits a single `extraMount` (`<kind-mount-root>` → `/mnt`). Any new
host subdirectory under the root is visible in the node immediately — no
cluster recreate needed to add stacks.
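The umbrella idea can be sketched in a few lines. This is illustrative only — `make_extra_mounts` and its inputs are hypothetical names, not stack-orchestrator API:

```python
# Illustrative sketch of the umbrella-mount scheme: one extraMount covering
# every host path under kind-mount-root, versus one mount per volume.
# make_extra_mounts is a hypothetical helper, not stack-orchestrator API.
def make_extra_mounts(kind_mount_root, volume_host_paths):
    if kind_mount_root:
        # Single umbrella mount: new host subdirectories under the root
        # appear in the node without recreating the cluster.
        return [{"hostPath": kind_mount_root, "containerPath": "/mnt"}]
    # No umbrella: each host path needs its own mount, and the mount set
    # is frozen at cluster creation.
    return [{"hostPath": p, "containerPath": p} for p in volume_host_paths]

print(make_extra_mounts("/srv/kind", ["/srv/kind/my-stack/data"]))
```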

**All stacks sharing a cluster must agree on `kind-mount-root`** and keep
their host paths under it.

### Mount Compatibility Enforcement

`laconic-so deployment start` validates mount topology:

- **On first cluster creation** without an umbrella mount: prints a
  warning (future stacks may require a full recreate to add mounts).
- **On cluster reuse**: compares the new deployment's `extraMounts`
  against the live mounts on the control-plane container. Any mismatch
  (wrong host path, or mount missing) fails the deploy.
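The reuse check reduces to a dict comparison. A minimal sketch, assuming mounts are compared as `{containerPath: hostPath}` pairs (values illustrative):

```python
# Minimal sketch of the cluster-reuse check: declared extraMounts vs the
# bind mounts live on the control-plane container.
declared = [
    {"containerPath": "/mnt", "hostPath": "/srv/kind"},
    {"containerPath": "/data", "hostPath": "/srv/data"},
]
live = {"/mnt": "/srv/kind"}  # as reported by docker inspect

mismatches = [
    (m["containerPath"], m["hostPath"], live.get(m["containerPath"]))
    for m in declared
    if live.get(m["containerPath"]) != m["hostPath"]
]
print(mismatches)  # /data is declared but not mounted on the running node
```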

### Migrating an Existing Cluster

If a cluster was created without an umbrella mount and you need to add a
stack that requires new host-path mounts, the cluster must be recreated:

1. Back up ephemeral state (DBs, caches) from PVs that lack host mounts —
   these are in the kind node overlay FS and do not survive `kind delete`.
2. Update every stack's spec to set a shared `kind-mount-root` and place
   host paths under it.
3. Stop all deployments, destroy the cluster, recreate it by starting any
   stack (umbrella now active), and restore state.
@@ -15,11 +15,13 @@
from kubernetes import client, utils, watch
from kubernetes.client.exceptions import ApiException
import json
import os
from pathlib import Path
import subprocess
import re
import sys
from typing import Dict, Set, Mapping, List, Optional, cast
import yaml

from stack_orchestrator.util import get_k8s_dir, error_exit
@@ -216,6 +218,142 @@ def _install_caddy_cert_backup(
    print("Installed caddy cert backup CronJob")


def _parse_kind_extra_mounts(config_file: str) -> List[Dict[str, str]]:
    """Return the list of extraMounts declared in a kind config file."""
    try:
        with open(config_file) as f:
            config = yaml.safe_load(f) or {}
    except (OSError, yaml.YAMLError) as e:
        if opts.o.debug:
            print(f"Could not parse kind config {config_file}: {e}")
        return []
    mounts = []
    for node in config.get("nodes", []) or []:
        for m in node.get("extraMounts", []) or []:
            host_path = m.get("hostPath")
            container_path = m.get("containerPath")
            if host_path and container_path:
                mounts.append(
                    {"hostPath": host_path, "containerPath": container_path}
                )
    return mounts
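For reference, this is the kind-config shape the parser above walks — a sketch with a pre-parsed dict standing in for the `yaml.safe_load` output:

```python
# Sketch: the structure _parse_kind_extra_mounts expects after YAML parsing.
# Paths are illustrative.
config = {
    "kind": "Cluster",
    "nodes": [
        {
            "role": "control-plane",
            "extraMounts": [
                {"hostPath": "/srv/kind", "containerPath": "/mnt"},
            ],
        }
    ],
}

mounts = [
    {"hostPath": m["hostPath"], "containerPath": m["containerPath"]}
    for node in config.get("nodes", []) or []
    for m in node.get("extraMounts", []) or []
]
print(mounts)
```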


def _get_control_plane_node(cluster_name: str) -> Optional[str]:
    """Return the kind control-plane node container name for a cluster."""
    result = subprocess.run(
        ["kind", "get", "nodes", "--name", cluster_name],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    for line in result.stdout.splitlines():
        line = line.strip()
        if line.endswith("control-plane"):
            return line
    return None


def _get_running_cluster_mounts(cluster_name: str) -> Dict[str, str]:
    """Return {containerPath: hostPath} for bind mounts on the control-plane."""
    node = _get_control_plane_node(cluster_name)
    if not node:
        return {}
    result = subprocess.run(
        ["docker", "inspect", node, "--format", "{{json .Mounts}}"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return {}
    try:
        mounts = json.loads(result.stdout or "[]")
    except json.JSONDecodeError:
        return {}
    return {
        m["Destination"]: m["Source"]
        for m in mounts
        if m.get("Type") == "bind" and m.get("Destination") and m.get("Source")
    }
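The `docker inspect` output this consumes looks roughly like the sample below; `Type`/`Source`/`Destination` are docker's mount field names, the values are illustrative:

```python
import json

# Shape of `docker inspect <node> --format '{{json .Mounts}}'` output, and
# the same reduction to {containerPath: hostPath} used above.
raw = json.dumps([
    {"Type": "bind", "Source": "/srv/kind", "Destination": "/mnt"},
    {"Type": "volume", "Source": "somevolume", "Destination": "/var"},
])

mounts = json.loads(raw)
live = {
    m["Destination"]: m["Source"]
    for m in mounts
    if m.get("Type") == "bind" and m.get("Destination") and m.get("Source")
}
print(live)  # only the bind mount survives the filter
```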


def _check_mounts_compatible(cluster_name: str, config_file: str) -> None:
    """Fail if the new deployment's extraMounts aren't active on the cluster.

    Kind applies extraMounts only at cluster creation. When a deployment
    joins an existing cluster, any extraMount its kind-config declares that
    isn't already active on the running node will silently fall through to
    the node's overlay filesystem — data looks persisted but is lost on
    cluster destroy. Catch this up front.
    """
    required = _parse_kind_extra_mounts(config_file)
    if not required:
        return
    live = _get_running_cluster_mounts(cluster_name)
    if not live:
        # Could not inspect — don't block deployment, but warn.
        print(
            f"WARNING: could not inspect mounts on cluster '{cluster_name}'; "
            "skipping extraMount compatibility check",
            file=sys.stderr,
        )
        return
    mismatches = []
    for m in required:
        dest = m["containerPath"]
        want = m["hostPath"]
        have = live.get(dest)
        if have != want:
            mismatches.append((dest, want, have))
    if not mismatches:
        return
    lines = [
        f"This deployment declares extraMounts that are not active on the "
        f"running cluster '{cluster_name}':",
    ]
    for dest, want, have in mismatches:
        lines.append(
            f"  - {dest}: expected host path '{want}', "
            f"actual '{have or 'NOT MOUNTED'}'"
        )
    lines.extend(
        [
            "",
            "Kind applies extraMounts only at cluster creation — neither "
            "kind nor Docker supports adding bind mounts to a running "
            "container. Without a recreate, any PV backed by one of the "
            "missing mounts will silently fall through to the node's "
            "overlay filesystem and lose data on cluster destroy.",
            "",
            "Fix: destroy and recreate the cluster with a kind-config that "
            "includes an umbrella mount via 'kind-mount-root'. All stacks "
            "sharing the cluster must agree on 'kind-mount-root' and place "
            "their host paths under it. See docs/deployment_patterns.md.",
        ]
    )
    raise DeployerException("\n".join(lines))


def _warn_if_no_umbrella(config_file: str) -> None:
    """Warn if creating a cluster without a '/mnt' umbrella mount.

    Without an umbrella, future stacks joining this cluster that need new
    host-path mounts will fail the compatibility check and require a full
    cluster recreate to add them.
    """
    mounts = _parse_kind_extra_mounts(config_file)
    if any(m.get("containerPath") == "/mnt" for m in mounts):
        return
    print(
        "WARNING: creating kind cluster without an umbrella mount "
        "('kind-mount-root' not set). Future stacks added to this cluster "
        "that require new host-path mounts cannot add them without a "
        "full cluster recreate. See docs/deployment_patterns.md.",
        file=sys.stderr,
    )


def create_cluster(name: str, config_file: str):
    """Create or reuse the single kind cluster for this host.
@@ -232,8 +370,10 @@ def create_cluster(name: str, config_file: str):
    existing = get_kind_cluster()
    if existing:
        print(f"Using existing cluster: {existing}")
        _check_mounts_compatible(existing, config_file)
        return existing

    _warn_if_no_umbrella(config_file)
    print(f"Creating new cluster: {name}")
    result = _run_command(f"kind create cluster --name {name} --config {config_file}")
    if result.returncode != 0: