diff --git a/.pebbles/events.jsonl b/.pebbles/events.jsonl index 9ff2a68d..0cf36fce 100644 --- a/.pebbles/events.jsonl +++ b/.pebbles/events.jsonl @@ -46,3 +46,9 @@ {"type":"comment","timestamp":"2026-04-17T08:13:32.753112339Z","issue_id":"so-o2o","payload":{"body":"Tested the version-detection fix (commit 832ab66d) locally. Fix works for its scope but surfaces two more bugs downstream. Current approach is broken at the architectural level, not just one-bug-fixable.\n\nWhat 832ab66d does: captures etcd image ref from crictl after cluster create, writes to {backup_dir}/etcd-image.txt, reads it on subsequent cleanup runs. Self-adapts to Kind upgrades. No more hardcoded v3.5.9. Confirmed locally: etcd-image.txt is written after first create, cleanup on second start uses it, member.backup-YYYYMMDD-HHMMSS dir is produced (proves cleanup ran end-to-end).\n\nWhat still fails after version fix: kubeadm init on cluster recreate. apiserver comes up but returns:\n- 403 Forbidden: User \"kubernetes-admin\" cannot get path /livez\n- 500: Body was not decodable ... json: cannot unmarshal array into Go value of type struct\n- eventually times out waiting for apiserver /livez\n\nTwo new bugs behind those:\n\n(a) Restore step corrupts binary values. In _clean_etcd_keeping_certs the restore loop is:\n key=$(echo $encoded | base64 -d | jq -r .key | base64 -d)\n val=$(echo $encoded | base64 -d | jq -r .value | base64 -d)\n echo \"$val\" | /backup/etcdctl put \"$key\"\nk8s stores objects as protobuf. Piping raw protobuf through bash variable expansion + echo mangles non-printable bytes, truncates at null bytes, and appends a trailing newline. Explains the \"cannot unmarshal\" from apiserver — the kubernetes Service/Endpoints objects in /registry are corrupted on re-put.\n\n(b) Whitelist is too narrow. We keep only /registry/secrets/caddy-system and the /registry/services entries for kubernetes. 
Everything else is deleted — including /registry/clusterrolebindings (cluster-admin is gone), /registry/serviceaccounts, /registry/secrets/kube-system (bootstrap tokens), RBAC roles, apiserver's auth config. Explains the 403 for kubernetes-admin — cluster-admin binding doesn't exist yet and kubeadm's pre-addon health check can't authorize.\n\nFixing (a) would mean rewriting the restore step to not use shell piping — either use a proper etcdctl-based Go tool, or write directly to the on-disk snapshot format. Fixing (b) means exhaustively whitelisting everything kubeadm/apiserver bootstrapping needs — a moving target across k8s versions. Both together are a significant undertaking for the actual requirement (\"keep 4 Caddy secrets across cluster recreate\").\n\nDecision: merge 832ab66d for the narrow version-detection fix + diagnosis trail, then implement the kubectl-level backup/restore on a separate branch. The etcd approach is not salvageable at reasonable cost."}} {"type":"comment","timestamp":"2026-04-17T11:04:26.542659482Z","issue_id":"so-o2o","payload":{"body":"Shipped in PR #746. Etcd-persistence approach replaced with a kubectl-level Caddy Secret backup/restore gated on kind-mount-root.\n\nSummary of what landed:\n- components/ingress/caddy-cert-backup.yaml: SA/Role/RoleBinding + CronJob (alpine/kubectl:1.35.3) firing every 5min, writes {kind-mount-root}/caddy-cert-backup/caddy-secrets.yaml via atomic tmp+rename.\n- install_ingress_for_kind splits into 3 phases: pre-Deployment manifests → _restore_caddy_certs (kubectl apply from backup file) → Caddy Deployment → _install_caddy_cert_backup. 
Caddy pod can't exist until phase 3, so certs are always in place before secret_store startup.\n- Deleted _clean_etcd_keeping_certs, _get_etcd_host_path_from_kind_config, _capture_etcd_image, _read_etcd_image_ref, _etcd_image_ref_path and the etcd+PKI block in _generate_kind_mounts.\n- No new spec keys.\n\nTest coverage in tests/k8s-deploy/run-deploy-test.sh: install assertion after first --perform-cluster-management start, plus full E2E (seed fake manager=caddy Secret → trigger CronJob → verify backup file → stop/start --perform-cluster-management for cluster recreate → assert secret restored with matching decoded value).\n\nWoodburn migration: one-shot host-kubectl export to seed {kind-mount-root}/caddy-cert-backup/caddy-secrets.yaml was done manually on the running cluster (the in-cluster CronJob couldn't reach the host because the /srv/kind → /mnt extraMount was staged in kind-config.yml but never applied to the running cluster — it was added after cluster creation). File is in place for the eventual cluster recreate."}} {"type":"close","timestamp":"2026-04-17T11:04:26.999711375Z","issue_id":"so-o2o","payload":{}} +{"type":"create","timestamp":"2026-04-20T13:14:26.312724048Z","issue_id":"so-7fc","payload":{"description":"## Problem\n\nFile-level host-path compose volumes (e.g. `../config/foo.sh:/opt/foo.sh`) were synthesized into a kind extraMount + k8s hostPath PV chain with a sanitized containerPath (`/mnt/host-path-\u003csanitized\u003e`).\n\n- On kind: two deployments of the same stack sharing a cluster collide at that containerPath — kind only honors the first deployment's bind, so subsequent deployments' pods silently read the first's file. No error, no warning.\n- On real k8s: the same code emits `hostPath: /mnt/host-path-*` but nothing populates that path on worker nodes — effectively broken.\n\nFile-level host-path binds are conceptually k8s ConfigMaps. 
The `snowballtools-base-backend` stack already uses the ConfigMap-backed named-volume pattern manually; this issue is to make that automatic for all stacks.\n\n## Resolution\n\nImplemented on branch `feat/so-b86-auto-configmap-host-path` (commit `cb84388d`), stacked on top of `feat/kind-mount-invariant-check`.\n\n**No deployment-dir file rewriting.** Compose files, spec.yml, and `{deployment_dir}/config/\u003cpod\u003e/` are untouched — trivially diffable against stack source, no synthetic volume names. ConfigMaps are materialized at deploy start and visible only in k8s (`kubectl get cm -n \u003cns\u003e`).\n\n### Deploy create — validation only\n\n| Source shape | Behavior |\n|---|---|\n| Single file | Accepted |\n| Flat directory, no subdirs, ≤ ~700 KiB | Accepted |\n| Directory with subdirs | `DeployerException` — guidance: embed in image / split configmaps / initContainer |\n| File or directory \u003e ~700 KiB | `DeployerException` — ConfigMap budget (accounts for base64 + metadata) |\n| `:rw` on any host-path bind | `DeployerException` — use a named volume for writable data |\n\n### Deploy start — k8s object generation\n\n- `cluster_info.get_configmaps()` walks pod + job compose volumes and emits a `V1ConfigMap` per host-path bind (deduped by sanitized name), content read from `{deployment_dir}/config/\u003cpod\u003e/\u003cfile\u003e`.\n- `volumes_for_pod_files` emits `V1ConfigMapVolumeSource` instead of `V1HostPathVolumeSource` for host-path binds.\n- `volume_mounts_for_service` stats the source and sets `V1VolumeMount.sub_path` to the filename when source is a regular file.\n- `_generate_kind_mounts` no longer emits `/mnt/host-path-*` extraMounts — ConfigMap path bypasses the kind node FS entirely.\n\n### Transition\n\nThe `/mnt/host-path-*` skip in `check_mounts_compatible` is retained as a transition tolerance for deployments created before this change. 
Test coverage in `tests/k8s-deploy/run-deploy-test.sh` asserts host-path ConfigMaps exist in the namespace, compose/spec in deployment dir unchanged, and no `/mnt/host-path-*` entries in kind-config.yml.","priority":"2","title":"File-level host-path compose volumes alias across deployments sharing a kind cluster","type":"bug"}} +{"type":"status_update","timestamp":"2026-04-20T13:14:26.833816262Z","issue_id":"so-7fc","payload":{"status":"closed"}} +{"type":"comment","timestamp":"2026-04-21T05:57:12.476299839Z","issue_id":"so-n1n","payload":{"body":"Already merged: 929bdab8 is an ancestor of origin/main; all four extraMount emit sites in helpers.py carry `propagation: HostToContainer` (umbrella, per-volume named, per-volume host-path, high-memlock spec)."}} +{"type":"status_update","timestamp":"2026-04-21T05:57:12.928842469Z","issue_id":"so-n1n","payload":{"status":"closed"}} +{"type":"comment","timestamp":"2026-04-21T06:08:13.933886638Z","issue_id":"so-ad7","payload":{"body":"Fixed in PR #744 (cf8b7533). get_services() now includes the maintenance pod in the container-ports map so its per-pod Service is built and available for the Ingress swap."}} +{"type":"status_update","timestamp":"2026-04-21T06:08:14.457815115Z","issue_id":"so-ad7","payload":{"status":"closed"}} diff --git a/docs/deployment_patterns.md b/docs/deployment_patterns.md index 9fd7ed0b..525622aa 100644 --- a/docs/deployment_patterns.md +++ b/docs/deployment_patterns.md @@ -164,6 +164,44 @@ To stop a single deployment without affecting the cluster: laconic-so deployment --dir my-deployment stop --skip-cluster-management ``` +Stacks sharing a cluster must agree on mount topology. See +[Volume Persistence in k8s-kind](#volume-persistence-in-k8s-kind). + +### cluster-id vs deployment-id + +Each deployment's `deployment.yml` carries two identifiers with +different roles: + +- **`cluster-id`** — which kind cluster this deployment attaches to. 
+ Used for the kube-config context name (`kind-{cluster-id}`) and for + kind lifecycle ops. Inherited from the running cluster at + `deploy create` time when one exists; freshly generated otherwise. + Shared across every deployment that joins the same cluster. +- **`deployment-id`** — this particular deployment's identity. + Generated fresh on every `deploy create` and never inherited. Flows + into `app_name`, the prefix on every k8s resource name this + deployment creates (PVs, ConfigMaps, Deployments, PVCs, …). Distinct + per deployment even when the cluster is shared. + +The split prevents silent resource-name collisions between +deployments sharing a cluster: two deployments of the same stack, +or any two deployments that happen to declare a volume with the same +name, still produce distinct `{deployment-id}-{vol}` PV names. + +**Backward compatibility**: `deployment.yml` files written before the +`deployment-id` field existed fall back to using `cluster-id` as the +deployment-id. Existing resource names stay stable across this +upgrade — no PV renames, no re-bind, no data orphaning. The next +`deploy create` writes both fields going forward. + +**Namespace ownership**: on top of distinct resource names, SO stamps +the k8s namespace with a `laconic.com/deployment-dir` annotation on +first creation. A subsequent `deployment start` from a different +deployment directory that would land in the same namespace fails +with a `DeployerException` pointing at the `namespace:` spec +override. Catches operator-error cases where the same deployment dir +is effectively registered twice. + ## Volume Persistence in k8s-kind k8s-kind has 3 storage layers: @@ -172,7 +210,9 @@ k8s-kind has 3 storage layers: - **Kind Node**: A Docker container simulating a k8s node - **Pod Container**: Your workload -For k8s-kind, volumes with paths are mounted from Docker Host → Kind Node → Pod via extraMounts. 
+Volumes with paths are mounted from Docker Host → Kind Node → Pod via kind
+`extraMounts`. Kind applies `extraMounts` only at cluster creation — they
+cannot be added to a running cluster.

| spec.yml volume | Storage Location | Survives Pod Restart | Survives Cluster Restart |
|-----------------|------------------|---------------------|-------------------------|
@@ -200,3 +240,100 @@
Empty-path volumes appear persistent because they survive pod restarts (data lives
in Kind Node container). However, this data is lost when the kind cluster is
recreated. This "false persistence" has caused data loss when operators assumed
their data was safe.
+
+### Shared Clusters: Use `kind-mount-root`
+
+Because kind `extraMounts` can only be set at cluster creation, the first
+deployment to start locks in the mount topology. Later deployments that
+declare new `extraMounts` have them silently ignored — their PVs fall
+through to the kind node's overlay filesystem and lose data on cluster
+destroy.
+
+The fix is an umbrella mount. Set `kind-mount-root` in the spec, pointing
+at a host directory all stacks will share:
+
+```yaml
+# spec.yml
+kind-mount-root: /srv/kind
+
+volumes:
+  my-data: /srv/kind/my-stack/data  # visible at /mnt/my-stack/data in-node
+```
+
+SO emits a single `extraMount` (`{kind-mount-root}` → `/mnt`). Any new
+host subdirectory under the root is visible in the node immediately — no
+cluster recreate needed to add stacks.
+
+**All stacks sharing a cluster must agree on `kind-mount-root`** and keep
+their host paths under it.
+
+### Mount Compatibility Enforcement
+
+`laconic-so deployment start` validates mount topology:
+
+- **On first cluster creation** without an umbrella mount: prints a
+  warning (future stacks may require a full recreate to add mounts).
+- **On cluster reuse**: compares the new deployment's `extraMounts`
+  against the live mounts on the control-plane container. Any mismatch
+  (wrong host path, or mount missing) fails the deploy.
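The reuse-time comparison can be sketched as a pure function. This is a simplified illustration, not the real `check_mounts_compatible` signature or implementation; the `/mnt/host-path-*` skip shown here mirrors the transition tolerance described elsewhere in this changeset for deployments created before the ConfigMap conversion:

```python
def mounts_compatible(declared, live):
    """Compare a new deployment's declared extraMounts against the binds
    live on the control-plane container.

    declared/live: dicts of {container_path: host_path}. Returns
    (ok, problems): each declared mount must exist on the live cluster
    with the same host path; legacy /mnt/host-path-* binds are skipped
    during the ConfigMap transition.
    """
    problems = []
    for container_path, host_path in declared.items():
        if container_path.startswith("/mnt/host-path-"):
            # transition tolerance for pre-ConfigMap deployments
            continue
        if container_path not in live:
            problems.append(f"missing mount: {container_path}")
        elif live[container_path] != host_path:
            problems.append(
                f"host path mismatch at {container_path}: "
                f"declared {host_path}, live {live[container_path]}"
            )
    return (not problems, problems)
```

With a shared umbrella mount, `mounts_compatible({"/mnt": "/srv/kind"}, live)` passes only when the live cluster binds the same host root at `/mnt`.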
+ +### Static files in compose volumes → auto-ConfigMap + +Compose volumes that bind a host file or flat directory into a container +(e.g. `../config/test/script.sh:/opt/run.sh`) are used to inject static +content that ships with the stack. k8s doesn't have a native notion of +this — the canonical way to inject static content is a ConfigMap. + +At `deploy start`, laconic-so auto-generates a namespace-scoped +ConfigMap per host-path compose volume (deduped by source) and mounts +it into the pod instead of routing the bind through the kind node: + +| Source shape | Behavior | +|---|---| +| Single file | ConfigMap with one key (the filename); pod mount uses `subPath` so the single key lands at the compose target path | +| Flat directory (no subdirs, ≤ ~700 KiB) | ConfigMap with one key per file; pod mount exposes all keys at the target path | +| Directory with subdirs, or over budget | Rejected at `deploy create` — embed in the container image, split into multiple ConfigMaps, or use an initContainer | +| `:rw` on any host-path bind | Rejected at `deploy create` — use a named volume with a spec-configured host path for writable data | + +The deployment dir layout is unchanged: compose files stay verbatim and +`spec.yml` is not rewritten. Source files remain under +`{deployment_dir}/config/{pod}/` (as copied by `deploy create`); the +ConfigMap is built from them at deploy start and no kind extraMount is +emitted for these paths. + +This works identically on kind and real k8s (ConfigMaps are +cluster-native; no node-side landing pad required), and two deployments +of the same stack sharing a cluster get their own per-namespace +ConfigMaps — no aliasing. 
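The shape rules in the table above can be sketched roughly as follows. This is a hedged illustration of the documented behavior (single file → one key with `subPath`; flat directory → one key per file; subdirs or oversize content rejected), not the actual `_validate_host_path_mounts` / `get_configmaps` code, and the helper name is hypothetical:

```python
import base64
from pathlib import Path

# Margin under the k8s 1 MiB ConfigMap hard limit, per the ~700 KiB
# budget described above (accounts for base64 expansion and metadata).
BUDGET = 700 * 1024


def configmap_binary_data(src: Path) -> dict:
    """Package a host-path bind source as ConfigMap binaryData.

    Hypothetical sketch: single file -> one base64 key (mounted via
    subPath); flat directory -> one key per file; directories with
    subdirectories or content over budget are rejected.
    """
    if src.is_file():
        content = src.read_bytes()
        if len(content) > BUDGET:
            raise ValueError(f"{src} exceeds ConfigMap budget")
        return {src.name: base64.b64encode(content).decode("ascii")}
    entries = list(src.iterdir())
    if any(p.is_dir() for p in entries):
        raise ValueError(f"{src} contains subdirectories")
    data = {p.name: p.read_bytes() for p in entries if p.is_file()}
    if sum(len(v) for v in data.values()) > BUDGET:
        raise ValueError(f"{src} exceeds ConfigMap budget")
    return {k: base64.b64encode(v).decode("ascii") for k, v in data.items()}
```

Base64 in `binaryData` keeps arbitrary bytes intact end to end — the same property whose absence corrupted protobuf values in the abandoned shell-pipeline etcd restore.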
+ +### Writable / generated data → named volume + host path + +For volumes the workload *writes to* (databases, ledgers, caches, logs), +use a named volume backed by a spec-configured host path under +`kind-mount-root`: + +```yaml +# compose +volumes: + - my-data:/var/lib/foo + +# spec.yml +kind-mount-root: /srv/kind +volumes: + my-data: /srv/kind/my-stack/data +``` + +Works on both kind (via the umbrella mount) and real k8s (operator +provisions `/srv/kind/my-stack/data` on each node). + +### Migrating an Existing Cluster + +If a cluster was created without an umbrella mount and you need to add a +stack that requires new host-path mounts, the cluster must be recreated: + +1. Back up ephemeral state (DBs, caches) from PVs that lack host mounts — + these are in the kind node overlay FS and do not survive `kind delete`. +2. Update every stack's spec to set a shared `kind-mount-root` and place + host paths under it. +3. Stop all deployments, destroy the cluster, recreate it by starting any + stack (umbrella now active), and restore state. 
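Step 1 of the migration depends on knowing which volumes live only in the kind node's overlay FS. Under the storage-layer table above, that is any spec volume without a host path, plus — on the assumption that a shared cluster only carries the umbrella mount — any host path outside `kind-mount-root`. A hedged sketch (hypothetical helper, not part of laconic-so):

```python
def volumes_needing_backup(spec_volumes, kind_mount_root=None):
    """Return names of volumes whose data would be lost on `kind delete`.

    spec_volumes: {name: host_path_or_None} as declared under `volumes:`
    in spec.yml. Empty-path volumes live only in the kind node overlay
    FS ("false persistence"); host paths outside kind-mount-root are
    assumed uncovered by the umbrella mount and treated as at-risk too.
    """
    at_risk = []
    for name, host_path in spec_volumes.items():
        if not host_path:
            at_risk.append(name)  # node overlay FS only
        elif kind_mount_root and not host_path.startswith(
            kind_mount_root.rstrip("/") + "/"
        ):
            at_risk.append(name)  # outside the umbrella mount
    return at_risk
```

Running this over every stack's spec before step 3 gives the backup checklist for the recreate.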
diff --git a/stack_orchestrator/constants.py b/stack_orchestrator/constants.py index 2e885431..e5c83698 100644 --- a/stack_orchestrator/constants.py +++ b/stack_orchestrator/constants.py @@ -23,6 +23,7 @@ compose_deploy_type = "compose" k8s_kind_deploy_type = "k8s-kind" k8s_deploy_type = "k8s" cluster_id_key = "cluster-id" +deployment_id_key = "deployment-id" kube_config_key = "kube-config" deploy_to_key = "deploy-to" network_key = "network" diff --git a/stack_orchestrator/deploy/deployment_context.py b/stack_orchestrator/deploy/deployment_context.py index 79fc4bb9..1776699e 100644 --- a/stack_orchestrator/deploy/deployment_context.py +++ b/stack_orchestrator/deploy/deployment_context.py @@ -26,6 +26,7 @@ from stack_orchestrator.deploy.spec import Spec class DeploymentContext: deployment_dir: Path id: str + deployment_id: str spec: Spec stack: Stack @@ -48,8 +49,27 @@ class DeploymentContext: return self.get_compose_dir() / f"docker-compose-{name}.yml" def get_cluster_id(self): + """Identifier of the kind cluster this deployment attaches to. + + Shared across deployments that join the same kind cluster. Used + for the kube-config context name (`kind-{cluster-id}`) and for + kind cluster lifecycle ops. + """ return self.id + def get_deployment_id(self): + """Identifier of this particular deployment's k8s resources. + + Distinct per deployment even when multiple deployments share a + cluster. Used as compose_project_name → app_name → prefix for + all k8s resource names (PVs, ConfigMaps, Deployments, …). + + Backward compat: for deployment.yml files written before this + field existed, falls back to cluster-id so existing on-disk + resource names remain stable (no PV renames, no re-bind). 
+ """ + return self.deployment_id + def init(self, dir: Path): self.deployment_dir = dir.absolute() self.spec = Spec() @@ -60,6 +80,12 @@ class DeploymentContext: if deployment_file_path.exists(): obj = get_yaml().load(open(deployment_file_path, "r")) self.id = obj[constants.cluster_id_key] + # Fallback to cluster-id for deployments created before the + # deployment-id field was introduced. Keeps existing resource + # names stable across this upgrade. + self.deployment_id = obj.get( + constants.deployment_id_key, self.id + ) # Handle the case of a legacy deployment with no file # Code below is intended to match the output from _make_default_cluster_name() # TODO: remove when we no longer need to support legacy deployments @@ -68,6 +94,7 @@ class DeploymentContext: unique_cluster_descriptor = f"{path},{self.get_stack_file()},None,None" hash = hashlib.md5(unique_cluster_descriptor.encode()).hexdigest()[:16] self.id = f"{constants.cluster_name_prefix}{hash}" + self.deployment_id = self.id def modify_yaml(self, file_path: Path, modifier_func): """Load a YAML, apply a modification function, and write it back.""" diff --git a/stack_orchestrator/deploy/deployment_create.py b/stack_orchestrator/deploy/deployment_create.py index be3670ce..fd7ec4f1 100644 --- a/stack_orchestrator/deploy/deployment_create.py +++ b/stack_orchestrator/deploy/deployment_create.py @@ -51,8 +51,10 @@ from stack_orchestrator.util import ( ) from stack_orchestrator.deploy.spec import Spec from stack_orchestrator.deploy.deploy_types import LaconicStackSetupCommand +from stack_orchestrator.deploy.deployer import DeployerException from stack_orchestrator.deploy.deployer_factory import getDeployerConfigGenerator from stack_orchestrator.deploy.deployment_context import DeploymentContext +from stack_orchestrator.deploy.k8s.helpers import is_host_path_mount def _make_default_deployment_dir(): @@ -287,6 +289,113 @@ def call_stack_deploy_start(deployment_context): # Inspect the pod yaml to find config files 
referenced in subdirectories +# Safety margin under the k8s ConfigMap 1 MiB hard limit. Accounts for +# base64 expansion (~33%) and ConfigMap metadata overhead. +_HOST_PATH_CONFIGMAP_BUDGET_BYTES = 700 * 1024 + + +def _validate_host_path_mounts(parsed_pod_file, pod_name, pod_file_path): + """Fail fast at deploy create on unsupported host-path compose volumes. + + Host-path compose volumes (`:[:opts]` with src starting + with /, ., or ~) flow through auto-generated ConfigMaps at deploy + start. ConfigMaps can't represent: + - directories with subdirectories (flat key space) + - content exceeding ~700 KiB (k8s 1 MiB limit minus base64/overhead) + - writable mounts (ConfigMap mounts are read-only) + + Reject those shapes up front with a clear error so users don't hit + the failure later at start time. + + Source resolution: compose paths like `../config/foo.sh` are + relative to the compose file location in the stack source tree at + deploy create time. At deploy start, the file is read from the + matching copy under `{deployment_dir}/config/{pod}/` that deploy + create lays down. + """ + compose_stack_dir = Path(pod_file_path).resolve().parent + services = parsed_pod_file.get("services") or {} + for service_name, service_info in services.items(): + for volume_str in service_info.get("volumes") or []: + parts = volume_str.split(":") + if len(parts) < 2: + continue + src = parts[0] + if not is_host_path_mount(src): + continue + mount_opts = parts[2] if len(parts) > 2 else None + opt_tokens = ( + [t.strip() for t in mount_opts.split(",") if t.strip()] + if mount_opts + else [] + ) + if "rw" in opt_tokens: + raise DeployerException( + f"Writable host-path bind not supported: " + f"'{volume_str}' in {pod_name}/{service_name}.\n" + "Host-path binds from the deployment directory are " + "static content injected as ConfigMaps (read-only). " + "Use a named volume with a spec-configured host path " + "under 'kind-mount-root' for writable data. 
See " + "docs/deployment_patterns.md." + ) + + abs_src = (compose_stack_dir / src).resolve() + if not abs_src.exists(): + # Preserve existing behavior — compose-level binds with + # missing sources fail later; don't introduce a new + # early failure mode here. + continue + if abs_src.is_file(): + # Single files are always fine — one-key ConfigMap with + # subPath. Budget check here too in case of huge single + # files. + size = abs_src.stat().st_size + if size > _HOST_PATH_CONFIGMAP_BUDGET_BYTES: + raise DeployerException( + f"Host-path bind '{volume_str}' in " + f"{pod_name}/{service_name} points at a file of " + f"{size} bytes, exceeding the ConfigMap budget " + f"({_HOST_PATH_CONFIGMAP_BUDGET_BYTES} bytes " + f"after base64/overhead).\n\n" + "Embed the file in the container image at build " + "time, or split into multiple smaller files." + ) + continue + if abs_src.is_dir(): + entries = list(abs_src.iterdir()) + if any(p.is_dir() for p in entries): + raise DeployerException( + f"Directory host-path bind '{volume_str}' in " + f"{pod_name}/{service_name} contains " + "subdirectories, which cannot be represented " + "in a k8s ConfigMap.\n\n" + "Restructure the stack to either:\n" + " - embed the directory in the container " + "image at build time,\n" + " - split into multiple ConfigMap entries " + "(one per subdir),\n" + " - or use an initContainer to populate the " + "content at runtime.\n\n" + "See docs/deployment_patterns.md." + ) + total = sum( + p.stat().st_size for p in entries if p.is_file() + ) + if total > _HOST_PATH_CONFIGMAP_BUDGET_BYTES: + raise DeployerException( + f"Directory host-path bind '{volume_str}' in " + f"{pod_name}/{service_name} totals {total} " + f"bytes, exceeding the ConfigMap budget " + f"({_HOST_PATH_CONFIGMAP_BUDGET_BYTES} bytes " + f"after base64/overhead).\n\n" + "Embed the content in the container image at " + "build time, or split into smaller ConfigMaps. " + "See docs/deployment_patterns.md." 
+ ) + + +# _find_extra_config_dirs: Find config dirs referenced in the pod files # other than the one associated with the pod def _find_extra_config_dirs(parsed_pod_file, pod): config_dirs = set() @@ -778,7 +887,15 @@ def _create_deployment_file(deployment_dir: Path, stack_source: Optional[Path] = # Reuse existing Kind cluster if one exists, otherwise generate a timestamp-based ID existing = _get_existing_kind_cluster() cluster = existing if existing else generate_id("laconic") - deployment_content = {constants.cluster_id_key: cluster} + # deployment-id is always fresh per `deploy create`, even when + # cluster-id is inherited from a running cluster. Keeps each + # deployment's k8s resource names (PVs, ConfigMaps, Deployment) + # distinct even when multiple deployments share a cluster. + deployment_id = generate_id("laconic") + deployment_content = { + constants.cluster_id_key: cluster, + constants.deployment_id_key: deployment_id, + } if stack_source: deployment_content["stack-source"] = str(stack_source) with open(deployment_file_path, "w") as output_file: @@ -1058,6 +1175,12 @@ def _write_deployment_files( if pod_file_path is None: continue parsed_pod_file = yaml.load(open(pod_file_path, "r")) + # Reject host-path compose volumes whose shape can't land as a + # ConfigMap (dir-with-subdirs, oversize, writable). File-level + # and flat-dir host-path binds are accepted — they auto-convert + # to ConfigMaps at deploy start via cluster_info.get_configmaps. 
+ if parsed_spec.is_kubernetes_deployment(): + _validate_host_path_mounts(parsed_pod_file, pod, pod_file_path) extra_config_dirs = _find_extra_config_dirs(parsed_pod_file, pod) destination_pod_dir = destination_pods_dir.joinpath(pod) os.makedirs(destination_pod_dir, exist_ok=True) @@ -1138,6 +1261,10 @@ def _write_deployment_files( job_file_path = get_job_file_path(stack_name, parsed_stack, job) if job_file_path and job_file_path.exists(): parsed_job_file = yaml.load(open(job_file_path, "r")) + if parsed_spec.is_kubernetes_deployment(): + _validate_host_path_mounts( + parsed_job_file, job, job_file_path + ) _fixup_pod_file(parsed_job_file, parsed_spec, destination_compose_dir) with open( destination_compose_jobs_dir.joinpath( diff --git a/stack_orchestrator/deploy/k8s/cluster_info.py b/stack_orchestrator/deploy/k8s/cluster_info.py index c50eb5cd..2febf6ad 100644 --- a/stack_orchestrator/deploy/k8s/cluster_info.py +++ b/stack_orchestrator/deploy/k8s/cluster_info.py @@ -15,6 +15,7 @@ import os import base64 +from pathlib import Path from kubernetes import client from typing import Any, List, Optional, Set @@ -22,7 +23,10 @@ from typing import Any, List, Optional, Set from stack_orchestrator.opts import opts from stack_orchestrator.util import env_var_map_from_file from stack_orchestrator.deploy.k8s.helpers import ( + is_host_path_mount, named_volumes_from_pod_files, + resolve_host_path_for_kind, + sanitize_host_path_to_volume_name, volume_mounts_for_service, volumes_for_pod_files, ) @@ -433,8 +437,91 @@ class ClusterInfo: binary_data=data, ) result.append(spec) + + # Auto-generated ConfigMaps for file-level and flat-dir host-path + # compose volumes. Avoids the aliasing failure mode where two + # deployments sharing a cluster would collide at the same kind + # node path — each deployment gets its own namespace-scoped + # ConfigMap instead. See docs/deployment_patterns.md. 
+ result.extend(self._host_path_bind_configmaps()) return result + def _host_path_bind_configmaps(self) -> List[client.V1ConfigMap]: + """Build V1ConfigMap objects for host-path compose volumes. + + Walks every service in every parsed pod/job compose file. For each + volume whose source is a host path (starts with /, ., or ~), + reads the resolved file or flat directory from the deployment + directory and packages it as a V1ConfigMap. + + Dedupes by sanitized name across pods and services — a source + referenced from N places yields one ConfigMap. + """ + if self.spec.file_path is None: + return [] + deployment_dir = Path(self.spec.file_path).parent + seen: Set[str] = set() + result: List[client.V1ConfigMap] = [] + + all_pod_maps = [self.parsed_pod_yaml_map, self.parsed_job_yaml_map] + for pod_map in all_pod_maps: + for _pod_key, pod in pod_map.items(): + services = pod.get("services") or {} + for _svc_name, svc in services.items(): + for mount_string in svc.get("volumes") or []: + parts = mount_string.split(":") + if len(parts) < 2: + continue + src = parts[0] + if not is_host_path_mount(src): + continue + sanitized = sanitize_host_path_to_volume_name(src) + if sanitized in seen: + continue + seen.add(sanitized) + abs_src = resolve_host_path_for_kind( + src, deployment_dir + ) + data = self._read_host_path_source(abs_src, mount_string) + cm = client.V1ConfigMap( + metadata=client.V1ObjectMeta( + name=f"{self.app_name}-{sanitized}", + labels=self._stack_labels( + {"configmap-label": sanitized} + ), + ), + binary_data=data, + ) + result.append(cm) + return result + + def _read_host_path_source( + self, abs_src: Path, mount_string: str + ) -> dict: + """Read file or flat-directory content for a host-path ConfigMap. + + Validates shape at read time as a defensive second check — the + same rules are enforced earlier at `deploy create`, but deploy- + dir content may have been edited since then. 
+ """ + if not abs_src.exists(): + raise RuntimeError( + f"Source for host-path compose volume does not exist: " + f"{abs_src} (volume: '{mount_string}')" + ) + data = {} + if abs_src.is_file(): + with open(abs_src, "rb") as f: + data[abs_src.name] = base64.b64encode(f.read()).decode("ASCII") + elif abs_src.is_dir(): + for entry in abs_src.iterdir(): + if entry.is_file(): + with open(entry, "rb") as f: + data[entry.name] = base64.b64encode(f.read()).decode( + "ASCII" + ) + return data + def get_pvs(self): result = [] spec_volumes = self.spec.get_volumes() @@ -621,7 +708,13 @@ class ClusterInfo: if self.spec.get_image_registry() is not None else image ) - volume_mounts = volume_mounts_for_service(parsed_yaml_map, service_name) + volume_mounts = volume_mounts_for_service( + parsed_yaml_map, + service_name, + Path(self.spec.file_path).parent + if self.spec.file_path + else None, + ) # Handle command/entrypoint from compose file # In docker-compose: entrypoint -> k8s command, command -> k8s args container_command = None diff --git a/stack_orchestrator/deploy/k8s/deploy_k8s.py b/stack_orchestrator/deploy/k8s/deploy_k8s.py index 7e9efed6..84318cde 100644 --- a/stack_orchestrator/deploy/k8s/deploy_k8s.py +++ b/stack_orchestrator/deploy/k8s/deploy_k8s.py @@ -20,10 +20,16 @@ from kubernetes.client.exceptions import ApiException from typing import Any, Dict, List, Optional, cast from stack_orchestrator import constants -from stack_orchestrator.deploy.deployer import Deployer, DeployerConfigGenerator +from stack_orchestrator.deploy.deployer import ( + Deployer, + DeployerConfigGenerator, + DeployerException, +) from stack_orchestrator.deploy.k8s.helpers import ( + check_mounts_compatible, create_cluster, destroy_cluster, + get_kind_cluster, load_images_into_kind, ) from stack_orchestrator.deploy.k8s.helpers import ( @@ -123,27 +129,34 @@ class K8sDeployer(Deployer): return self.deployment_dir = deployment_context.deployment_dir self.deployment_context = deployment_context + # 
kind cluster name comes from cluster-id — which kind cluster this + # deployment attaches to. Shared across deployments that join the + # same cluster. compose_project_name is kept as a parameter for + # interface compatibility with the compose deployer path. + cluster_id = deployment_context.get_cluster_id() + deployment_id = deployment_context.get_deployment_id() self.kind_cluster_name = ( - deployment_context.spec.get_kind_cluster_name() or compose_project_name - ) - # Use spec namespace if provided, otherwise derive from cluster-id - self.k8s_namespace = ( - deployment_context.spec.get_namespace() or f"laconic-{compose_project_name}" + deployment_context.spec.get_kind_cluster_name() or cluster_id ) self.cluster_info = ClusterInfo() # stack.name may be an absolute path (from spec "stack:" key after # path resolution). Extract just the directory basename for labels. raw_name = deployment_context.stack.name if deployment_context else "" stack_name = Path(raw_name).name if raw_name else "" - # Use spec namespace if provided, otherwise derive from stack name + # Namespace: spec override wins; else derive from stack name; else + # fall back to deployment-id. (On older deployment.yml files without + # deployment-id, get_deployment_id() returns cluster-id — same as + # the pre-decouple behavior.) self.k8s_namespace = deployment_context.spec.get_namespace() or ( - f"laconic-{stack_name}" if stack_name else f"laconic-{compose_project_name}" + f"laconic-{stack_name}" if stack_name else f"laconic-{deployment_id}" ) self.cluster_info = ClusterInfo() + # app_name comes from deployment-id so each deployment owns its own + # k8s resource names, even when multiple deployments share a cluster. 
self.cluster_info.int( compose_files, compose_env_file, - compose_project_name, + deployment_id, deployment_context.spec, stack_name=stack_name, ) @@ -175,28 +188,74 @@ class K8sDeployer(Deployer): self.custom_obj_api = client.CustomObjectsApi() def _ensure_namespace(self): - """Create the deployment namespace if it doesn't exist.""" + """Create the deployment namespace if it doesn't exist. + + Stamps the namespace with a `laconic.com/deployment-dir` + annotation so that a subsequent `deployment start` from a + different deployment dir — which would otherwise silently + patch this deployment's k8s resources in place — fails with + a clear error directing at the `namespace:` spec override. + """ if opts.o.dry_run: print(f"Dry run: would create namespace {self.k8s_namespace}") return + owner_key = "laconic.com/deployment-dir" + my_dir = str(Path(self.deployment_dir).resolve()) try: - self.core_api.read_namespace(name=self.k8s_namespace) - if opts.o.debug: - print(f"Namespace {self.k8s_namespace} already exists") + existing = self.core_api.read_namespace(name=self.k8s_namespace) except ApiException as e: - if e.status == 404: - # Create the namespace - ns = client.V1Namespace( - metadata=client.V1ObjectMeta( - name=self.k8s_namespace, - labels=self.cluster_info._stack_labels(), - ) - ) - self.core_api.create_namespace(body=ns) - if opts.o.debug: - print(f"Created namespace {self.k8s_namespace}") - else: + if e.status != 404: raise + existing = None + + if existing is None: + ns = client.V1Namespace( + metadata=client.V1ObjectMeta( + name=self.k8s_namespace, + labels=self.cluster_info._stack_labels(), + annotations={owner_key: my_dir}, + ) + ) + self.core_api.create_namespace(body=ns) + if opts.o.debug: + print( + f"Created namespace {self.k8s_namespace} " + f"owned by {my_dir}" + ) + return + + annotations = (existing.metadata.annotations or {}) if existing.metadata else {} + owner = annotations.get(owner_key) + if owner and owner != my_dir: + raise DeployerException( 
+ f"Namespace '{self.k8s_namespace}' is already owned by " + f"another deployment at:\n {owner}\n" + f"\nThis deployment is at:\n {my_dir}\n" + "\nTwo deployments of the same stack sharing a cluster " + "cannot share a namespace — all namespace-scoped " + "resources (Deployments, ConfigMaps, Services, PVCs) " + "would collide and silently patch each other.\n" + "\nFix: add an explicit `namespace:` override to this " + "deployment's spec.yml so it lands in its own " + "namespace. For example:\n" + f" namespace: {self.k8s_namespace}-2\n" + "\n(k8s namespace names must be lowercase alphanumeric " + "plus '-', start and end with an alphanumeric character, " + "≤63 chars.)" + ) + if not owner: + # Legacy namespace (pre-dates this check) or user-created. + # Adopt it by stamping the ownership annotation so + # subsequent conflicting deployments fail loudly. + patch = {"metadata": {"annotations": {owner_key: my_dir}}} + self.core_api.patch_namespace(name=self.k8s_namespace, body=patch) + if opts.o.debug: + print( + f"Adopted existing namespace {self.k8s_namespace} " + f"as owned by {my_dir}" + ) + elif opts.o.debug: + print(f"Namespace {self.k8s_namespace} already owned by {my_dir}") def _delete_namespace(self): """Delete the deployment namespace and all resources within it.""" @@ -786,6 +845,39 @@ class K8sDeployer(Deployer): } if local_images: load_images_into_kind(self.kind_cluster_name, local_images) + elif self.is_kind(): + # --skip-cluster-management (default): cluster must already exist. + # Without this check, connect_api() below raises a cryptic + # kubernetes.config.ConfigException when the context is missing. + existing = get_kind_cluster() + if existing is None: + raise DeployerException( + f"No kind cluster is running. This deployment expects " + f"cluster '{self.kind_cluster_name}' to exist.\n" + "\n" + "--skip-cluster-management is the default; pass " + "--perform-cluster-management to have laconic-so " + "create the cluster, or start it manually first."
+ ) + if existing != self.kind_cluster_name: + raise DeployerException( + f"Running kind cluster '{existing}' does not match the " + f"cluster-id '{self.kind_cluster_name}' in " + f"{self.deployment_dir}/deployment.yml.\n" + "\n" + "Fix by either:\n" + " - editing deployment.yml to set " + f"cluster-id: {existing}, or\n" + " - passing --perform-cluster-management to create a " + "fresh cluster (note: destroys the existing one if " + "names collide)." + ) + # Mount topology applies regardless of who owns cluster + # lifecycle — validate here too. + kind_config = str( + self.deployment_dir.joinpath(constants.kind_config_filename) + ) + check_mounts_compatible(existing, kind_config) self.connect_api() self._ensure_namespace() if self.is_kind() and not self.skip_cluster_management: diff --git a/stack_orchestrator/deploy/k8s/helpers.py b/stack_orchestrator/deploy/k8s/helpers.py index 9f0f2171..44d1e49c 100644 --- a/stack_orchestrator/deploy/k8s/helpers.py +++ b/stack_orchestrator/deploy/k8s/helpers.py @@ -15,11 +15,13 @@ from kubernetes import client, utils, watch from kubernetes.client.exceptions import ApiException +import json import os from pathlib import Path import subprocess import re -from typing import Set, Mapping, List, Optional, cast +import sys +from typing import Dict, Set, Mapping, List, Optional, cast import yaml from stack_orchestrator.util import get_k8s_dir, error_exit @@ -216,6 +218,174 @@ def _install_caddy_cert_backup( print("Installed caddy cert backup CronJob") +def _parse_kind_extra_mounts(config_file: str) -> List[Dict[str, str]]: + """Return the list of extraMounts declared in a kind config file.""" + try: + with open(config_file) as f: + config = yaml.safe_load(f) or {} + except (OSError, yaml.YAMLError) as e: + if opts.o.debug: + print(f"Could not parse kind config {config_file}: {e}") + return [] + mounts = [] + for node in config.get("nodes", []) or []: + for m in node.get("extraMounts", []) or []: + host_path = m.get("hostPath") + 
container_path = m.get("containerPath") + if host_path and container_path: + mounts.append( + {"hostPath": host_path, "containerPath": container_path} + ) + return mounts + + +def _get_control_plane_node(cluster_name: str) -> Optional[str]: + """Return the kind control-plane node container name for a cluster.""" + result = subprocess.run( + ["kind", "get", "nodes", "--name", cluster_name], + capture_output=True, + text=True, + ) + if result.returncode != 0: + return None + for line in result.stdout.splitlines(): + line = line.strip() + if line.endswith("control-plane"): + return line + return None + + +def _get_running_cluster_mounts(cluster_name: str) -> Dict[str, str]: + """Return {containerPath: hostPath} for bind mounts on the control-plane.""" + node = _get_control_plane_node(cluster_name) + if not node: + return {} + result = subprocess.run( + ["docker", "inspect", node, "--format", "{{json .Mounts}}"], + capture_output=True, + text=True, + ) + if result.returncode != 0: + return {} + try: + mounts = json.loads(result.stdout or "[]") + except json.JSONDecodeError: + return {} + return { + m["Destination"]: m["Source"] + for m in mounts + if m.get("Type") == "bind" and m.get("Destination") and m.get("Source") + } + + +def check_mounts_compatible(cluster_name: str, config_file: str) -> None: + """Fail if the new deployment's extraMounts aren't active on the cluster. + + Kind applies extraMounts only at cluster creation. When a deployment + joins an existing cluster, any extraMount its kind-config declares that + isn't already active on the running node will silently fall through to + the node's overlay filesystem — data looks persisted but is lost on + cluster destroy. Catch this up front. + """ + required = _parse_kind_extra_mounts(config_file) + if not required: + return + live = _get_running_cluster_mounts(cluster_name) + if not live: + # Could not inspect — don't block deployment, but warn. 
+ print( + f"WARNING: could not inspect mounts on cluster '{cluster_name}'; " + "skipping extraMount compatibility check", + file=sys.stderr, + ) + return + # File-level host-path binds (e.g. `./config/x.sh` from compose volumes) + # are emitted per-deployment with containerPath `/mnt/host-path-*` and + # source paths under each deployment's own directory. Two deployments + # of the same stack will always clash here — a pre-existing SO aliasing + # misfeature that's orthogonal to umbrella compatibility. Skip them so + # this check stays focused on the umbrella and named-volume data mounts + # it was designed for. + mismatches = [] + for m in required: + dest = m["containerPath"] + if dest.startswith("/mnt/host-path-"): + continue + want = m["hostPath"] + have = live.get(dest) + if have != want: + mismatches.append((dest, want, have)) + if not mismatches: + return + lines = [ + f"This deployment declares extraMounts incompatible with the " + f"running cluster '{cluster_name}':", + ] + for dest, want, have in mismatches: + lines.append( + f" - {dest}: expected host path '{want}', " + f"actual '{have or 'NOT MOUNTED'}'" + ) + lines.append("") + + cluster_umbrella = live.get("/mnt") + if cluster_umbrella: + lines.extend( + [ + f"The running cluster has an umbrella mount: " + f"'{cluster_umbrella}' -> /mnt.", + "", + f"Fix: set 'kind-mount-root: {cluster_umbrella}' in this " + "deployment's spec and place host paths for its volumes " + f"under '{cluster_umbrella}/'. Kind applies extraMounts " + "only at cluster creation, so new bind mounts cannot be " + "added to the running cluster without a recreate — but " + "the existing umbrella already covers any subdirectory " + "you create on the host.", + ] + ) + else: + lines.extend( + [ + "The running cluster has no umbrella mount " + "(no extraMount with containerPath=/mnt).", + "", + "Kind applies extraMounts only at cluster creation — " + "neither kind nor Docker supports adding bind mounts to " + "a running container. 
+ Without a recreate, any PV backed " + "by one of the missing mounts will silently fall through " + "to the node's overlay filesystem and lose data on " + "cluster destroy.", + "", + "Fix: destroy and recreate the cluster with a kind-config " + "that sets 'kind-mount-root' so future stacks can share " + "an umbrella without recreating.", + ] + ) + lines.append("") + lines.append("See docs/deployment_patterns.md.") + raise DeployerException("\n".join(lines)) + + +def _warn_if_no_umbrella(config_file: str) -> None: + """Warn if creating a cluster without a '/mnt' umbrella mount. + + Without an umbrella, future stacks joining this cluster that need new + host-path mounts will fail the compatibility check and require a full + cluster recreate to add them. + """ + mounts = _parse_kind_extra_mounts(config_file) + if any(m.get("containerPath") == "/mnt" for m in mounts): + return + print( + "WARNING: creating kind cluster without an umbrella mount " + "('kind-mount-root' not set). Future stacks added to this cluster " + "that require new host-path mounts will need a full cluster " + "recreate to add them. See docs/deployment_patterns.md.", + file=sys.stderr, + ) + + def create_cluster(name: str, config_file: str): """Create or reuse the single kind cluster for this host.
@@ -232,8 +402,10 @@ def create_cluster(name: str, config_file: str): existing = get_kind_cluster() if existing: print(f"Using existing cluster: {existing}") + check_mounts_compatible(existing, config_file) return existing + _warn_if_no_umbrella(config_file) print(f"Creating new cluster: {name}") result = _run_command(f"kind create cluster --name {name} --config {config_file}") if result.returncode != 0: @@ -435,7 +607,7 @@ def get_kind_pv_bind_mount_path( return f"/mnt/{volume_name}" -def volume_mounts_for_service(parsed_pod_files, service): +def volume_mounts_for_service(parsed_pod_files, service, deployment_dir=None): result = [] # Find the service for pod in parsed_pod_files: @@ -459,11 +631,24 @@ def volume_mounts_for_service(parsed_pod_files, service): mount_options = ( mount_split[2] if len(mount_split) == 3 else None ) - # For host path mounts, use sanitized name + sub_path = None + # For host path mounts, use sanitized name. + # When the source resolves to a single file, + # the auto-generated ConfigMap has one key + # (the file basename). Set subPath so the + # mount lands at the compose target as a + # single file, not as a directory with the + # key as a child entry. 
if is_host_path_mount(volume_name): k8s_volume_name = sanitize_host_path_to_volume_name( volume_name ) + if deployment_dir is not None: + abs_src = resolve_host_path_for_kind( + volume_name, deployment_dir + ) + if abs_src.is_file(): + sub_path = abs_src.name else: k8s_volume_name = volume_name if opts.o.debug: @@ -471,10 +656,12 @@ def volume_mounts_for_service(parsed_pod_files, service): print(f"k8s_volume_name: {k8s_volume_name}") print(f"mount path: {mount_path}") print(f"mount options: {mount_options}") + print(f"sub_path: {sub_path}") volume_device = client.V1VolumeMount( mount_path=mount_path, name=k8s_volume_name, read_only="ro" == mount_options, + sub_path=sub_path, ) result.append(volume_device) return result @@ -507,7 +694,11 @@ def volumes_for_pod_files(parsed_pod_files, spec, app_name): ) result.append(volume) - # Handle host path mounts from service volumes + # File-level and flat-dir host-path compose volumes flow through + # auto-generated ConfigMaps. Emit a ConfigMap-backed V1Volume so + # the pod reads from the namespace-scoped ConfigMap rather than + # a kind-node hostPath (which would alias across deployments + # sharing a cluster and not work on real k8s at all). 
if "services" in parsed_pod_file: services = parsed_pod_file["services"] for service_name in services: @@ -522,19 +713,19 @@ def volumes_for_pod_files(parsed_pod_files, spec, app_name): ) if sanitized_name not in seen_host_path_volumes: seen_host_path_volumes.add(sanitized_name) - # Create hostPath volume for mount inside kind node - kind_mount_path = get_kind_host_path_mount_path( - sanitized_name - ) - host_path_source = client.V1HostPathVolumeSource( - path=kind_mount_path, type="FileOrCreate" + config_map = client.V1ConfigMapVolumeSource( + name=f"{app_name}-{sanitized_name}", + default_mode=0o755, ) volume = client.V1Volume( - name=sanitized_name, host_path=host_path_source + name=sanitized_name, config_map=config_map ) result.append(volume) if opts.o.debug: - print(f"Created hostPath volume: {sanitized_name}") + print( + f"Created configmap-backed host-path " + f"volume: {sanitized_name}" + ) return result @@ -553,7 +744,6 @@ def _make_absolute_host_path(data_mount_path: Path, deployment_dir: Path) -> Pat def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context): volume_definitions = [] volume_host_path_map = _get_host_paths_for_volumes(deployment_context) - seen_host_path_mounts = set() # Track to avoid duplicate mounts kind_mount_root = deployment_context.spec.get_kind_mount_root() # When kind-mount-root is set, emit a single extraMount for the root. 
@@ -590,26 +780,12 @@ def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context): mount_path = mount_split[1] if is_host_path_mount(volume_name): - # Host path mount - add extraMount for kind - sanitized_name = sanitize_host_path_to_volume_name( - volume_name - ) - if sanitized_name not in seen_host_path_mounts: - seen_host_path_mounts.add(sanitized_name) - # Resolve path relative to compose directory - host_path = resolve_host_path_for_kind( - volume_name, deployment_dir - ) - container_path = get_kind_host_path_mount_path( - sanitized_name - ) - volume_definitions.append( - f" - hostPath: {host_path}\n" - f" containerPath: {container_path}\n" - f" propagation: HostToContainer\n" - ) - if opts.o.debug: - print(f"Added host path mount: {host_path}") + # File-level host-path binds (e.g. compose + # `../config/foo.sh:/opt/foo.sh`) flow + # through an auto-generated k8s ConfigMap at + # deploy start — no extraMount needed. See + # cluster_info.get_configmaps(). + continue else: # Named volume if opts.o.debug: diff --git a/tests/k8s-deploy/run-deploy-test.sh b/tests/k8s-deploy/run-deploy-test.sh index f16462ac..08a89c6a 100755 --- a/tests/k8s-deploy/run-deploy-test.sh +++ b/tests/k8s-deploy/run-deploy-test.sh @@ -147,7 +147,13 @@ deployment_spec_file=${test_deployment_dir}/spec.yml sed -i 's/^secrets: {}$/secrets:\n test-secret:\n - TEST_SECRET_KEY/' ${deployment_spec_file} # Get the deployment ID and namespace for kubectl queries -deployment_id=$(cat ${test_deployment_dir}/deployment.yml | cut -d ' ' -f 2) +# deployment-id is what flows into app_name → resource name prefix. +# Fall back to cluster-id for deployment.yml files written before the +# deployment-id field existed (pre-decouple compatibility). 
+deployment_id=$(awk '/^deployment-id:/ {print $2; exit}' ${test_deployment_dir}/deployment.yml) +if [ -z "$deployment_id" ]; then + deployment_id=$(awk '/^cluster-id:/ {print $2; exit}' ${test_deployment_dir}/deployment.yml) +fi # Namespace is derived from stack name: laconic-{stack_name} deployment_ns="laconic-test" @@ -166,6 +172,41 @@ for kind in serviceaccount role rolebinding cronjob; do done echo "caddy-cert-backup install test: passed" +# Host-path compose volumes (../config/test/script.sh, ../config/test/settings.env) +# should flow through auto-generated per-namespace ConfigMaps — no kind +# extraMount, no compose/spec rewriting. The pod mount lands via +# ConfigMap + subPath. +for cm_name in \ + "${deployment_id}-host-path-config-test-script-sh" \ + "${deployment_id}-host-path-config-test-settings-env"; do + if ! kubectl get configmap "$cm_name" -n "$deployment_ns" >/dev/null 2>&1; then + echo "host-path configmap test: ConfigMap $cm_name not found" + cleanup_and_exit + fi +done +echo "host-path configmap test: passed" + +# Deployment dir should be untouched — compose file still has the +# original host-path volume entries and no synthetic configmap dirs. +if ! grep -q '\.\./config/test/script\.sh:/opt/run\.sh' \ + "$test_deployment_dir/compose/docker-compose-test.yml"; then + echo "compose unchanged test: host-path volume entry missing" + cleanup_and_exit +fi +if [ -d "$test_deployment_dir/configmaps/host-path-config-test-script-sh" ]; then + echo "compose unchanged test: unexpected configmaps/host-path-* dir present" + cleanup_and_exit +fi +echo "compose unchanged test: passed" + +# kind-config.yml should NOT contain /mnt/host-path-* extraMounts — +# they are replaced by the ConfigMap mechanism. 
+if grep -q 'containerPath: /mnt/host-path-' "$test_deployment_dir/kind-config.yml"; then + echo "no-host-path-extramount test: FAILED" + cleanup_and_exit +fi +echo "no-host-path-extramount test: passed" + # Check logs command works wait_for_log_output sleep 1 diff --git a/tests/k8s-deployment-control/run-test.sh b/tests/k8s-deployment-control/run-test.sh index bf003228..4b5ffec7 100755 --- a/tests/k8s-deployment-control/run-test.sh +++ b/tests/k8s-deployment-control/run-test.sh @@ -185,8 +185,15 @@ node-tolerations: value: c EOF -# Get the deployment ID so we can generate low level kubectl commands later -deployment_id=$(cat ${test_deployment_dir}/deployment.yml | cut -d ' ' -f 2) +# cluster-id names the kind cluster (and its worker node names). +# deployment-id is what flows into app_name / resource name prefixes. +# Fall back to cluster-id for deployment.yml files written before the +# deployment-id field existed. +cluster_id=$(awk '/^cluster-id:/ {print $2; exit}' ${test_deployment_dir}/deployment.yml) +deployment_id=$(awk '/^deployment-id:/ {print $2; exit}' ${test_deployment_dir}/deployment.yml) +if [ -z "$deployment_id" ]; then + deployment_id=$cluster_id +fi # Try to start the deployment $TEST_TARGET_SO deployment --dir $test_deployment_dir start --perform-cluster-management @@ -208,7 +215,7 @@ fi # Get the node onto which the stack pod has been deployed # Namespace is now derived from stack name, not cluster-id deployment_node=$(kubectl get pods -n laconic-test -l app=${deployment_id} -o=jsonpath='{.items..spec.nodeName}') -expected_node=${deployment_id}-worker3 +expected_node=${cluster_id}-worker3 echo "Stack pod deployed to node: ${deployment_node}" if [[ ${deployment_node} == ${expected_node} ]]; then echo "deployment of pod test: passed"
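The extraMount compatibility rule in `check_mounts_compatible` reduces to a pure comparison between the mounts a kind config declares and the bind mounts `docker inspect` reports on the control-plane node. A minimal standalone sketch of that rule (the function name and the sample paths below are illustrative, not the module's actual API):

```python
from typing import Dict, List, Optional, Tuple


def find_mount_mismatches(
    required: List[Dict[str, str]],
    live: Dict[str, str],
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (containerPath, wanted hostPath, actual hostPath) for each
    required extraMount that is not active on the running cluster."""
    mismatches = []
    for m in required:
        dest = m["containerPath"]
        # Per-deployment file-level binds flow through ConfigMaps now,
        # so they are exempt from the extraMount check.
        if dest.startswith("/mnt/host-path-"):
            continue
        want = m["hostPath"]
        have = live.get(dest)
        if have != want:
            mismatches.append((dest, want, have))
    return mismatches


required = [
    {"hostPath": "/srv/kind", "containerPath": "/mnt"},
    {"hostPath": "/tmp/x.sh", "containerPath": "/mnt/host-path-x-sh"},
]
live = {"/mnt": "/srv/other"}
print(find_mount_mismatches(required, live))
# [('/mnt', '/srv/kind', '/srv/other')]
```

An empty result means the deployment can join the running cluster; a non-empty one is what gets rendered into the DeployerException message.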
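Similarly, the ConfigMap payload built in the first hunk is just one base64 entry per regular file, keyed by basename. A self-contained sketch of that loop (`configmap_data_for_host_path` is a hypothetical name; the real code lives inside ClusterInfo):

```python
import base64
import tempfile
from pathlib import Path


def configmap_data_for_host_path(abs_src: Path) -> dict:
    """One base64 entry per regular file, keyed by basename.

    A single-file source yields exactly one key (the file's own name),
    which is what lets the volume mount set subPath to that name so the
    file lands at the compose target path rather than as a directory.
    """
    data = {}
    if abs_src.is_file():
        data[abs_src.name] = base64.b64encode(abs_src.read_bytes()).decode("ascii")
    elif abs_src.is_dir():
        for entry in abs_src.iterdir():
            if entry.is_file():
                data[entry.name] = base64.b64encode(entry.read_bytes()).decode("ascii")
    return data


tmp = Path(tempfile.mkdtemp())
(tmp / "run.sh").write_bytes(b"#!/bin/sh\necho ok\n")
data = configmap_data_for_host_path(tmp / "run.sh")
print(sorted(data))  # ['run.sh']
print(base64.b64decode(data["run.sh"]))  # b'#!/bin/sh\necho ok\n'
```

Passing the parent directory instead of the file walks its immediate regular files, matching the flat-dir case in the hunk (no recursion into subdirectories).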