Commit Graph

16 Commits (633feba3ea963b4794d366fc6ad85983e459530b)

Author SHA1 Message Date
pranav befb548a6e make deployments self-sufficient by copying hooks into deployment dir 2026-04-27 19:37:17 +05:30
prathamesh0 4977e3ff43
k8s: manage Caddy ingress image via spec (so-p3p) (#749)
Closes so-p3p:
- New spec key `caddy-ingress-image`: on fresh install, deploys Caddy with this image; on subsequent `deployment start`, patches the running Caddy Deployment if the image differs. Defaults to the manifest's hardcoded image when absent
- When the spec key is absent, SO does **not** touch a running Caddy — avoids silently reverting an image set out-of-band (ansible playbook, another deployment's spec)
- `strategy: Recreate` on the Caddy Deployment manifest (required — hostPort 80/443 deadlocks rolling updates)
- Reconcile runs under both `--perform-cluster-management` and the default `--skip-cluster-management` (it's a k8s-API patch, not a cluster-lifecycle op)
- Image template by container name rather than string match, so the spec override wins regardless of what the shipped manifest hardcodes
- Cluster-scoped caveat documented: `caddy-system` is shared across deployments, so the last `deployment start` that sets the key wins for everyone
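The patch decision described in the bullets above can be sketched as a pure function (function and argument names are illustrative, not the shipped API):

```python
from typing import Dict, Optional

def reconcile_caddy_image(spec: Dict, running_image: Optional[str]) -> Optional[str]:
    """Return the image to apply to the Caddy Deployment, or None for no-op."""
    desired = spec.get("caddy-ingress-image")
    if desired is None:
        # Spec key absent: leave a running Caddy untouched, so an image set
        # out-of-band (ansible playbook, another deployment's spec) is not
        # silently reverted.
        return None
    if running_image is None:
        # Fresh install: deploy Caddy with the spec image.
        return desired
    # Subsequent `deployment start`: patch only if the image actually differs.
    return desired if desired != running_image else None
```

The key property is the asymmetry between "key absent" (never touch) and "key present but equal" (also a no-op, but for a different reason).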
2026-04-21 14:40:39 +05:30
prathamesh0 421b83c430
k8s: shared-cluster safety checks and deployment-id decoupling (#748)
- **Kind extraMount compatibility**: fail fast at `deployment start` when a new deployment's mounts don't match the running cluster; warn when the first cluster is created without a `kind-mount-root` umbrella; replace the cryptic `ConfigException` with readable errors when the cluster is missing
- **Auto-ConfigMap for file-level host-path compose volumes** (so-7fc): `../config/foo.sh:/opt/foo.sh`-style binds become per-namespace ConfigMaps at deploy start instead of aliasing via the kind extraMount chain. `deploy create` rejects `:rw`, subdirs, and over-budget sources. Deployment-dir layout unchanged
- **Namespace ownership**: stamp the namespace with `laconic.com/deployment-dir` on create; fail loudly if another deployment tries to land in the same namespace. Pre-existing namespaces adopt ownership on next start
- **deployment-id / cluster-id decoupling**: split the two roles (kube context vs resource-name prefix) into separate `deployment.yml` fields. Backward-compat fallback keeps existing resource names stable
- Closes stale pebbles `so-n1n` and `so-ad7`
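The namespace-ownership rule above can be sketched as a check over the namespace's annotation dict (a minimal sketch; names beyond the `laconic.com/deployment-dir` key are assumptions):

```python
from typing import Dict

OWNER_KEY = "laconic.com/deployment-dir"  # stamp key from this commit

class NamespaceOwnershipError(RuntimeError):
    pass

def claim_namespace(annotations: Dict[str, str], deployment_dir: str) -> Dict[str, str]:
    """Adopt an unowned namespace, pass through our own, fail loudly on a conflict."""
    owner = annotations.get(OWNER_KEY)
    if owner is None:
        # Pre-existing namespace with no stamp adopts ownership on next start.
        return {**annotations, OWNER_KEY: deployment_dir}
    if owner != deployment_dir:
        raise NamespaceOwnershipError(
            f"namespace owned by {owner!r}; refusing to deploy {deployment_dir!r}")
    return dict(annotations)
```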
2026-04-21 12:17:28 +05:30
prathamesh0 eb4704b563
chore(pebbles): close so-o2o (#747)
Implementation shipped in PR #746. Woodburn migration (one-shot
host-kubectl export to seed the backup file) completed manually.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:40:12 +05:30
prathamesh0 1334900407
so-o2o: detect etcd image dynamically + diagnose whitelist cleanup bugs (#745)
Replaces the hardcoded `gcr.io/etcd-development/etcd:v3.5.9` in `_clean_etcd_keeping_certs` with a dynamic ref captured from the running Kind node via `crictl`, persisted to `{backup_dir}/etcd-image.txt` and reused on subsequent cleanup runs. Self-adapts to Kind upgrades, no version table to maintain.
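The capture-then-cache flow can be sketched as follows (the node name and the shape of the `crictl images` output scan are assumptions, not the shipped code):

```python
import subprocess
from pathlib import Path

def detect_etcd_image(backup_dir: str, node: str = "kind-control-plane") -> str:
    """Return the etcd image ref used by the Kind node, cached across cleanup runs."""
    cache = Path(backup_dir) / "etcd-image.txt"
    if cache.exists():
        # Reuse the ref persisted on a previous run.
        return cache.read_text().strip()
    # Ask the node's container runtime which etcd image it carries.
    out = subprocess.run(["docker", "exec", node, "crictl", "images"],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        cols = line.split()
        if cols and "etcd" in cols[0]:
            image = f"{cols[0]}:{cols[1]}"  # IMAGE and TAG columns
            break
    else:
        raise RuntimeError(f"no etcd image found on node {node}")
    cache.write_text(image + "\n")
    return image
```

Because the ref is read from whatever the node is actually running, a Kind upgrade changes the cached value on the next fresh capture rather than requiring a version table.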

Testing on Kind v0.32 / etcd 3.6 surfaced two additional bugs in the whitelist cleanup that this PR does **not** fix (see so-o2o comments):
(a) the restore step pipes raw protobuf values through bash `echo`, corrupting binary bytes;
(b) the whitelist omits cluster-admin RBAC, SAs, and bootstrap tokens needed by kubeadm's pre-addon health check.

Merging this narrow fix + diagnosis trail; follow-up branch will replace the etcd-surgery approach with a kubectl-level Caddy secret backup/restore.
2026-04-17 13:48:30 +05:30
prathamesh0 cf8b7533fe
so-ad7: build per-pod Service for maintenance container (#744)
- Maintenance-page swap during `restart` was broken: Ingress got patched to point at `{app_name}-{pod_name}-service` for the maintenance pod, but that Service was never created. Caddy had no valid backend, users saw "site cannot be reached" instead of the maintenance page
- Root cause: `get_services()` only builds per-pod Services for pods referenced by `http-proxy` routes; the maintenance pod has no http-proxy route by design
- Fix: `get_services()` now also includes the container named by `maintenance-service:` in the container-ports map, so its per-pod `Service` gets built and sits idle until the swap window
- Also files `so-b9a` (P4) noting the latent fragility in the resolver/builder contract
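The fix amounts to widening the set of pods that get a per-pod Service; a minimal sketch (helper name and argument shapes are illustrative):

```python
from typing import Dict, Set

def pods_with_services(http_proxy_targets: Set[str], spec: Dict) -> Set[str]:
    """Per-pod Services are built for http-proxy route targets plus the
    container named by `maintenance-service:`, so the maintenance Service
    already exists (idle) when restart swaps the Ingress over to it."""
    targets = set(http_proxy_targets)
    maintenance = spec.get("maintenance-service")
    if maintenance:
        targets.add(maintenance)
    return targets
```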
2026-04-16 15:07:25 +05:30
prathamesh0 fc5dc80058
so-l2l: in-place stop/restart via label-scoped cleanup (#743)
- `down()` scopes cleanup to a single stack via `app.kubernetes.io/stack` and keeps the namespace `Active` by default
- New `stop/down --delete-namespace` flag for opt-in full teardown
- `down()` is synchronous: it waits until resources are actually gone before returning, so callers can drop their own wait loops
- `up()` skip-if-exists for Jobs completes the create-or-replace coverage (other kinds already had it)
- Orphan PVs from a prior `stop --delete-namespace` get cleaned on the next `stop --delete-volumes`
- Every k8s resource SO creates now carries `app.kubernetes.io/stack` via a new `ClusterInfo._stack_labels()` helper
- Closes so-l2l, so-076.2. Also includes pebble audit: closes so-c71, so-b2b, so-k1k; files so-328
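The label-scoped teardown can be sketched as a filter over the cluster's resources (a model over plain dicts, not the live k8s API; only the `app.kubernetes.io/stack` label is from the commit):

```python
from typing import Dict, List

STACK_LABEL = "app.kubernetes.io/stack"

def scoped_down(resources: List[Dict], stack: str,
                delete_namespace: bool = False) -> List[Dict]:
    """Return the resources that survive a `down` scoped to one stack."""
    survivors = []
    for res in resources:
        if res["kind"] == "Namespace":
            # Namespace stays Active unless --delete-namespace was passed.
            if not delete_namespace:
                survivors.append(res)
            continue
        labels = res.get("metadata", {}).get("labels", {})
        if labels.get(STACK_LABEL) != stack:
            survivors.append(res)  # another stack's resource is untouched
    return survivors
```

This is why every resource SO creates needs the stack label: anything unlabeled is invisible to the scoped cleanup.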
2026-04-16 12:10:04 +05:30
prathamesh0 f40913d187
Fix Kind port mappings and configmap source path resolution (#742)
- Only map host ports for services with `network_mode: host` (80/443 for Caddy are always mapped). Previously all compose service ports were mapped unconditionally, causing conflicts with local services like postgres and redis
- Use spec configmap values as source paths instead of ignoring them. Fixes configmaps with user-defined paths (e.g. `stack-orchestrator/compose/maintenance`) and home-relative paths (e.g. `~/.credentials/local-certs/s3`)
- Read configmap files from deployment dir (`configmaps/{name}/`) when building k8s ConfigMap objects, not from the spec's source path which doesn't exist in the deployment dir
- Files pebbles: `so-c71` (resolved) and `so-078` (self-sufficient deployments: hooks should be copied to the deployment dir)
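The port-mapping rule in the first bullet can be sketched over a parsed compose dict (function name is illustrative):

```python
from typing import Dict, Set

def host_ports_for_kind(compose_services: Dict[str, Dict]) -> Set[int]:
    """Only network_mode: host services get their ports mapped on the Kind
    node; 80/443 for Caddy are always mapped."""
    ports: Set[int] = {80, 443}  # Caddy ingress, unconditional
    for svc in compose_services.values():
        if svc.get("network_mode") != "host":
            continue  # the old behavior mapped these too, clashing with
                      # local postgres/redis on the host
        for entry in svc.get("ports", []):
            host_part = str(entry).split(":")[0]  # "8080:80" -> 8080
            ports.add(int(host_part))
    return ports
```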
2026-04-14 17:33:47 +05:30
A. F. Dudley 549ac8c01d Merge fix/kind-mount-propagation: all local branches unified
Merges 6 local branches into main:
- enya: HostToContainer mount propagation for kind-mount-root
- fix/k8s-port-mappings-v5: port protocol parsing, namespace fix
- peirce: idempotent deploy (create-or-replace), update-envs rename
- prince: etcd cleanup whitelist
- wd-a7b: timestamp cluster IDs, stack-derived namespaces, jobs,
  multi-cert ingress, user secrets, _build_containers refactor
- fix/kind-mount-propagation: deployment prepare command, pebbles

Conflicts resolved keeping main's evolved multi-pod architecture
(get_deployments, per-pod Services, CA cert injection) while
incorporating branch additions (HostToContainer propagation,
user secrets, jobs support).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:26:05 +00:00
A. F. Dudley d50bd2b6d2 Merge wd-a7b: cluster-id/namespace naming, jobs, multi-cert, secrets
Combines timestamp-based cluster IDs, namespace derived from stack name,
_build_containers refactor, jobs support, multi-ingress certificates,
user-declared secrets, and label-based resource cleanup with the existing
idempotent deploy, mount propagation, and port mapping fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:22:07 +00:00
A. F. Dudley 7141dc7637 file so-p3p: laconic-so should manage Caddy ingress image lifecycle
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:30:46 +00:00
A. F. Dudley 24cf22fea5 File pebbles: mount propagation merge + etcd cert backup broken
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:01:20 +00:00
A. F. Dudley 967936e524 Multi-deployment: one k8s Deployment per pod in stack.yml
Each pod entry in stack.yml now creates its own k8s Deployment with
independent lifecycle and update strategy. Pods with PVCs get Recreate,
pods without get RollingUpdate. This enables maintenance services that
survive main pod restarts.

- cluster_info: get_deployments() builds per-pod Deployments, Services
- cluster_info: Ingress routes to correct per-pod Service
- deploy_k8s: _create_deployment() iterates all Deployments/Services
- deployment: restart swaps Ingress to maintenance service during Recreate
- spec: add maintenance-service key

Single-pod stacks are backward compatible (same resource names).
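The strategy selection described above can be sketched as follows (field names are illustrative; the PVC rule is the point — a volume cannot be attached to the old and new replica at once, so stateful pods must use Recreate):

```python
from typing import Dict, List

def pod_deployments(stack: Dict) -> List[Dict]:
    """One k8s Deployment per stack.yml pod, each with its own strategy."""
    out = []
    for pod in stack.get("pods", []):
        # Pods holding PVCs get Recreate; stateless pods roll.
        strategy = "Recreate" if pod.get("volumes") else "RollingUpdate"
        out.append({"name": pod["name"], "strategy": strategy})
    return out
```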

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:40:45 +00:00
A. F. Dudley 25e5ff09d9 so-m3m: add credentials-files spec key for on-disk credential injection
_write_config_file() now reads each file listed under the credentials-files
top-level spec key and appends its contents to config.env after config vars.
Paths support ~ expansion. Missing files fail hard with sys.exit(1).

Also adds get_credentials_files() to Spec class following the same pattern
as get_image_registry_config().
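The append step can be sketched as a list transformation (a minimal sketch; the real `_write_config_file()` writes to disk rather than returning lines):

```python
import os
import sys
from typing import List

def append_credentials(config_lines: List[str], cred_files: List[str]) -> List[str]:
    """Append each credentials file's contents after the config vars.
    Paths support ~ expansion; a missing file fails hard."""
    out = list(config_lines)
    for entry in cred_files:
        path = os.path.expanduser(entry)
        if not os.path.isfile(path):
            print(f"Error: credentials file not found: {path}", file=sys.stderr)
            sys.exit(1)
        with open(path) as f:
            out.extend(f.read().splitlines())
    return out
```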

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 21:55:28 +00:00
A. F. Dudley e5a8ec5f06 fix: rename registry secret to image-pull-secret
The secret name `{app}-registry` is ambiguous — it could be a container
registry credential or a Laconic registry config. Rename to
`{app}-image-pull-secret` which clearly describes its purpose as a
Kubernetes imagePullSecret for private container registries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:33:11 +00:00
A. F. Dudley 9c5b8e3f4e chore: initialize pebbles issue tracker
Track stack-orchestrator work items with pebbles (append-only event log).

Epic so-076: Stack composition — deploy multiple stacks into one kind cluster
with independent lifecycle management per sub-stack.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 06:56:25 +00:00