chore(pebbles): close so-l2l and so-076.2
Both resolved by the so-l2l Part A+B refactor on this branch: restart no longer calls down(); down() scopes cleanup by the app.kubernetes.io/stack label and keeps the namespace Active by default. A --delete-namespace flag is added for opt-in full teardown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 774b39836e
commit 1ea000e3c8
@@ -35,3 +35,7 @@
{"type":"comment","timestamp":"2026-04-15T06:18:08.436540869Z","issue_id":"so-076.2","payload":{"body":"Partially mitigated by commit cc6acd5f which flipped --skip-cluster-management default to true, so 'deployment stop' no longer destroys the cluster by default. Root fix still open: down() in deploy_k8s.py:904-936 unconditionally calls _delete_namespace() (line 929) and destroy_cluster() (line 936) when --perform-cluster-management is passed. No logic distinguishes shared vs dedicated clusters."}}
{"type":"comment","timestamp":"2026-04-15T06:18:11.374723274Z","issue_id":"so-l2l","payload":{"body":"Partially addressed. Readiness probes are now generated in cluster_info.py:652-671 (part C of the original fix). Parts A and B still open: up() does not use patch/apply (delete/recreate semantics remain), and down() still calls _delete_namespace() unconditionally at deploy_k8s.py:929 on every restart. A 'fix: never delete namespace on deployment down' commit (ae2cea34) exists on a remote branch but is not merged to main."}}
{"type":"create","timestamp":"2026-04-15T11:11:15.584733236Z","issue_id":"so-328","payload":{"description":"deployment restart runs create_operation(update=True) which uses copytree(dirs_exist_ok=True) to sync the stack repo into the deployment dir (deployment_create.py:1079, 1130). This is additive only — it overwrites and adds files, but never removes them. Two resulting gaps:\n\n1. Deletions don't propagate. If a script, configmap file, or compose service is removed from the stack repo, the deployment dir keeps it, and up_operation keeps applying it. The k8s ConfigMap retains removed keys; removed Deployments/Services are not cleaned up (up() is create/patch, not full reconciliation). Operators see stale files and orphan workloads that won't disappear without manual kubectl intervention or a full teardown.\n\n2. stack.yml structural changes don't auto-surface in the spec. If a stack.yml gains a new configmap declaration or a new compose file reference, restart doesn't pull it in unless the operator's spec.yml already references it. The spec is the contract, so additions to the stack aren't applied to live deployments just by pulling the repo.\n\nTeardown + redeploy is the only reliable way to clean up today, which is not practical in production.\n\nFix direction: create_operation(update=True) should treat the deployment dir as reconciled state — diff the desired tree (from the stack repo + spec) against what's on disk and remove files that no longer exist upstream. up_operation should then delete k8s resources (Deployments, Services, ConfigMaps) that are no longer declared by any compose/configmap source, likely scoped by an 'app.kubernetes.io/managed-by: laconic-so' label to avoid nuking unrelated resources. For new stack.yml entries, consider whether the spec needs operator action or whether restart can auto-detect and warn.","priority":"3","title":"deployment restart does not propagate repo deletions or new stack.yml entries","type":"bug"}}
{"type":"comment","timestamp":"2026-04-16T06:24:38.826132538Z","issue_id":"so-l2l","payload":{"body":"Fixed in so-l2l Parts A and B on this branch:\n\n**Part A (up() as create-or-update):** Deployments, Services, ConfigMaps, Secrets, Ingresses, and Endpoints already used create-or-replace in up(). Completed coverage by adding 409 skip-if-exists for Jobs (one-shot, re-run unwanted). Readiness probes (Part C) were already present.\n\n**Part B (down() preserves namespace):** _delete_labeled_resources now deletes by 'app.kubernetes.io/stack' label and keeps the namespace Active. Full-teardown option is a new --delete-namespace flag on stop/down. down() is synchronous (waits for resources to actually be gone before returning) so tests and ansible can assume clean state on return. Orphan PVs from prior --delete-namespace runs are also cleaned on subsequent stop --delete-volumes.\n\nrestart no longer calls down() at all (deployment.py:421-468), so the original wd-d92-style namespace termination race is structurally impossible. In-cluster rolling updates work via k8s native semantics when Deployment pod specs change via replace."}}
{"type":"close","timestamp":"2026-04-16T06:24:39.175431401Z","issue_id":"so-l2l","payload":{}}
{"type":"comment","timestamp":"2026-04-16T06:24:41.70556861Z","issue_id":"so-076.2","payload":{"body":"Fixed on chore/pebble-status-audit. stop now uses label-scoped cleanup (app.kubernetes.io/stack=\u003cstack\u003e) and keeps the namespace Active by default. The Kind cluster is not destroyed unless --perform-cluster-management is passed. Full namespace teardown is opt-in via the new --delete-namespace flag. Multiple stacks sharing a namespace/cluster are now cleaned up independently, not blown away en masse."}}
{"type":"close","timestamp":"2026-04-16T06:24:42.153940477Z","issue_id":"so-076.2","payload":{}}