fix: ashburn relay playbooks and document DZ tunnel ACL root cause

Playbook fixes from testing:

- ashburn-relay-biscayne: insert DNAT rules at position 1 before Docker's ADDRTYPE LOCAL rule (was being swallowed at position 3+)
- ashburn-relay-mia-sw01: add inbound route for 137.239.194.65 via egress-vrf vrf1 (nexthop only, no interface — EOS silently drops cross-VRF routes that specify a tunnel interface)
- ashburn-relay-was-sw01: replace PBR with static route, remove Loopback101

Bug doc (bug-ashburn-tunnel-port-filtering.md): root cause is that the DoubleZero agent on mia-sw01 overwrites the SEC-USER-500-IN ACL, dropping outbound gossip with src 137.239.194.65. The DZ agent controls Tunnel500's lifecycle. Fix requires a separate GRE tunnel using mia-sw01's free LAN IP (209.42.167.137) to bypass DZ infrastructure.

Also adds all repo docs, scripts, inventory, and remaining playbooks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 6841d5e3c3
commit 0b52fc99d7
@ -0,0 +1,3 @@
.venv/
sessions.duckdb
sessions.duckdb.wal
@ -0,0 +1,204 @@
# Biscayne Agave Runbook

## Cluster Operations

### Shutdown Order

The agave validator runs inside a kind-based k8s cluster managed by `laconic-so`.
The kind node is a Docker container. **Never restart or kill the kind node container
while the validator is running.** Agave uses `io_uring` for async I/O, and on ZFS,
killing the process can produce unkillable kernel threads (D-state in
`io_wq_put_and_exit` blocked on ZFS transaction commits). This deadlocks the
container's PID namespace, making `docker stop`, `docker restart`, `docker exec`,
and even `reboot` hang.

Correct shutdown sequence:

1. Scale the deployment to 0 and wait for the pod to terminate:
```
kubectl scale deployment laconic-70ce4c4b47e23b85-deployment \
  -n laconic-laconic-70ce4c4b47e23b85 --replicas=0
kubectl wait --for=delete pod -l app=laconic-70ce4c4b47e23b85-deployment \
  -n laconic-laconic-70ce4c4b47e23b85 --timeout=120s
```
2. Only then restart the kind node if needed:
```
docker restart laconic-70ce4c4b47e23b85-control-plane
```
3. Scale back up:
```
kubectl scale deployment laconic-70ce4c4b47e23b85-deployment \
  -n laconic-laconic-70ce4c4b47e23b85 --replicas=1
```

### Ramdisk

The accounts directory must be on a ramdisk for performance. `/dev/ram0` loses its
filesystem on reboot and must be reformatted before mounting.

**Boot ordering is handled by systemd units** (installed by `biscayne-boot.yml`):
- `format-ramdisk.service`: runs `mkfs.xfs -f /dev/ram0` before `local-fs.target`
- fstab entry: mounts `/dev/ram0` at `/srv/solana/ramdisk` with
  `x-systemd.requires=format-ramdisk.service`
- `ramdisk-accounts.service`: creates `/srv/solana/ramdisk/accounts` and sets
  ownership after the mount

These units run before docker, so the kind node's bind mounts always see the
ramdisk. **No manual intervention is needed after reboot.**
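
A sketch of what the format unit and fstab entry look like — the unit contents here are an assumption for illustration; the authoritative versions are whatever `biscayne-boot.yml` installs:

```ini
# /etc/systemd/system/format-ramdisk.service (sketch, not the installed unit)
[Unit]
Description=Format /dev/ram0 before local filesystems mount
DefaultDependencies=no
Before=local-fs.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/mkfs.xfs -f /dev/ram0

[Install]
WantedBy=local-fs.target
```

```
# /etc/fstab entry (sketch): x-systemd.requires orders the mount after the format
/dev/ram0  /srv/solana/ramdisk  xfs  defaults,x-systemd.requires=format-ramdisk.service  0 0
```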

**Mount propagation**: The kind node bind-mounts `/srv/kind` → `/mnt`. Because
the ramdisk is mounted at `/srv/solana/ramdisk` and symlinked/overlaid through
`/srv/kind/solana/ramdisk`, mount propagation makes it visible inside the kind
node at `/mnt/solana/ramdisk` without restarting the kind node. **Do NOT restart
the kind node just to pick up a ramdisk mount.**

### KUBECONFIG

kubectl must be told where the kubeconfig is when running as root or via ansible:
```
KUBECONFIG=/home/rix/.kube/config kubectl ...
```

The ansible playbooks set `environment: KUBECONFIG: /home/rix/.kube/config`.

### SSH Agent

SSH to biscayne goes through a ProxyCommand jump host (abernathy.ch2.vaasl.io).
The SSH agent socket rotates when the user reconnects. Find the current one:
```
ls -t /tmp/ssh-*/agent.* | head -1
```
Then export it:
```
export SSH_AUTH_SOCK=/tmp/ssh-XXXX/agent.NNNN
```
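
The two steps above can be collapsed into one line — a sketch that assumes the most recently modified socket belongs to the live agent (usually true after a reconnect):

```shell
# Pick the newest agent socket; empty if no agent socket exists yet
export SSH_AUTH_SOCK=$(ls -t /tmp/ssh-*/agent.* 2>/dev/null | head -1)
echo "SSH_AUTH_SOCK=$SSH_AUTH_SOCK"
```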

### io_uring/ZFS Deadlock — Root Cause

When agave-validator is killed while performing I/O against ZFS-backed paths (not
the ramdisk), io_uring worker threads get stuck in D-state:
```
io_wq_put_and_exit → dsl_dir_tempreserve_space (ZFS module)
```
These threads are unkillable (SIGKILL has no effect on D-state processes). They
prevent the container's PID namespace from being reaped (`zap_pid_ns_processes`
waits forever), which breaks `docker stop`, `docker restart`, `docker exec`, and
even `reboot`. The only fix is a hard power cycle.
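
To check whether the host is already in this state, list uninterruptible tasks and the kernel symbol they are blocked in — stuck io_uring workers show up as D-state threads (e.g. in `io_wq_put_and_exit`):

```shell
# Header plus any task whose state starts with D (uninterruptible sleep);
# wchan shows the kernel function the task is blocked in
ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'
```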

**Prevention**: Always scale the deployment to 0 and wait for the pod to terminate
before any destructive operation (namespace delete, kind restart, host reboot).
The `biscayne-stop.yml` playbook enforces this.

### laconic-so Architecture

`laconic-so` manages kind clusters atomically — `deployment start` creates the
kind cluster, namespace, PVs, PVCs, and deployment in one shot. There is no way
to create the cluster without deploying the pod.

Key code paths in stack-orchestrator:
- `deploy_k8s.py:up()` — creates everything atomically
- `cluster_info.py:get_pvs()` — translates host paths using `kind-mount-root`
- `helpers_k8s.py:get_kind_pv_bind_mount_path()` — strips `kind-mount-root`
  prefix and prepends `/mnt/`
- `helpers_k8s.py:_generate_kind_mounts()` — when `kind-mount-root` is set,
  emits a single `/srv/kind` → `/mnt` mount instead of individual mounts

The `kind-mount-root: /srv/kind` setting in `spec.yml` means all data volumes
whose host paths start with `/srv/kind` get translated to `/mnt/...` inside the
kind node via a single bind mount.
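
The path translation can be sketched as a few lines of python — this mirrors what `get_kind_pv_bind_mount_path()` is described as doing above, not the actual stack-orchestrator source:

```python
def kind_pv_bind_mount_path(host_path: str, kind_mount_root: str = "/srv/kind") -> str:
    """Sketch of the kind-mount-root translation: strip the configured
    prefix from the host path and prepend /mnt (the kind node's mount of it)."""
    if host_path.startswith(kind_mount_root):
        return "/mnt" + host_path[len(kind_mount_root):]
    # Paths outside kind-mount-root are left untranslated
    return host_path

print(kind_pv_bind_mount_path("/srv/kind/solana/ledger"))  # /mnt/solana/ledger
```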

### Key Identifiers

- Kind cluster: `laconic-70ce4c4b47e23b85`
- Namespace: `laconic-laconic-70ce4c4b47e23b85`
- Deployment: `laconic-70ce4c4b47e23b85-deployment`
- Kind node container: `laconic-70ce4c4b47e23b85-control-plane`
- Deployment dir: `/srv/deployments/agave`
- Snapshot dir: `/srv/solana/snapshots`
- Ledger dir: `/srv/solana/ledger`
- Accounts dir: `/srv/solana/ramdisk/accounts`
- Log dir: `/srv/solana/log`
- Host bind mount root: `/srv/kind` -> kind node `/mnt`
- laconic-so: `/home/rix/.local/bin/laconic-so` (editable install)

### PV Mount Paths (inside kind node)

| PV Name             | hostPath                     |
|---------------------|------------------------------|
| validator-snapshots | /mnt/solana/snapshots        |
| validator-ledger    | /mnt/solana/ledger           |
| validator-accounts  | /mnt/solana/ramdisk/accounts |
| validator-log       | /mnt/solana/log              |

### Snapshot Freshness

If the snapshot is more than **20,000 slots behind** the current mainnet tip, it is
too old. Stop the validator, download a fresh snapshot, and restart. Do NOT let it
try to catch up from an old snapshot — it will take too long and may never converge.

Check with:
```
# Snapshot slot (from filename)
ls /srv/solana/snapshots/snapshot-*.tar.*

# Current mainnet slot
curl -s -X POST -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"getSlot","params":[{"commitment":"finalized"}]}' \
  https://api.mainnet-beta.solana.com
```
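
The staleness check can be scripted — a sketch that parses the slot out of a full-snapshot filename (the `snapshot-<slot>-<hash>.tar.zst` naming convention; incremental snapshots use a different prefix and are not handled here):

```python
import re

def snapshot_slot(filename: str) -> int:
    """Extract the slot number from a full agave snapshot filename,
    e.g. snapshot-312345678-<hash>.tar.zst -> 312345678."""
    m = re.match(r"snapshot-(\d+)-", filename)
    if not m:
        raise ValueError(f"not a full snapshot filename: {filename}")
    return int(m.group(1))

def too_stale(snapshot_file: str, current_slot: int, max_lag: int = 20_000) -> bool:
    """True if the snapshot is more than max_lag slots behind the mainnet tip."""
    return current_slot - snapshot_slot(snapshot_file) > max_lag
```

Feed `current_slot` from the `getSlot` RPC call above.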

### Snapshot Leapfrog Recovery

When the validator is stuck in a repair-dependent gap (incomplete shreds from a
relay outage or insufficient turbine coverage), "grinding through" doesn't work.
At 0.4 slots/sec replay through incomplete blocks vs 2.5 slots/sec chain
production, the gap grows faster than it shrinks.
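
The arithmetic is worth making explicit — with the rates above, a net +2.1 slots/sec means the gap widens by roughly 7,500 slots per hour of "catching up":

```python
def gap_after(initial_gap: int, replay_rate: float,
              production_rate: float, seconds: float) -> float:
    """Slots behind after `seconds` of catch-up; a positive net rate
    (production faster than replay) means the gap grows, never shrinks."""
    return initial_gap + (production_rate - replay_rate) * seconds

# A 10k-slot gap after one hour at 0.4 slots/s replay vs 2.5 slots/s production
print(gap_after(10_000, 0.4, 2.5, 3600))
```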

**Strategy**: Download a fresh snapshot whose slot lands *past* the incomplete zone,
into the range where turbine+relay shreds are accumulating in the blockstore.
**Keep the existing ledger** — it has those shreds. The validator replays from
local blockstore data instead of waiting on repair.

**Steps**:
1. Let the validator run — turbine+relay accumulate shreds at the tip
2. Monitor shred completeness at the tip:
   `scripts/check-shred-completeness.sh 500`
3. When there's a contiguous run of complete blocks (>100 slots), note the
   starting slot of that run
4. Scale to 0, wipe accounts (ramdisk), wipe old snapshots
5. **Do NOT wipe ledger** — it has the turbine shreds
6. Download a fresh snapshot (its slot should be within the complete run)
7. Scale to 1 — validator replays from local blockstore at 3-5 slots/sec

**Why this works**: Turbine delivers ~60% of shreds in real-time. Repair fills
the rest for recent slots quickly (peers prioritize recent data). The only
problem is repair for *old* slots (minutes/hours behind), which peers deprioritize.
By snapshotting past the gap, we skip the old-slot repair bottleneck entirely.

### Shred Relay (Ashburn)

The TVU shred relay from laconic-was-sw01 provides ~4,000 additional shreds/sec.
Without it, turbine alone delivers ~60% of blocks. With it, completeness improves
but still requires repair for full coverage.

**Current state**: Old pipeline (monitor session + socat + shred-unwrap.py).
The traffic-policy redirect was never committed (auto-revert after 5 min timer).
See `docs/tvu-shred-relay.md` for the traffic-policy config that needs to be
properly applied.

**Boot dependency**: `shred-unwrap.py` must be running on biscayne for the old
pipeline to work. It is NOT persistent across reboots. The iptables DNAT rule
for the new pipeline IS persistent (iptables-persistent installed).

### Redeploy Flow

See `playbooks/biscayne-redeploy.yml`. The scale-to-0 pattern is required because
`laconic-so` creates the cluster and deploys the pod atomically:

1. Delete namespace (teardown)
2. Optionally wipe data
3. `laconic-so deployment start` (creates cluster + pod)
4. Immediately scale to 0
5. Download snapshot via aria2c
6. Scale to 1
7. Verify
@ -0,0 +1,3 @@
# biscayne-agave-runbook

Ansible playbooks for operating the kind-based agave-stack deployment on biscayne.vaasl.io.
@ -0,0 +1,13 @@
[defaults]
inventory = inventory/
stdout_callback = ansible.builtin.default
result_format = yaml
callbacks_enabled = profile_tasks
retry_files_enabled = false

[privilege_escalation]
become = true
become_method = sudo

[ssh_connection]
pipelining = true
@ -0,0 +1,114 @@
# Arista EOS Reference Notes

Collected from live switch CLI (`?` help) and Arista documentation search
results. Switch platform: 7280CR3A, EOS 4.34.0F.

## PBR (Policy-Based Routing)

EOS uses `policy-map type pbr` — NOT `traffic-policy` (which is a different
feature for ASIC-level traffic policies, not available on all platforms/modes).

### Syntax

```
! ACL to match traffic
ip access-list <ACL-NAME>
   10 permit <proto> <src> <dst> [ports]

! Class-map referencing the ACL
class-map type pbr match-any <CLASS-NAME>
   match ip access-group <ACL-NAME>

! Policy-map with nexthop redirect
policy-map type pbr <POLICY-NAME>
   class <CLASS-NAME>
      set nexthop <A.B.C.D>            ! direct nexthop IP
      set nexthop recursive <A.B.C.D>  ! recursive resolution
      ! set nexthop-group <NAME>       ! nexthop group
      ! set ttl <value>                ! TTL override

! Apply on interface
interface <INTF>
   service-policy type pbr input <POLICY-NAME>
```

### PBR `set` options (from CLI `?`)

```
set ?
  nexthop        Next hop IP address for forwarding
  nexthop-group  next hop group name
  ttl            TTL effective with nexthop/nexthop-group
```

```
set nexthop ?
  A.B.C.D          next hop IP address
  A:B:C:D:E:F:G:H  next hop IPv6 address
  recursive        Enable Recursive Next hop resolution
```

**No VRF qualifier on `set nexthop`.** The nexthop must be reachable in the
VRF where the policy is applied. For cross-VRF PBR, use a static inter-VRF
route to make the nexthop reachable (see below).

## Static Inter-VRF Routes

Source: [EOS 4.34.0F - Static Inter-VRF Route](https://www.arista.com/en/um-eos/eos-static-inter-vrf-route)

Allows configuring a static route in one VRF with a nexthop evaluated in a
different VRF. Uses the `egress-vrf` keyword.

### Syntax

```
ip route vrf <ingress-vrf> <prefix>/<mask> egress-vrf <egress-vrf> <nexthop-ip>
ip route vrf <ingress-vrf> <prefix>/<mask> egress-vrf <egress-vrf> <interface>
```

### Examples (from Arista docs)

```
! Route in vrf1 with nexthop resolved in default VRF
ip route vrf vrf1 1.0.1.0/24 egress-vrf default 1.0.0.2

! show ip route vrf vrf1 output:
! S 1.0.1.0/24 [1/0] via 1.0.0.2, Vlan2180 (egress VRF default)
```

### Key points

- For bidirectional traffic, static inter-VRF routes must be configured in
  both VRFs.
- ECMP next-hop sets across same or heterogeneous egress VRFs are supported.
- The `show ip route vrf` output displays the egress VRF name when it differs
  from the source VRF.

## Inter-VRF Local Route Leaking

Source: [EOS 4.35.1F - Inter-VRF Local Route Leaking](https://www.arista.com/en/um-eos/eos-inter-vrf-local-route-leaking)

An alternative to static inter-VRF routes that leaks routes dynamically from
one VRF (source) to another VRF (destination) on the same router.

## Config Sessions

```
configure session <name>       ! enter named session
show session-config diffs      ! MUST be run from inside the session
commit timer HH:MM:SS          ! commit with auto-revert timer
abort                          ! discard session
```

From enable mode:
```
configure session <name> commit   ! finalize a pending session
```
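
Putting the two together, a typical safe-change workflow looks like the following — the session name is illustrative:

```
configure session risky-change        ! stage the change in a named session
   ! ...configuration changes...
show session-config diffs             ! review from inside the session
commit timer 00:05:00                 ! apply; auto-reverts in 5 min if not confirmed
! verify traffic is healthy, then from enable mode:
configure session risky-change commit ! make the change permanent
```

If verification fails, simply let the timer expire (or `abort`) and the running-config reverts on its own.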

## Checkpoints and Rollback

```
configure checkpoint save <name>
rollback running-config checkpoint <name>
write memory
```
File diff suppressed because it is too large
@ -0,0 +1,181 @@
<!-- Source: https://www.arista.com/um-eos/eos-ingress-and-egress-per-port-for-ipv4-and-ipv6-counters -->
<!-- Scraped: 2026-03-06T20:50:41.080Z -->

# Ingress and Egress Per-Port for IPv4 and IPv6 Counters

This feature supports per-interface ingress and egress packet and byte counters for IPv4
and IPv6.

This section describes ingress and egress per-port IPv4 and IPv6 counters, including
configuration instructions and command descriptions.

Topics covered by this chapter include:

- Configuration
- Show commands
- Dedicated ARP Entry for TX IPv4 and IPv6 Counters
- Considerations

## Configuration

IPv4 and IPv6 ingress counters (count **bridged and routed**
traffic, supported only on front-panel ports) can be enabled and disabled using the
**hardware counter feature ip in** command:

```
[no] hardware counter feature ip in
```

For IPv4 and IPv6 ingress and egress counters that include only
**routed** traffic (supported on Layer 3 interfaces such as
routed ports and L3 subinterfaces only), use the following commands:

Note: The DCS-7300X, DCS-7250X, DCS-7050X, and DCS-7060X platforms
do not require configuration for IPv4 and IPv6 packet counters for only routed
traffic. They are collected by default. Other platforms (DCS-7280SR, DCS-7280CR, and
DCS-7500-R) need the feature enabled.

```
[no] hardware counter feature ip in layer3
```

```
[no] hardware counter feature ip out layer3
```

### hardware counter feature ip

Use the **hardware counter feature ip** command to enable ingress
and egress counters at Layer 3. The **no** and **default** forms of the command
disable the feature. The feature is enabled by default.

**Command Mode**

Configuration mode

**Command Syntax**

**hardware counter feature ip in|out layer3**

**no hardware counter feature ip in|out layer3**

**default hardware counter feature ip in|out layer3**

**Example**

This example enables ingress and egress IP counters for Layer 3.
```
switch(config)# hardware counter feature ip in layer3
```

```
switch(config)# hardware counter feature ip out layer3
```

## Show commands

Use the **show interfaces counters ip** command to
display IPv4 and IPv6 packets and octets.

**Example**

```
switch# show interfaces counters ip
Interface     IPv4InOctets    IPv4InPkts    IPv6InOctets    IPv6InPkts
Et1/1         0               0             0               0
Et1/2         0               0             0               0
Et1/3         0               0             0               0
Et1/4         0               0             0               0
...
Interface     IPv4OutOctets   IPv4OutPkts   IPv6OutOctets   IPv6OutPkts
Et1/1         0               0             0               0
Et1/2         0               0             0               0
Et1/3         0               0             0               0
Et1/4         0               0             0               0
...
```

You can also query the output of the **show interfaces counters ip**
command through SNMP via the ARISTA-IP-MIB.

To clear the IPv4 or IPv6 counters, use the **clear counters** command.

**Example**
```
switch# clear counters
```

## Dedicated ARP Entry for TX IPv4 and IPv6 Counters

IPv4/IPv6 egress Layer 3 (**hardware counter feature ip out layer3**)
counting on the DCS-7280SR, DCS-7280CR, and DCS-7500-R platforms works based on the
ARP entry of the next hop. By default, the IPv4 next hop and the IPv6 next hop
resolve to the same MAC address and interface, sharing one ARP entry.

To differentiate the counters between IPv4 and IPv6, disable
ARP entry sharing with the following command:

```
ip hardware fib next-hop arp dedicated
```

Note: This command is required for IPv4 and IPv6 egress counters
to operate on the DCS-7280SR, DCS-7280CR, and DCS-7500-R platforms.

## Considerations

- Packet sizes greater than 9236 bytes are not counted by per-port IPv4 and IPv6 counters.
- Only the DCS-7260X3, DCS-7368, DCS-7300, DCS-7050SX3, DCS-7050CX3, DCS-7280SR,
  DCS-7280CR and DCS-7500-R platforms support the **hardware counter feature ip in** command.
- Only the DCS-7280SR, DCS-7280CR and DCS-7500-R platforms support the
  **hardware counter feature ip [in|out] layer3** command.
@ -0,0 +1,305 @@
|
||||||
|
<!-- Source: https://www.arista.com/en/um-eos/eos-inter-vrf-local-route-leaking -->
|
||||||
|
<!-- Scraped: 2026-03-06T20:43:28.363Z -->
|
||||||
|
|
||||||
|
# Inter-VRF Local Route Leaking
|
||||||
|
|
||||||
|
|
||||||
|
Inter-VRF local route leaking allows the leaking of routes from one VRF (the source VRF) to
|
||||||
|
another VRF (the destination VRF) on the same router.
|
||||||
|
Inter-VRF routes can exist in any VRF (including the
|
||||||
|
default VRF) on the system. Routes can be leaked using the
|
||||||
|
following methods:
|
||||||
|
|
||||||
|
- Inter-VRF Local Route Leaking using BGP
|
||||||
|
VPN
|
||||||
|
|
||||||
|
- Inter-VRF Local Route Leaking using VRF-leak
|
||||||
|
Agent
|
||||||
|
|
||||||
|
|
||||||
|
## Inter-VRF Local Route Leaking using BGP VPN
|
||||||
|
|
||||||
|
|
||||||
|
Inter-VRF local route leaking allows the user to export and import routes from one VRF to another
|
||||||
|
on the same device. This is implemented by exporting routes from a VRF to the local VPN table
|
||||||
|
using the route target extended community list and importing the same route target extended
|
||||||
|
community lists from the local VPN table into the target VRF. VRF route leaking is supported
|
||||||
|
on VPN-IPv4, VPN-IPv6, and EVPN types.
|
||||||
|
|
||||||
|
|
||||||
|
Figure 1. Inter-VRF Local Route Leaking using Local VPN Table
|
||||||
|
|
||||||
|
|
||||||
|
### Accessing Shared Resources Across VPNs
|
||||||
|
|
||||||
|
|
||||||
|
To access shared resources across VPNs, all the routes from the shared services VRF must be
|
||||||
|
leaked into each of the VPN VRFs, and customer routes must be leaked into the shared
|
||||||
|
services VRF for return traffic. Accessing shared resources allows the route target of the
|
||||||
|
shared services VRF to be exported into all customer VRFs, and allows the shared services
|
||||||
|
VRF to import route targets from customers A and B. The following figure shows how to
|
||||||
|
provide customers, corresponding to multiple VPN domains, access to services like DHCP
|
||||||
|
available in the shared VRF.
|
||||||
|
|
||||||
|
|
||||||
|
Route leaking across the VRFs is supported
|
||||||
|
on VPN-IPv4, VPN-IPv6, and EVPN.
|
||||||
|
|
||||||
|
|
||||||
|
Figure 2. Accessing Shared Resources Across VPNs
|
||||||
|
|
||||||
|
|
||||||
|
### Configuring Inter-VRF Local Route Leaking
|
||||||
|
|
||||||
|
|
||||||
|
Inter-VRF local route leaking is configured using VPN-IPv4, VPN-IPv6, and EVPN. Prefixes can be
|
||||||
|
exported and imported using any of the configured VPN types. Ensure that the same VPN
|
||||||
|
type that is exported is used while importing.
|
||||||
|
|
||||||
|
|
||||||
|
Leaking unicast IPv4 or IPv6 prefixes is supported and achieved by exporting prefixes locally to
|
||||||
|
the VPN table and importing locally from the VPN table into the target VRF on the same
|
||||||
|
device as shown in the figure titled **Inter-VRF Local Route Leaking using Local VPN
|
||||||
|
Table** using the **route-target** command.
|
||||||
|
|
||||||
|
|
||||||
|
Exporting or importing the routes to or from the EVPN table is accomplished with the following
|
||||||
|
two methods:
|
||||||
|
|
||||||
|
- Using VXLAN for encapsulation
|
||||||
|
|
||||||
|
- Using MPLS for encapsulation
|
||||||
|
|
||||||
|
|
||||||
|
#### Using VXLAN for Encapsulation
|
||||||
|
|
||||||
|
|
||||||
|
To use VXLAN encapsulation type, make sure that VRF to VNI mapping is present and the interface
|
||||||
|
status for the VXLAN interface is up. This is the default encapsulation type for
|
||||||
|
EVPN.
|
||||||
|
|
||||||
|
|
||||||
|
**Example**
|
||||||
|
|
||||||
|
|
||||||
|
The configuration for VXLAN encapsulation type is as
|
||||||
|
follows:
|
||||||
|
```
|
||||||
|
`switch(config)# **router bgp 65001**
|
||||||
|
switch(config-router-bgp)# **address-family evpn**
|
||||||
|
switch(config-router-bgp-af)# **neighbor default encapsulation VXLAN next-hop-self source-interface Loopback0**
|
||||||
|
switch(config)# **hardware tcam**
|
||||||
|
switch(config-hw-tcam)# **system profile VXLAN-routing**
|
||||||
|
switch(config-hw-tcam)# **interface VXLAN1**
|
||||||
|
switch(config-hw-tcam-if-Vx1)# **VXLAN source-interface Loopback0**
|
||||||
|
switch(config-hw-tcam-if-Vx1)# **VXLAN udp-port 4789**
|
||||||
|
switch(config-hw-tcam-if-Vx1)# **VXLAN vrf vrf-blue vni 20001**
|
||||||
|
switch(config-hw-tcam-if-Vx1)# **VXLAN vrf vrf-red vni 10001**`
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
#### Using MPLS for Encapsulation
|
||||||
|
|
||||||
|
|
||||||
|
To use MPLS encapsulation type to export
|
||||||
|
to the EVPN table, MPLS needs to be enabled globally on the device and
|
||||||
|
the encapsulation method needs to be changed from default type, that
|
||||||
|
is VXLAN to MPLS under the EVPN address-family sub-mode.
|
||||||
|
|
||||||
|
|
||||||
|
**Example**
|
||||||
|
```
|
||||||
|
`switch(config)# **router bgp 65001**
|
||||||
|
switch(config-router-bgp)# **address-family evpn**
|
||||||
|
switch(config-router-bgp-af)# **neighbor default encapsulation mpls next-hop-self source-interface Loopback0**`
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
### Route-Distinguisher
|
||||||
|
|
||||||
|
|
||||||
|
Route-Distinguisher (RD) uniquely identifies routes from a particular VRF.
|
||||||
|
Route-Distinguisher is configured for every VRF from which routes are exported from or
|
||||||
|
imported into.
|
||||||
|
|
||||||
|
|
||||||
|
The following commands are used to configure Route-Distinguisher for a VRF.
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
`switch(config-router-bgp)# **vrf vrf-services**
|
||||||
|
switch(config-router-bgp-vrf-vrf-services)# **rd 1.0.0.1:1**
|
||||||
|
|
||||||
|
switch(config-router-bgp)# **vrf vrf-blue**
|
||||||
|
switch(config-router-bgp-vrf-vrf-blue)# **rd 2.0.0.1:2**`
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
### Exporting Routes from a VRF
|
||||||
|
|
||||||
|
|
||||||
|
Use the **route-target export** command to export routes from a VRF to the
|
||||||
|
local VPN or EVPN table using the route target
|
||||||
|
extended community list.
|
||||||
|
|
||||||
|
|
||||||
|
**Examples**
|
||||||
|
|
||||||
|
- These commands export routes from
|
||||||
|
**vrf-red** to the local VPN
|
||||||
|
table.
|
||||||
|
```
|
||||||
|
`switch(config)# **service routing protocols model multi-agent**
|
||||||
|
switch(config)# **mpls ip**
|
||||||
|
switch(config)# **router bgp 65001**
|
||||||
|
switch(config-router-bgp)# **vrf vrf-red**
|
||||||
|
switch(config-router-bgp-vrf-vrf-red)# **rd 1:1**
|
||||||
|
switch(config-router-bgp-vrf-vrf-red)# **route-target export vpn-ipv4 10:10**
|
||||||
|
switch(config-router-bgp-vrf-vrf-red)# **route-target export vpn-ipv6 10:20**`
|
||||||
|
```
|
||||||
|
|
||||||
|
- These commands export routes from
|
||||||
|
**vrf-red** to the EVPN
|
||||||
|
table.
|
||||||
|
```
|
||||||
|
`switch(config)# **router bgp 65001**
|
||||||
|
switch(config-router-bgp)# **vrf vrf-red**
|
||||||
|
switch(config-router-bgp-vrf-vrf-red)# **rd 1:1**
|
||||||
|
switch(config-router-bgp-vrf-vrf-red)# **route-target export evpn 10:1**`
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
### Importing Routes into a VRF
|
||||||
|
|
||||||
|
|
||||||
|
Use the **route-target import** command to import the exported routes from
|
||||||
|
the local VPN or EVPN table to the target VRF
|
||||||
|
using the route target extended community
|
||||||
|
list.
|
||||||
|
|
||||||
|
|
||||||
|
**Examples**
|
||||||
|
|
||||||
|
- These commands import routes from the VPN
|
||||||
|
table to
|
||||||
|
**vrf-blue**.
|
||||||
|
```
|
||||||
|
`switch(config)# **service routing protocols model multi-agent**
|
||||||
|
switch(config)# **mpls ip**
|
||||||
|
switch(config)# **router bgp 65001**
|
||||||
|
switch(config-router-bgp)# **vrf vrf-blue**
|
||||||
|
switch(config-router-bgp-vrf-vrf-blue)# **rd 2:2**
|
||||||
|
switch(config-router-bgp-vrf-vrf-blue)# **route-target import vpn-ipv4 10:10**
|
||||||
|
switch(config-router-bgp-vrf-vrf-blue)# **route-target import vpn-ipv6 10:20**`
|
||||||
|
```
|
||||||
|
|
||||||
|
- These commands import routes from the EVPN
|
||||||
|
table to
|
||||||
|
**vrf-blue**.
|
||||||
|
```
|
||||||
|
`switch(config)# **router bgp 65001**
|
||||||
|
switch(config-router-bgp)# **vrf vrf-blue**
|
||||||
|
switch(config-router-bgp-vrf-vrf-blue)# **rd 2:2**
|
||||||
|
switch(config-router-bgp-vrf-vrf-blue)# **route-target import evpn 10:1**`
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
### Exporting and Importing Routes using Route
|
||||||
|
Map
|
||||||
|
|
||||||
|
|
||||||
|
To manage VRF route leaking, control the export and import prefixes with route-map export or
|
||||||
|
import commands. The route map is effective only if the VRF or the VPN
|
||||||
|
paths are already candidates for export or import. The route-target
|
||||||
|
export or import commandmust be configured first. Setting BGP
|
||||||
|
attributes using route maps is effective only on the export end.
|
||||||
|
|
||||||
|
|
||||||
|
Note: Prefixes that are leaked are not re-exported to the VPN table from the target VRF.
|
||||||
|
|
||||||
|
**Examples**

- These commands export routes from **vrf-red** to the local VPN table.

```
switch(config)# service routing protocols model multi-agent
switch(config)# mpls ip
switch(config)# router bgp 65001
switch(config-router-bgp)# vrf vrf-red
switch(config-router-bgp-vrf-vrf-red)# rd 1:1
switch(config-router-bgp-vrf-vrf-red)# route-target export vpn-ipv4 10:10
switch(config-router-bgp-vrf-vrf-red)# route-target export vpn-ipv6 10:20
switch(config-router-bgp-vrf-vrf-red)# route-target export vpn-ipv4 route-map EXPORT_V4_ROUTES_T0_VPN_TABLE
switch(config-router-bgp-vrf-vrf-red)# route-target export vpn-ipv6 route-map EXPORT_V6_ROUTES_T0_VPN_TABLE
```

- These commands export routes from **vrf-red** to the EVPN table.

```
switch(config)# router bgp 65001
switch(config-router-bgp)# vrf vrf-red
switch(config-router-bgp-vrf-vrf-red)# rd 1:1
switch(config-router-bgp-vrf-vrf-red)# route-target export evpn 10:1
switch(config-router-bgp-vrf-vrf-red)# route-target export evpn route-map EXPORT_ROUTES_T0_EVPN_TABLE
```

- These commands import routes from the VPN table to **vrf-blue**.

```
switch(config)# service routing protocols model multi-agent
switch(config)# mpls ip
switch(config)# router bgp 65001
switch(config-router-bgp)# vrf vrf-blue
switch(config-router-bgp-vrf-vrf-blue)# rd 1:1
switch(config-router-bgp-vrf-vrf-blue)# route-target import vpn-ipv4 10:10
switch(config-router-bgp-vrf-vrf-blue)# route-target import vpn-ipv6 10:20
switch(config-router-bgp-vrf-vrf-blue)# route-target import vpn-ipv4 route-map IMPORT_V4_ROUTES_VPN_TABLE
switch(config-router-bgp-vrf-vrf-blue)# route-target import vpn-ipv6 route-map IMPORT_V6_ROUTES_VPN_TABLE
```

- These commands import routes from the EVPN table to **vrf-blue**.

```
switch(config)# router bgp 65001
switch(config-router-bgp)# vrf vrf-blue
switch(config-router-bgp-vrf-vrf-blue)# rd 2:2
switch(config-router-bgp-vrf-vrf-blue)# route-target import evpn 10:1
switch(config-router-bgp-vrf-vrf-blue)# route-target import evpn route-map IMPORT_ROUTES_FROM_EVPN_TABLE
```

## Inter-VRF Local Route Leaking using VRF-leak Agent

Inter-VRF local route leaking allows routes to leak from one VRF to another using a route map as a VRF-leak agent. VRFs are leaked based on the preferences assigned to each VRF.

### Configuring Route Maps

To leak routes from one VRF to another using a route map, use the [router general](/um-eos/eos-evpn-and-vcs-commands#xx1351777) command to enter Router-General Configuration Mode, then enter the VRF submode for the destination VRF, and use the [leak routes](/um-eos/eos-evpn-and-vcs-commands#reference_g2h_2z3_hwb) command to specify the source VRF and the route map to be used. Routes in the source VRF that match the policy in the route map will then be considered for leaking into the configuration-mode VRF. If two or more policies specify leaking the same prefix to the same destination VRF, the route with a higher (post-set-clause) distance and preference is chosen.

**Example**

These commands configure a route map to leak routes from **VRF1** to **VRF2** using route map **RM1**.

```
switch(config)# router general
switch(config-router-general)# vrf VRF2
switch(config-router-general-vrf-VRF2)# leak routes source-vrf VRF1 subscribe-policy RM1
switch(config-router-general-vrf-VRF2)#
```
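
To confirm the leak took effect, check the destination VRF's routing table; leaked entries carry the `L - VRF Leaked` code. The prefix, next-hop, and interface below are placeholders for illustration, not captured output:

```
switch# show ip route vrf VRF2
...
 L        10.0.1.0/24 [1/0] via 10.0.0.1, Vlan100
```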

<!-- Source: https://www.arista.com/en/um-eos/eos-static-inter-vrf-route -->
<!-- Scraped: 2026-03-06T20:43:17.977Z -->

# Static Inter-VRF Route

The Static Inter-VRF Route feature adds support for static inter-VRF routes. This enables the configuration of routes to destinations in one ingress VRF with an ability to specify a next-hop in a different egress VRF through a static configuration.

You can configure static inter-VRF routes in default and non-default VRFs. A different egress VRF is achieved by “tagging” the **next-hop** or **forwarding via** with a reference to an egress VRF (different from the source VRF) in which that next-hop should be evaluated. Static inter-VRF routes with ECMP next-hop sets in the same egress VRF or heterogeneous egress VRFs can be specified.

The Static Inter-VRF Route feature is independent of and complementary to other mechanisms that can be used to set up local inter-VRF routes. The other supported mechanisms in EOS, and the broader use cases they support, are documented here:

- [Inter-VRF Local Route Leaking using BGP VPN](/um-eos/eos-inter-vrf-local-route-leaking#xx1348142)
- [Inter-VRF Local Route Leaking using VRF-leak Agent](/um-eos/eos-inter-vrf-local-route-leaking#xx1346287)

## Configuration

The configuration to set up static inter-VRF routes in an ingress (source) VRF to forward IP traffic to a different egress (target) VRF can be done in the following modes:

- This command creates a static route in one ingress VRF that points to a next-hop in a different egress VRF.

```
{ip | ipv6} route [vrf vrf-name] destination-prefix [egress-vrf egress-next-hop-vrf-name] next-hop
```

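As a concrete instance of this syntax, the route that produces the `1.0.7.0/24 ... (egress VRF vrf3)` entry in the show output below could be configured as follows (a sketch; the VRF names and addresses are taken from that example):

```
switch(config)# ip route vrf vrf1 1.0.7.0/24 egress-vrf vrf3 1.0.6.2
```
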
## Show Commands

Use the **show ip route vrf** command to display the egress VRF name if it differs from the source VRF.

**Example**

```
switch# show ip route vrf vrf1

VRF: vrf1
Codes: C - connected, S - static, K - kernel,
       O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
       E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
       R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
       O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
       NG - Nexthop Group Static Route, V - VXLAN Control Service,
       DH - DHCP client installed default route, M - Martian,
       DP - Dynamic Policy Route, L - VRF Leaked

Gateway of last resort is not set

 S        1.0.1.0/24 [1/0] via 1.0.0.2, Vlan2180 (egress VRF default)
 S        1.0.7.0/24 [1/0] via 1.0.6.2, Vlan2507 (egress VRF vrf3)
```

## Limitations

- For bidirectional traffic to work correctly between a pair of VRFs, static inter-VRF routes in both VRFs must be configured.
- Static inter-VRF routing is supported only in multi-agent routing protocol mode.

# Ashburn Validator Relay — Full Traffic Redirect

## Overview

All validator traffic (gossip, repair, TVU, TPU) enters and exits through `137.239.194.65` (laconic-was-sw01, Ashburn). Peers see the validator as an Ashburn node. This improves repair peer count and slot catch-up rate by reducing RTT to the TeraSwitch/Pittsburgh cluster from ~30ms (direct Miami) to ~5ms (Ashburn).

Supersedes the previous TVU-only shred relay (see `tvu-shred-relay.md`).

## Architecture

```
OUTBOUND (validator → peers)
agave-validator (kind pod, ports 8001, 9000-9025)
  ↓ Docker bridge → host FORWARD chain
biscayne host (186.233.184.235)
  ↓ mangle PREROUTING: fwmark 100 on sport 8001,9000-9025 from 172.20.0.0/16
  ↓ nat POSTROUTING: SNAT → src 137.239.194.65
  ↓ policy route: fwmark 100 → table ashburn → via 169.254.7.6 dev doublezero0
laconic-mia-sw01 (209.42.167.133, Miami)
  ↓ traffic-policy VALIDATOR-OUTBOUND: src 137.239.194.65 → nexthop 172.16.1.188
  ↓ backbone Et4/1 (25.4ms)
laconic-was-sw01 Et4/1 (Ashburn)
  ↓ default route via 64.92.84.80 out Et1/1
Internet (peers see src 137.239.194.65)

INBOUND (peers → validator)
Solana peers → 137.239.194.65:8001,9000-9025
  ↓ internet routing to was-sw01
laconic-was-sw01 Et1/1 (Ashburn)
  ↓ traffic-policy VALIDATOR-RELAY: ASIC redirect, line rate
  ↓ nexthop 172.16.1.189 via Et4/1 backbone (25.4ms)
laconic-mia-sw01 Et4/1 (Miami)
  ↓ L3 forward → biscayne via doublezero0 GRE or ISP routing
biscayne (186.233.184.235)
  ↓ nat PREROUTING: DNAT dst 137.239.194.65:* → 172.20.0.2:* (kind node)
  ↓ Docker bridge → validator pod
agave-validator
```

RPC traffic (port 8899) is NOT relayed — clients connect directly to biscayne.

## Switch Config: laconic-was-sw01

SSH: `install@137.239.200.198`

### Pre-change

```
configure checkpoint save pre-validator-relay
```

Rollback: `rollback running-config checkpoint pre-validator-relay` then `write memory`.

### Config session with auto-revert

```
configure session validator-relay

! Loopback for 137.239.194.65 (do NOT touch Loopback100 which has .64)
interface Loopback101
   ip address 137.239.194.65/32

! ACL covering all validator ports
ip access-list VALIDATOR-RELAY-ACL
   10 permit udp any any eq 8001
   20 permit udp any any range 9000 9025
   30 permit tcp any any eq 8001

! Traffic-policy: ASIC redirect to backbone (mia-sw01)
traffic-policy VALIDATOR-RELAY
   match VALIDATOR-RELAY-ACL
   set nexthop 172.16.1.189

! Replace old SHRED-RELAY on Et1/1
interface Ethernet1/1
   no traffic-policy input SHRED-RELAY
   traffic-policy input VALIDATOR-RELAY

! system-rule overriding-action redirect (already present from SHRED-RELAY)

show session-config diffs
commit timer 00:05:00
```

After verification: `configure session validator-relay commit` then `write memory`.

### Cleanup (after stable)

Old SHRED-RELAY policy and ACL can be removed once VALIDATOR-RELAY is confirmed:

```
configure session cleanup-shred-relay
no traffic-policy SHRED-RELAY
no ip access-list SHRED-RELAY-ACL
show session-config diffs
commit
write memory
```

## Switch Config: laconic-mia-sw01

### Pre-flight checks

Before applying config, verify:

1. Which EOS interface terminates the doublezero0 GRE from biscayne (endpoint 209.42.167.133). Check with `show interfaces tunnel` or `show ip interface brief | include Tunnel`.

2. Whether `system-rule overriding-action redirect` is already configured. Check with `show running-config | include system-rule`.

3. Whether EOS traffic-policy works on tunnel interfaces. If not, apply on the physical interface where GRE packets arrive (likely Et<X> facing biscayne's ISP network or the DZ infrastructure).

### Config session

```
configure checkpoint save pre-validator-outbound

configure session validator-outbound

! ACL matching outbound validator traffic (source = Ashburn IP)
ip access-list VALIDATOR-OUTBOUND-ACL
   10 permit ip 137.239.194.65/32 any

! Redirect to was-sw01 via backbone
traffic-policy VALIDATOR-OUTBOUND
   match VALIDATOR-OUTBOUND-ACL
   set nexthop 172.16.1.188

! Apply on the interface where biscayne GRE traffic arrives
! Replace Tunnel<X> with the actual interface from pre-flight check #1
interface Tunnel<X>
   traffic-policy input VALIDATOR-OUTBOUND

! Add system-rule if not already present (pre-flight check #2)
system-rule overriding-action redirect

show session-config diffs
commit timer 00:05:00
```

After verification: commit + `write memory`.

## Host Config: biscayne

Automated via ansible playbook `playbooks/ashburn-validator-relay.yml`.

### Manual equivalent

```bash
# 1. Accept packets destined for 137.239.194.65
sudo ip addr add 137.239.194.65/32 dev lo

# 2. Inbound DNAT to kind node (172.20.0.2)
#    INSERT at the top of PREROUTING — Docker's ADDRTYPE LOCAL rule
#    otherwise matches first and swallows these packets
sudo iptables -t nat -I PREROUTING 1 -p udp -d 137.239.194.65 --dport 8001 \
  -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -I PREROUTING 2 -p tcp -d 137.239.194.65 --dport 8001 \
  -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -I PREROUTING 3 -p udp -d 137.239.194.65 --dport 9000:9025 \
  -j DNAT --to-destination 172.20.0.2

# 3. Outbound: mark validator traffic
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p udp --sport 8001 \
  -j MARK --set-mark 100
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p udp --sport 9000:9025 \
  -j MARK --set-mark 100
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p tcp --sport 8001 \
  -j MARK --set-mark 100

# 4. Outbound: SNAT to Ashburn IP (INSERT before Docker MASQUERADE)
sudo iptables -t nat -I POSTROUTING 1 -m mark --mark 100 \
  -j SNAT --to-source 137.239.194.65

# 5. Policy routing table
echo "100 ashburn" | sudo tee -a /etc/iproute2/rt_tables
sudo ip rule add fwmark 100 table ashburn
sudo ip route add default via 169.254.7.6 dev doublezero0 table ashburn

# 6. Persist
sudo netfilter-persistent save
# ip rule + ip route persist via /etc/network/if-up.d/ashburn-routing
```
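
Because rule ordering is what the Docker interaction hinges on, the check can be scripted. This is a sketch: the sample ruleset is hard-coded for illustration, and in practice you would feed it from `iptables-save -t nat`:

```shell
# Verify the relay DNAT rules sit ABOVE Docker's ADDRTYPE LOCAL jump in
# nat PREROUTING; if Docker's rule comes first it swallows packets for
# 137.239.194.65 before the DNAT can match.
rules='-A PREROUTING -d 137.239.194.65/32 -p udp --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER'

dnat_line=$(printf '%s\n' "$rules" | grep -n 'to-destination 172.20.0.2' | head -n1 | cut -d: -f1)
docker_line=$(printf '%s\n' "$rules" | grep -n 'addrtype' | head -n1 | cut -d: -f1)

if [ "$dnat_line" -lt "$docker_line" ]; then
  echo "OK: DNAT precedes Docker's ADDRTYPE LOCAL rule"
else
  echo "BAD: Docker will swallow relay traffic"
fi
```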

### Docker NAT port preservation

**Must verify before going live:** Docker masquerade must preserve source ports for kind's hostNetwork pods. If Docker rewrites the source port, the mangle PREROUTING match on `--sport 8001,9000-9025` will miss traffic.

Test: `tcpdump -i br-cf46a62ab5b2 -nn 'udp src port 8001'` — if you see packets with sport 8001 from 172.20.0.2, port preservation works.

If Docker does NOT preserve ports, the mark must be set inside the kind node container (on the pod's veth) rather than on the host.
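
The tcpdump check reduces to a string test on each captured line. A sketch (the sample line is fabricated; in practice pipe real tcpdump output through the same test):

```shell
# Decide from one tcpdump line whether the pod's source port survived
# Docker NAT: "172.20.0.2.8001 >" means src IP 172.20.0.2, src port 8001.
line="12:00:01.000000 IP 172.20.0.2.8001 > 203.0.113.7.8000: UDP, length 132"
case "$line" in
  *"172.20.0.2.8001 >"*) verdict="port preserved" ;;
  *)                     verdict="port rewritten: mark inside the kind node instead" ;;
esac
echo "$verdict"
```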

## Execution Order

1. **was-sw01**: checkpoint → config session with 5min auto-revert → verify counters → commit
2. **biscayne**: add 137.239.194.65/32 to lo, add inbound DNAT rules
3. **Verify inbound**: `ping 137.239.194.65` from external host, check DNAT counters
4. **mia-sw01**: pre-flight checks → config session with 5min auto-revert → commit
5. **biscayne**: add outbound fwmark + policy routing + SNAT rules
6. **Test outbound**: from biscayne, send UDP from port 8001, verify src 137.239.194.65 on was-sw01
7. **Verify**: traffic-policy counters on both switches, iptables hit counts on biscayne
8. **Restart validator** if needed (gossip should auto-refresh, but restart ensures clean state)
9. **was-sw01 + mia-sw01**: `write memory` to persist
10. **Cleanup**: remove old SHRED-RELAY and 64.92.84.81:20000 DNAT after stable

## Verification

1. `show traffic-policy counters` on was-sw01 — VALIDATOR-RELAY-ACL matches
2. `show traffic-policy counters` on mia-sw01 — VALIDATOR-OUTBOUND-ACL matches
3. `sudo iptables -t nat -L -v -n` on biscayne — DNAT and SNAT hit counts
4. `sudo iptables -t mangle -L -v -n` on biscayne — fwmark hit counts
5. `ip rule show` on biscayne — fwmark 100 lookup ashburn
6. Validator gossip ContactInfo shows 137.239.194.65 for ALL addresses (gossip, repair, TVU, TPU)
7. Repair peer count increases (target: 20+ peers)
8. Slot catch-up rate improves from ~0.9 toward ~2.5 slots/sec
9. `traceroute --sport=8001 <remote_peer>` from biscayne routes via doublezero0/was-sw01

## Rollback

### biscayne

```bash
sudo ip addr del 137.239.194.65/32 dev lo
sudo iptables -t nat -D PREROUTING -p udp -d 137.239.194.65 --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -D PREROUTING -p tcp -d 137.239.194.65 --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -D PREROUTING -p udp -d 137.239.194.65 --dport 9000:9025 -j DNAT --to-destination 172.20.0.2
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p udp --sport 8001 -j MARK --set-mark 100
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p udp --sport 9000:9025 -j MARK --set-mark 100
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p tcp --sport 8001 -j MARK --set-mark 100
sudo iptables -t nat -D POSTROUTING -m mark --mark 100 -j SNAT --to-source 137.239.194.65
sudo ip rule del fwmark 100 table ashburn
sudo ip route del default table ashburn
sudo netfilter-persistent save
```

### was-sw01

```
rollback running-config checkpoint pre-validator-relay
write memory
```

### mia-sw01

```
rollback running-config checkpoint pre-validator-outbound
write memory
```

## Key Details

| Item | Value |
|------|-------|
| Ashburn relay IP | `137.239.194.65` (Loopback101 on was-sw01) |
| Ashburn LAN block | `137.239.194.64/29` on was-sw01 Et1/1 |
| Biscayne IP | `186.233.184.235` |
| Kind node IP | `172.20.0.2` (Docker bridge br-cf46a62ab5b2) |
| Validator ports | 8001 (gossip), 9000-9025 (TVU/repair/TPU) |
| Excluded ports | 8899 (RPC), 8900 (WebSocket) — direct to biscayne |
| GRE tunnel | doublezero0: 169.254.7.7 ↔ 169.254.7.6, remote 209.42.167.133 |
| Backbone | was-sw01 Et4/1 172.16.1.188/31 ↔ mia-sw01 Et4/1 172.16.1.189/31 |
| Policy routing table | 100 ashburn |
| Fwmark | 100 |
| was-sw01 SSH | `install@137.239.200.198` |
| EOS version | 4.34.0F |

# Blue-Green Upgrades for Biscayne

Zero-downtime upgrade procedures for the agave-stack deployment on biscayne. Uses ZFS clones for instant data duplication, Caddy health-check routing for traffic shifting, and k8s native sidecars for independent container upgrades.

## Architecture

```
Caddy ingress (biscayne.vaasl.io)
 ├── upstream A: localhost:8899  ← health: /health
 └── upstream B: localhost:8897  ← health: /health
                  │
┌─────────────────┴──────────────────┐
│            kind cluster            │
│                                    │
│  Deployment A       Deployment B   │
│  ┌─────────────┐   ┌─────────────┐ │
│  │ agave :8899 │   │ agave :8897 │ │
│  │ doublezerod │   │ doublezerod │ │
│  └──────┬──────┘   └──────┬──────┘ │
└─────────┼─────────────────┼────────┘
          │                 │
   ZFS dataset A       ZFS clone B
   (original)        (instant CoW copy)
```

Both deployments run in the same kind cluster with `hostNetwork: true`. Caddy active health checks route traffic to whichever deployment has a healthy `/health` endpoint.

## Storage Layout

| Data | Path | Type | Survives restart? |
|------|------|------|-------------------|
| Ledger | `/srv/solana/ledger` | ZFS zvol (xfs) | Yes |
| Snapshots | `/srv/solana/snapshots` | ZFS zvol (xfs) | Yes |
| Accounts | `/srv/solana/ramdisk/accounts` | `/dev/ram0` (xfs) | Until host reboot |
| Validator config | `/srv/deployments/agave/data/validator-config` | ZFS | Yes |
| DZ config | `/srv/deployments/agave/data/doublezero-config` | ZFS | Yes |

The ZFS zvol `biscayne/DATA/volumes/solana` backs `/srv/solana` (ledger, snapshots). The ramdisk at `/dev/ram0` holds accounts — it's a block device, not tmpfs, so it survives process restarts but not host reboots.

---

## Procedure 1: DoubleZero Binary Upgrade (zero downtime, single pod)

The GRE tunnel (`doublezero0`) and BGP routes live in kernel space. They persist across doublezerod process restarts. Upgrading the DZ binary does not require tearing down the tunnel or restarting the validator.

### Prerequisites

- doublezerod is defined as a k8s native sidecar (`spec.initContainers` with `restartPolicy: Always`). See [Required Changes](#required-changes) below.
- k8s 1.29+ (biscayne runs 1.35.1)

### Steps

1. Build or pull the new doublezero container image.

2. Patch the pod's sidecar image:
   ```bash
   kubectl -n <ns> patch pod <pod> --type='json' -p='[
     {"op": "replace", "path": "/spec/initContainers/0/image",
      "value": "laconicnetwork/doublezero:new-version"}
   ]'
   ```

3. Only the doublezerod container restarts. The agave container is unaffected. The GRE tunnel interface and BGP routes remain in the kernel throughout.

4. Verify:
   ```bash
   kubectl -n <ns> exec <pod> -c doublezerod -- doublezero --version
   kubectl -n <ns> exec <pod> -c doublezerod -- doublezero status
   ip route | grep doublezero0   # routes still present
   ```

### Rollback

Patch the image back to the previous version. Same process, same zero downtime.

---

## Procedure 2: Agave Version Upgrade (zero RPC downtime, blue-green)

Agave is the main container and must be restarted for a version change. To maintain zero RPC downtime, we run two deployments simultaneously and let Caddy shift traffic based on health checks.

### Prerequisites

- Caddy ingress configured with dual upstreams and active health checks
- A parameterized spec.yml that accepts alternate ports and volume paths
- ZFS snapshot/clone scripts

### Steps

#### Phase 1: Prepare (no downtime, no risk)

1. **ZFS snapshot** for rollback safety:
   ```bash
   zfs snapshot -r biscayne/DATA@pre-upgrade-$(date +%Y%m%d)
   ```

2. **ZFS clone** the validator volumes:
   ```bash
   zfs clone biscayne/DATA/volumes/solana@pre-upgrade-$(date +%Y%m%d) \
     biscayne/DATA/volumes/solana-blue
   ```
   This is instant (copy-on-write). No additional storage is used until writes diverge.

3. **Clone the ramdisk accounts** (not on ZFS):
   ```bash
   mkdir -p /srv/solana-blue/ramdisk/accounts
   cp -a /srv/solana/ramdisk/accounts/* /srv/solana-blue/ramdisk/accounts/
   ```
   This is the slow step — 460GB on ramdisk. Consider `rsync` with `--inplace` to minimize copy time, or investigate whether the ramdisk can move to a ZFS dataset for instant cloning in future deployments.

4. **Build or pull** the new agave container image.

#### Phase 2: Start blue deployment (no downtime)

5. **Create Deployment B** in the same kind cluster, pointing at cloned volumes, with RPC on port 8897:
   ```bash
   # Apply the blue deployment manifest (parameterized spec)
   kubectl apply -f deployment/k8s-manifests/agave-blue.yaml
   ```

6. **Deployment B catches up.** It starts from the snapshot point and replays. Monitor progress:
   ```bash
   kubectl -n <ns> exec <blue-pod> -c agave-validator -- \
     solana -u http://127.0.0.1:8897 slot
   ```

7. **Validate** the new version works:
   - RPC responds: `curl -sf http://localhost:8897/health`
   - Correct version: `kubectl -n <ns> exec <blue-pod> -c agave-validator -- agave-validator --version`
   - doublezerod connected (if applicable)

   Take as long as needed. Deployment A is still serving all traffic.

#### Phase 3: Traffic shift (zero downtime)

8. **Caddy routes traffic to B.** Once B's `/health` returns 200, Caddy's active health check automatically starts routing to it. Alternatively, update the Caddy upstream config to prefer B.

9. **Verify** B is serving live traffic:
   ```bash
   curl -sf https://biscayne.vaasl.io/health
   # Check Caddy access logs for requests hitting port 8897
   ```

#### Phase 4: Cleanup

10. **Stop Deployment A:**
    ```bash
    kubectl -n <ns> delete deployment agave-green
    ```

11. **Reconfigure B to use the standard port** (8899) if desired, or update Caddy to route only to 8897.

12. **Clean up the ZFS clone** (or keep it as a rollback):
    ```bash
    zfs destroy biscayne/DATA/volumes/solana-blue
    ```

### Rollback

At any point before Phase 4:
- Deployment A is untouched and still serving traffic (or can be restarted)
- Delete Deployment B: `kubectl -n <ns> delete deployment agave-blue`
- Destroy the ZFS clone: `zfs destroy biscayne/DATA/volumes/solana-blue`

After Phase 4 (A already stopped):
- `zfs rollback` to restore original data
- Redeploy A with old image

---

## Required Changes to agave-stack

### 1. Move doublezerod to native sidecar

In the pod spec generation (laconic-so or compose override), doublezerod must be defined as a native sidecar container instead of a regular container:

```yaml
spec:
  initContainers:
    - name: doublezerod
      image: laconicnetwork/doublezero:local
      restartPolicy: Always   # makes it a native sidecar
      securityContext:
        privileged: true
        capabilities:
          add: [NET_ADMIN]
      env:
        - name: DOUBLEZERO_RPC_ENDPOINT
          value: https://api.mainnet-beta.solana.com
      volumeMounts:
        - name: doublezero-config
          mountPath: /root/.config/doublezero
  containers:
    - name: agave-validator
      image: laconicnetwork/agave:local
      # ... existing config
```

This change means:
- doublezerod starts before agave and stays running
- Patching the doublezerod image restarts only that container
- agave can be restarted independently without affecting doublezerod

This requires a laconic-so change to support `initContainers` with `restartPolicy` in compose-to-k8s translation — or a post-deployment patch.

### 2. Caddy dual-upstream config

Add health-checked upstreams for both blue and green deployments:

```caddyfile
biscayne.vaasl.io {
    reverse_proxy {
        to localhost:8899 localhost:8897

        health_uri /health
        health_interval 5s
        health_timeout 3s

        lb_policy first
    }
}
```

`lb_policy first` routes to the first healthy upstream. When only A is running, all traffic goes to :8899. When B comes up healthy, traffic shifts.
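
The `first` policy is simple enough to model in a few lines. This toy sketch (the function and argument format are invented here) mirrors how traffic shifts as health states change:

```shell
# Toy model of Caddy's lb_policy "first": route to the first upstream
# whose health state is "up". Arguments look like addr=up / addr=down.
pick_upstream() {
  for pair in "$@"; do
    addr=${pair%%=*}
    state=${pair##*=}
    [ "$state" = "up" ] && { echo "$addr"; return 0; }
  done
  echo "none"
  return 1
}

pick_upstream "localhost:8899=up"   "localhost:8897=up"   # A healthy → 8899
pick_upstream "localhost:8899=down" "localhost:8897=up"   # A drained → 8897
```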

### 3. Parameterized deployment spec

Create a parameterized spec or kustomize overlay that accepts:
- RPC port (8899 vs 8897)
- Volume paths (original vs ZFS clone)
- Deployment name suffix (green vs blue)
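
One lightweight way to satisfy this is plain substitution over a template. The placeholder names (`{{SUFFIX}}` etc.) below are invented for illustration, not the repo's actual spec format:

```shell
# Render a blue/green variant by substituting the three parameters
# into a template read from stdin. Placeholder names are illustrative.
render() {
  suffix=$1 port=$2 volume=$3
  sed -e "s/{{SUFFIX}}/$suffix/g" \
      -e "s/{{RPC_PORT}}/$port/g" \
      -e "s/{{VOLUME}}/$volume/g"
}

printf 'name: agave-{{SUFFIX}}\nrpc_port: {{RPC_PORT}}\nvolume: {{VOLUME}}\n' \
  | render blue 8897 solana-blue
```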

### 4. Delete DaemonSet workaround

Remove `deployment/k8s-manifests/doublezero-daemonset.yaml` from agave-stack.

### 5. Fix container DZ identity

Copy the registered identity into the container volume:
```bash
sudo cp /home/solana/.config/doublezero/id.json \
  /srv/deployments/agave/data/doublezero-config/id.json
```

### 6. Disable host systemd doublezerod

After the container sidecar is working:
```bash
sudo systemctl stop doublezerod
sudo systemctl disable doublezerod
```

---

## Implementation Order

This is a spec-driven, test-driven plan. Each step produces a testable artifact.

### Step 1: Fix existing DZ bugs (no code changes to laconic-so)

Fixes BUG-1 through BUG-5 from [doublezero-status.md](doublezero-status.md).

**Spec:** Container doublezerod shows the correct identity, connects to laconic-mia-sw01, and host systemd doublezerod is disabled.

**Test:**
```bash
kubectl -n <ns> exec <pod> -c doublezerod -- doublezero address
# assert: 3Bw6v7EruQvTwoY79h2QjQCs2KBQFzSneBdYUbcXK1Tr

kubectl -n <ns> exec <pod> -c doublezerod -- doublezero status
# assert: BGP Session Up, laconic-mia-sw01

systemctl is-active doublezerod
# assert: inactive
```
|
||||||
|
|
||||||
|
**Changes:**
|
||||||
|
- Copy `id.json` to container volume
|
||||||
|
- Update `DOUBLEZERO_RPC_ENDPOINT` in spec.yml
|
||||||
|
- Deploy with hostNetwork-enabled stack-orchestrator
|
||||||
|
- Stop and disable host doublezerod
|
||||||
|
- Delete DaemonSet manifest from agave-stack
|
||||||
|
|
||||||
|
### Step 2: Native sidecar for doublezerod
|
||||||
|
|
||||||
|
**Spec:** doublezerod image can be patched without restarting the agave container.
|
||||||
|
GRE tunnel and routes persist across doublezerod restart.
|
||||||
|
|
||||||
|
**Test:**
|
||||||
|
```bash
|
||||||
|
# Record current agave container start time
|
||||||
|
BEFORE=$(kubectl -n <ns> get pod <pod> -o jsonpath='{.status.containerStatuses[?(@.name=="agave-validator")].state.running.startedAt}')
|
||||||
|
|
||||||
|
# Patch DZ image
|
||||||
|
kubectl -n <ns> patch pod <pod> --type='json' -p='[
|
||||||
|
{"op":"replace","path":"/spec/initContainers/0/image","value":"laconicnetwork/doublezero:test"}
|
||||||
|
]'
|
||||||
|
|
||||||
|
# Wait for DZ container to restart
|
||||||
|
sleep 10
|
||||||
|
|
||||||
|
# Verify agave was NOT restarted
|
||||||
|
AFTER=$(kubectl -n <ns> get pod <pod> -o jsonpath='{.status.containerStatuses[?(@.name=="agave-validator")].state.running.startedAt}')
|
||||||
|
[ "$BEFORE" = "$AFTER" ] # assert: same start time
|
||||||
|
|
||||||
|
# Verify tunnel survived
|
||||||
|
ip route | grep doublezero0 # assert: routes present
|
||||||
|
```
|
||||||
|
|
||||||
|
**Changes:**
|
||||||
|
- laconic-so: support `initContainers` with `restartPolicy: Always` in
|
||||||
|
compose-to-k8s translation (or: define doublezerod as native sidecar in
|
||||||
|
compose via `x-kubernetes-init-container` extension or equivalent)
|
||||||
|
- Alternatively: post-deploy kubectl patch to move doublezerod to initContainers
|
||||||
|
|
||||||
|
### Step 3: Caddy dual-upstream routing
|
||||||
|
|
||||||
|
**Spec:** Caddy routes RPC traffic to whichever backend is healthy. Adding a second
|
||||||
|
healthy backend on :8897 causes traffic to shift without configuration changes.
|
||||||
|
|
||||||
|
**Test:**
|
||||||
|
```bash
|
||||||
|
# Start a test HTTP server on :8897 with /health
|
||||||
|
python3 -c "
|
||||||
|
from http.server import HTTPServer, BaseHTTPRequestHandler
|
||||||
|
class H(BaseHTTPRequestHandler):
|
||||||
|
def do_GET(self):
|
||||||
|
self.send_response(200); self.end_headers(); self.wfile.write(b'ok')
|
||||||
|
HTTPServer(('', 8897), H).serve_forever()
|
||||||
|
" &
|
||||||
|
|
||||||
|
# Verify Caddy discovers it
|
||||||
|
sleep 10
|
||||||
|
curl -sf https://biscayne.vaasl.io/health
|
||||||
|
# assert: 200
|
||||||
|
|
||||||
|
kill %1
|
||||||
|
```
|
||||||
|
|
||||||
|
**Changes:**
|
||||||
|
- Update Caddy ingress config with dual upstreams and health checks
|
||||||
|
|
||||||
|
### Step 4: ZFS clone and blue-green tooling
|
||||||
|
|
||||||
|
**Spec:** A script creates a ZFS clone, starts a blue deployment on alternate ports
|
||||||
|
using the cloned data, and the deployment catches up and becomes healthy.
|
||||||
|
|
||||||
|
**Test:**
|
||||||
|
```bash
|
||||||
|
# Run the clone + deploy script
|
||||||
|
./scripts/blue-green-prepare.sh --target-version v2.2.1
|
||||||
|
|
||||||
|
# assert: ZFS clone exists
|
||||||
|
zfs list biscayne/DATA/volumes/solana-blue
|
||||||
|
|
||||||
|
# assert: blue deployment exists and is catching up
|
||||||
|
kubectl -n <ns> get deployment agave-blue
|
||||||
|
|
||||||
|
# assert: blue RPC eventually becomes healthy
|
||||||
|
timeout 600 bash -c 'until curl -sf http://localhost:8897/health; do sleep 5; done'
|
||||||
|
```
|
||||||
|
|
||||||
|
**Changes:**
|
||||||
|
- `scripts/blue-green-prepare.sh` — ZFS snapshot, clone, deploy B
|
||||||
|
- `scripts/blue-green-promote.sh` — tear down A, optional port swap
|
||||||
|
- `scripts/blue-green-rollback.sh` — destroy B, restore A
|
||||||
|
- Parameterized deployment spec (kustomize overlay or env-driven)
|
||||||
|
|
||||||
|
### Step 5: End-to-end upgrade test
|
||||||
|
|
||||||
|
**Spec:** Full upgrade cycle completes with zero dropped RPC requests.
|
||||||
|
|
||||||
|
**Test:**
|
||||||
|
```bash
|
||||||
|
# Start continuous health probe in background
|
||||||
|
while true; do
|
||||||
|
curl -sf -o /dev/null -w "%{http_code} %{time_total}\n" \
|
||||||
|
https://biscayne.vaasl.io/health || echo "FAIL $(date)"
|
||||||
|
sleep 0.5
|
||||||
|
done > /tmp/health-probe.log &
|
||||||
|
|
||||||
|
# Execute full blue-green upgrade
|
||||||
|
./scripts/blue-green-prepare.sh --target-version v2.2.1
|
||||||
|
# wait for blue to sync...
|
||||||
|
./scripts/blue-green-promote.sh
|
||||||
|
|
||||||
|
# Stop probe
|
||||||
|
kill %1
|
||||||
|
|
||||||
|
# assert: no FAIL lines in probe log
|
||||||
|
grep -c FAIL /tmp/health-probe.log
|
||||||
|
# assert: 0
|
||||||
|
```
|
||||||
|
|

@@ -0,0 +1,85 @@
# Bug: Ashburn Relay — 137.239.194.65 Not Routable from Public Internet

## Summary

`--gossip-host 137.239.194.65` correctly advertises the Ashburn relay IP in
ContactInfo for all sockets (gossip, TVU, repair, TPU). However, 137.239.194.65
is a DoubleZero overlay IP (137.239.192.0/19, IS-IS only) that is NOT announced
via BGP to the public internet. Public peers cannot route to it, so TVU shreds,
repair requests, and TPU traffic never arrive at was-sw01.

## Evidence

- Gossip traffic arrives on the `doublezero0` interface:
  ```
  doublezero0 In IP 64.130.58.70.8001 > 137.239.194.65.8001: UDP, length 132
  ```
- Zero TVU/repair traffic arrives:
  ```
  tcpdump -i doublezero0 'dst host 137.239.194.65 and udp and not port 8001'
  0 packets captured
  ```
- ContactInfo correctly advertises all sockets on 137.239.194.65:
  ```json
  {
    "gossip": "137.239.194.65:8001",
    "tvu": "137.239.194.65:9000",
    "serveRepair": "137.239.194.65:9011",
    "tpu": "137.239.194.65:9002"
  }
  ```
- Outbound gossip from biscayne exits via `doublezero0` with source
  137.239.194.65 — SNAT and routing work correctly in the outbound direction.

## Root Cause

**137.239.194.0/24 is not routable from the public internet.** The prefix
belongs to DoubleZero's overlay address space (137.239.192.0/19, Momentum
Telecom, WHOIS OriginAS: empty). It is advertised only via IS-IS within the
DoubleZero switch mesh. There is no eBGP session on was-sw01 to advertise it
to the ISP — all BGP peers are iBGP AS 65342 (DoubleZero internal).

When the validator advertises `tvu: 137.239.194.65:9000` in ContactInfo,
public internet peers attempt to send turbine shreds to that IP, but the
packets have no route through the global BGP table to reach was-sw01. Only
DoubleZero-connected peers could potentially reach it via the overlay.

The old shred relay pipeline worked because it used `--public-tvu-address
64.92.84.81:20000` — was-sw01's Et1/1 ISP uplink IP, which IS publicly
routable. The `--gossip-host 137.239.194.65` approach advertises a
DoubleZero-only IP for ALL sockets, making TVU/repair/TPU unreachable from
non-DoubleZero peers.

The original hypothesis (ACL/PBR port filtering) was wrong. The tunnel and
switch routing work correctly — the problem is upstream: traffic never arrives
at was-sw01 in the first place.

## Impact

The validator cannot receive turbine shreds or serve repair requests via the
low-latency Ashburn path. It falls back to the Miami public IP (186.233.184.235)
for all shred/repair traffic, negating the benefit of `--gossip-host`.

## Fix Options

1. **Use 64.92.84.81 (was-sw01 Et1/1) for ContactInfo sockets.** This is the
   publicly routable Ashburn IP. Requires `--gossip-host 64.92.84.81` (or
   equivalent `--bind-address` config) and DNAT/forwarding on was-sw01 to relay
   traffic through the backbone → mia-sw01 → Tunnel500 → biscayne. The old
   `--public-tvu-address` pipeline used this IP successfully.

2. **Get DoubleZero to announce 137.239.194.0/24 via eBGP to the ISP.** This
   would make the current `--gossip-host 137.239.194.65` setup work, but
   requires coordination with DoubleZero operations.

3. **Hybrid approach**: Use 64.92.84.81 for public-facing sockets (TVU, repair,
   TPU) and 137.239.194.65 for gossip (which works via the DoubleZero overlay).
   Requires agave to support per-protocol address binding, which it does not
   (`--gossip-host` sets ALL sockets to the same IP).

## Previous Workaround

The old `--public-tvu-address` pipeline used socat + shred-unwrap.py to relay
shreds from 64.92.84.81:20000 to the validator. That pipeline is not persistent
across reboots and was superseded by the `--gossip-host` approach (which turned
out to be broken for non-DoubleZero peers).

@@ -0,0 +1,51 @@
# Bug: laconic-so etcd cleanup wipes core kubernetes service

## Summary

`_clean_etcd_keeping_certs()` in laconic-stack-orchestrator 1.1.0 deletes the `kubernetes` service from etcd, breaking cluster networking on restart.

## Component

`stack_orchestrator/deploy/k8s/helpers.py` — `_clean_etcd_keeping_certs()`

## Reproduction

1. Deploy with `laconic-so` to a k8s-kind target with persisted etcd (hostPath mount in kind-config.yml)
2. `laconic-so deployment --dir <dir> stop` (destroys the cluster)
3. `laconic-so deployment --dir <dir> start` (recreates the cluster with the cleaned etcd)

## Symptoms

- `kindnet` pods enter CrashLoopBackOff with: `panic: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined`
- `kubectl get svc kubernetes -n default` returns `NotFound`
- coredns, caddy, local-path-provisioner stuck in Pending (no CNI without kindnet)
- No pods can be scheduled

## Root Cause

`_clean_etcd_keeping_certs()` uses a whitelist that only preserves `/registry/secrets/caddy-system` keys. All other etcd keys are deleted, including `/registry/services/specs/default/kubernetes` — the core `kubernetes` ClusterIP service that kube-apiserver auto-creates.

When the kind cluster starts with the cleaned etcd, kube-apiserver sees the existing etcd data and does not re-create the `kubernetes` service. kindnet depends on the `KUBERNETES_SERVICE_HOST` environment variable, which the kubelet injects from this service — without it, kindnet panics.

## Fix Options

1. **Expand the whitelist** to include `/registry/services/specs/default/kubernetes` and other core cluster resources
2. **Fully wipe etcd** instead of selective cleanup — let the cluster bootstrap fresh (simpler, but loses the Caddy TLS certs)
3. **Don't persist etcd at all** — ephemeral etcd means clean state on every restart (recommended for kind deployments)
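
Fix option 1 reduces to a prefix-whitelist predicate over etcd keys. A minimal sketch, not laconic-so's actual code — `PRESERVE_PREFIXES` and `should_preserve` are hypothetical names, and the endpoints entry is an assumption about what else core networking needs:

```python
# Sketch of fix option 1: a prefix whitelist that keeps core cluster state
# alongside the Caddy certs. Key paths follow the bug description above.
PRESERVE_PREFIXES = (
    "/registry/secrets/caddy-system",                    # existing entry: TLS certs
    "/registry/services/specs/default/kubernetes",       # core ClusterIP service
    "/registry/services/endpoints/default/kubernetes",   # assumed: its endpoints object
)

def should_preserve(etcd_key: str) -> bool:
    """Return True if this etcd key must survive the selective cleanup."""
    return etcd_key.startswith(PRESERVE_PREFIXES)

keys = [
    "/registry/secrets/caddy-system/tls-cert",
    "/registry/services/specs/default/kubernetes",
    "/registry/pods/default/agave-0",
]
kept = [k for k in keys if should_preserve(k)]  # the pod key is still deleted
```

The cleanup loop would then delete only keys for which `should_preserve` returns `False`, leaving the `kubernetes` service in place so kindnet gets its `KUBERNETES_SERVICE_HOST` injection on restart.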

## Workaround

Fully delete the kind cluster before `start`:

```bash
kind delete cluster --name <cluster-name>
laconic-so deployment --dir <dir> start
```

This forces a fresh etcd bootstrap. Downside: all other services deployed to the cluster (DaemonSets, other namespaces) are destroyed.

## Impact

- Affects any k8s-kind deployment with persisted etcd
- Cluster is unrecoverable without full destroy+recreate
- All non-laconic-so-managed workloads in the cluster are lost

@@ -0,0 +1,75 @@
# Bug: laconic-so crashes on re-deploy when caddy ingress already exists

## Summary

`laconic-so deployment start` crashes with `FailToCreateError` when the kind cluster already has caddy ingress resources installed. The deployer uses `create_from_yaml()`, which fails on `AlreadyExists` conflicts instead of applying idempotently. This prevents the application deployment from ever being reached — the crash happens before any app manifests are applied.

## Component

`stack_orchestrator/deploy/k8s/deploy_k8s.py:366` — `up()` method
`stack_orchestrator/deploy/k8s/helpers.py:369` — `install_ingress_for_kind()`

## Reproduction

1. `kind delete cluster --name laconic-70ce4c4b47e23b85`
2. `laconic-so deployment --dir /srv/deployments/agave start` — creates the cluster, loads images, installs caddy ingress, but times out or is interrupted before the app deployment completes
3. `laconic-so deployment --dir /srv/deployments/agave start` — crashes immediately after image loading

## Symptoms

- Traceback ending in:
  ```
  kubernetes.utils.create_from_yaml.FailToCreateError:
  Error from server (Conflict): namespaces "caddy-system" already exists
  Error from server (Conflict): serviceaccounts "caddy-ingress-controller" already exists
  Error from server (Conflict): clusterroles.rbac.authorization.k8s.io "caddy-ingress-controller" already exists
  ...
  ```
- Namespace `laconic-laconic-70ce4c4b47e23b85` exists but is empty — no pods, no deployments, no events
- Cluster is healthy, images are loaded, but no app manifests are applied

## Root Cause

`install_ingress_for_kind()` calls `kubernetes.utils.create_from_yaml()`, which uses `POST` (create) semantics. If the resources already exist (from a previous partial run), every resource returns `409 Conflict` and `create_from_yaml` raises `FailToCreateError`, aborting the entire `up()` method before the app deployment step.

The first `laconic-so start` after a fresh `kind delete` works because:
1. Image loading into the kind node takes 5-10 minutes (images are ~10GB+)
2. Caddy ingress is installed successfully
3. App deployment begins

But if that first run is interrupted (timeout, Ctrl-C, ansible timeout), the second run finds caddy already installed and crashes.

## Fix Options

1. **Use server-side apply** instead of `create_from_yaml()` — `kubectl apply` is idempotent
2. **Check if the ingress exists before installing** — skip `install_ingress_for_kind()` if the caddy-system namespace exists
3. **Catch `AlreadyExists` and continue** — treat 409 as success for infrastructure resources
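
The control flow of fix option 3 can be sketched without the kubernetes client: run each create step and count a conflict as "already present" instead of aborting. Everything below is a hypothetical stand-in — a real fix would catch `kubernetes.utils.FailToCreateError` and inspect the 409s — but the pattern is the same:

```python
# Sketch of fix option 3: tolerate "already exists" when (re)installing
# infrastructure resources, so a second run after an interrupted first run
# proceeds to the app deployment instead of crashing.
class AlreadyExists(Exception):
    """Stand-in for a 409 Conflict from the API server."""

def ensure_resources(create_fns):
    """Run each create step; a conflict means the resource is already there."""
    created, existed = 0, 0
    for fn in create_fns:
        try:
            fn()
            created += 1
        except AlreadyExists:
            existed += 1  # left over from a previous partial run — fine
    return created, existed

def make_create(cluster, name):
    def create():
        if name in cluster:
            raise AlreadyExists(name)
        cluster.add(name)
    return create

cluster = {"caddy-system"}  # namespace left over from an interrupted run
steps = [make_create(cluster, n)
         for n in ("caddy-system", "caddy-ingress-controller")]
created, existed = ensure_resources(steps)  # no crash: (1 created, 1 existed)
```

With this behavior, the second `laconic-so deployment start` would skip past the pre-installed caddy resources and reach the app manifests.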

## Workaround

Delete the caddy ingress resources before re-running:

```bash
kubectl delete namespace caddy-system
kubectl delete clusterrole caddy-ingress-controller
kubectl delete clusterrolebinding caddy-ingress-controller
kubectl delete ingressclass caddy
laconic-so deployment --dir /srv/deployments/agave start
```

Or nuke the entire cluster and start fresh:

```bash
kind delete cluster --name laconic-70ce4c4b47e23b85
laconic-so deployment --dir /srv/deployments/agave start
```

## Interaction with ansible timeout

The `biscayne-redeploy.yml` playbook sets a 600s timeout on the `laconic-so deployment start` task. Image loading alone can exceed this on a fresh cluster (images must be re-loaded into the new kind node). When ansible kills the process at 600s, the caddy ingress is already installed but the app is not — putting the cluster into the broken state described above. Subsequent playbook runs hit this bug on every attempt.

## Impact

- Blocks all re-deploys on biscayne without manual cleanup
- The playbook cannot recover automatically — every retry hits the same conflict
- Discovered 2026-03-05 during a full wipe redeploy of the biscayne validator

@@ -0,0 +1,121 @@
# DoubleZero Multicast Access Requests

## Status (2026-03-06)

DZ multicast is **still in testnet** (client v0.2.2). Multicast groups are defined
on the DZ ledger with on-chain access control (publishers/subscribers). The testnet
allocates addresses from 233.84.178.0/24 (AS21682). Not yet available for production
Solana shred delivery.

## Biscayne Connection Details

Provide these details when requesting subscriber access:

| Field | Value |
|-------|-------|
| Client IP | 186.233.184.235 |
| Validator identity | 4WeLUxfQghbhsLEuwaAzjZiHg2VBw87vqHc4iZrGvKPr |
| DZ identity | 3Bw6v7EruQvTwoY79h2QjQCs2KBQFzSneBdYUbcXK1Tr |
| DZ device | laconic-mia-sw01 |
| Contributor / tenant | laconic |

## Jito ShredStream

**Not a DZ multicast group.** ShredStream is Jito's own shred delivery service,
independent of DoubleZero multicast. It provides low-latency shreds from leaders
on the Solana network via a proxy client that connects to the Jito Block Engine.

| Property | Value |
|----------|-------|
| What it does | Delivers shreds from Jito-connected leaders with low latency. Provides a redundant shred path for servers in remote locations. |
| How it works | `shredstream-proxy` authenticates to a Jito Block Engine via keypair, receives shreds, forwards them to configured UDP destinations (e.g. validator TVU port). |
| Cost | **Unknown.** Docs don't list pricing. Was previously "complimentary" for searchers (2024). May require approval. |
| Requirements | Approved Solana pubkey (form submission), auth keypair, firewall open on UDP 20000, TVU port of your node. |
| Regions | Amsterdam, Dublin, Frankfurt, London, New York, Salt Lake City, Singapore, Tokyo. Max 2 regions selectable. |
| Limitations | No NAT support. Bridge networking incompatible with multicast mode. |
| Repo | https://github.com/jito-labs/shredstream-proxy |
| Docs | https://docs.jito.wtf/lowlatencytxnfeed/ |
| Status for biscayne | **Not yet requested.** Need to submit pubkey for approval. |

ShredStream is relevant to our shred completeness problem — it provides an additional
shred source beyond turbine and the Ashburn relay. It would run as a sidecar process
forwarding shreds to the validator's TVU port.

## DZ Multicast Groups

DZ multicast uses PIM (Protocol Independent Multicast) and MSDP (Multicast Source
Discovery Protocol). Group owners define allowed publishers and subscribers on the
DZ ledger. Switch ASICs handle packet replication — no CPU overhead.

### bebop

Listed in earlier notes as a multicast shred distribution group. **No public
documentation found.** Cannot confirm this exists as a DZ multicast group.

- **Owner:** Unknown
- **Status:** Unverified — may not exist as described

### turbine (future)

Solana's native shred propagation via DZ multicast. Jito has expressed interest
in leveraging multicast for shred delivery. Not yet available for production use.

- **Owner:** Solana Foundation / Anza (native turbine), Jito (shredstream)
- **Status:** Testnet only (DZ client v0.2.2)

## bloXroute OFR (Optimized Feed Relay)

Commercial shred delivery service. Runs a gateway docker container on your node that
connects to bloXroute's BDN (Blockchain Distribution Network) to receive shreds
faster than default turbine (~30-50ms improvement, beats turbine ~98% of the time).

| Property | Value |
|----------|-------|
| What it does | Delivers shreds via bloXroute's BDN with optimized relay topologies. Not just a different turbine path — uses their own distribution network. |
| How it works | Docker gateway container on your node, communicates with bloXroute OFR relay over UDP 18888. Forwards shreds to your validator. |
| Cost | **$300/mo** (Professional, 1500 tx/day), **$1,250/mo** (Enterprise, unlimited tx). OFR gateway without local node requires Enterprise Elite ($5,000+/mo). |
| Requirements | Docker, UDP port 18888 open, bloXroute subscription. |
| Open source | Gateway at https://github.com/bloXroute-Labs/solana-gateway |
| Docs | https://docs.bloxroute.com/solana/optimized-feed-relay |
| Status for biscayne | **Not yet evaluated.** Monthly cost may not be justified. |

bloXroute's value proposition: they operate nodes at multiple turbine tree positions
across their network, aggregate shreds, and redistribute via their BDN. This is the
"multiple identities collecting different shreds" approach — but operated by bloXroute,
not by us.

## How These Services Get More Shreds

Turbine tree position is determined by validator identity (pubkey). A single validator
gets shreds from one position in the tree per slot. Services like Jito ShredStream
and bloXroute OFR operate many nodes with different identities across the turbine
tree, aggregate the shreds they each receive, and redistribute the combined set to
subscribers. This is why they can deliver shreds the subscriber's own turbine position
would never see.

**An open-source equivalent would require running multiple lightweight validator
identities (non-voting, minimal stake) at different locations, each collecting shreds
from their unique turbine tree position, and forwarding them to the main validator.**
No known open-source project implements this pattern.
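
The aggregation described above is, at its core, a set union across collector identities. A minimal illustrative sketch — not an existing implementation, with shred IDs simplified to `(slot, index)` tuples:

```python
# Each collector identity sits at a different turbine tree position and so
# receives a different subset of a slot's shreds; the union recovers shreds
# that no single position would see on its own.
def aggregate_shreds(collectors):
    """Union the shred sets received by each collector identity."""
    combined = set()
    for shreds in collectors.values():
        combined |= shreds
    return combined

collectors = {
    "identity-a": {(1000, 0), (1000, 1), (1000, 5)},
    "identity-b": {(1000, 1), (1000, 2), (1000, 6)},
    "identity-c": {(1000, 3), (1000, 4)},
}
combined = aggregate_shreds(collectors)
# 7 distinct shreds — more than any single identity's 3
```

A real implementation would also need deduplication across time, shred signature verification, and forwarding of the combined set to the main validator's TVU port.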

## Sources

- [Jito ShredStream docs](https://docs.jito.wtf/lowlatencytxnfeed/)
- [shredstream-proxy repo](https://github.com/jito-labs/shredstream-proxy)
- [bloXroute OFR docs](https://docs.bloxroute.com/solana/optimized-feed-relay)
- [bloXroute pricing](https://bloxroute.com/pricing/)
- [bloXroute OFR intro](https://bloxroute.com/pulse/introducing-ofrs-faster-shreds-better-performance-on-solana/)
- [DZ multicast announcement](https://doublezero.xyz/journal/doublezero-introduces-multicast-support-smarter-faster-data-delivery-for-distributed-systems)

## Request Template

When contacting a group owner, use something like:

> We'd like to subscribe to your DoubleZero multicast group for our Solana
> validator. Our details:
>
> - Validator: 4WeLUxfQghbhsLEuwaAzjZiHg2VBw87vqHc4iZrGvKPr
> - DZ identity: 3Bw6v7EruQvTwoY79h2QjQCs2KBQFzSneBdYUbcXK1Tr
> - Client IP: 186.233.184.235
> - Device: laconic-mia-sw01
> - Tenant: laconic

@@ -0,0 +1,121 @@
# DoubleZero Current State and Bug Fixes

## Biscayne Connection Details

| Field | Value |
|-------|-------|
| Host | biscayne.vaasl.io (186.233.184.235) |
| DZ identity | `3Bw6v7EruQvTwoY79h2QjQCs2KBQFzSneBdYUbcXK1Tr` |
| Validator identity | `4WeLUxfQghbhsLEuwaAzjZiHg2VBw87vqHc4iZrGvKPr` |
| Nearest device | laconic-mia-sw01 (0.3ms) |
| DZ version (host) | 0.8.10 |
| DZ version (container) | 0.8.11 |
| k8s version | 1.35.1 (kind) |

## Current State (2026-03-03)

The host systemd `doublezerod` is connected and working. The container sidecar
doublezerod is broken. Both are running simultaneously.

| Instance | Identity | Status |
|----------|----------|--------|
| Host systemd | `3Bw6v7...` (correct) | BGP Session Up, IBRL to laconic-mia-sw01 |
| Container sidecar | `Cw9qun...` (wrong) | Disconnected, error loop |
| DaemonSet manifest | N/A | Never applied, dead code |

### Access pass

The access pass for 186.233.184.235 is registered and connected:

```
type: prepaid
payer: 3Bw6v7EruQvTwoY79h2QjQCs2KBQFzSneBdYUbcXK1Tr
status: connected
owner: DZfLKFDgLShjY34WqXdVVzHUvVtrYXb7UtdrALnGa8jw
```

## Bugs

### BUG-1: Container doublezerod has wrong identity

The entrypoint script (`entrypoint.sh`) auto-generates a new `id.json` if one isn't
found. The volume at `/srv/deployments/agave/data/doublezero-config/` was empty at
first boot, so it generated `Cw9qun...` instead of using the registered identity.

**Root cause:** The real `id.json` lives at `/home/solana/.config/doublezero/id.json`
(created by the host-level DZ install). The container volume is a separate path that
was never seeded.

**Fix:**
```bash
sudo cp /home/solana/.config/doublezero/id.json \
  /srv/deployments/agave/data/doublezero-config/id.json
```

### BUG-2: Container doublezerod can't resolve DZ passport program

`DOUBLEZERO_RPC_ENDPOINT` in `spec.yml` is `http://127.0.0.1:8899` — the local
validator. But the local validator hasn't replayed enough slots to have the DZ
passport program accounts (`ser2VaTMAcYTaauMrTSfSrxBaUDq7BLNs2xfUugTAGv`).
doublezerod calls `GetProgramAccounts` every 30 seconds and gets empty results.

**Fix in `deployment/spec.yml`:**
```yaml
# Use public RPC for DZ bootstrapping until the local validator is caught up
DOUBLEZERO_RPC_ENDPOINT: https://api.mainnet-beta.solana.com
```

Switch back to `http://127.0.0.1:8899` once the local validator is synced.

### BUG-3: Container doublezerod lacks hostNetwork

laconic-so was not translating `network_mode: host` from compose files to
`hostNetwork: true` in generated k8s pod specs. Without host network access, the
container can't create GRE tunnels (IP proto 47) or run BGP (tcp/179 on
169.254.0.0/16).

**Fix:** Deploy with stack-orchestrator branch `fix/k8s-port-mappings-hostnetwork-v2`
(commit `fb69cc58`, 2026-03-03), which adds automatic hostNetwork detection.

### BUG-4: DaemonSet workaround is dead code

`deployment/k8s-manifests/doublezero-daemonset.yaml` was a workaround for BUG-3.
Now that laconic-so supports hostNetwork natively, it should be deleted.

**Fix:** Remove `deployment/k8s-manifests/doublezero-daemonset.yaml` from agave-stack.

### BUG-5: Two doublezerod instances running simultaneously

The host systemd `doublezerod` and the container sidecar are both running. Once the
container is fixed (BUG-1 through BUG-3), the host service must be disabled to avoid
two processes fighting over the GRE tunnel.

**Fix:**
```bash
sudo systemctl stop doublezerod
sudo systemctl disable doublezerod
```

## Diagnostic Commands

Always use `sudo -u solana` for host-level DZ commands — the identity is under
`/home/solana/.config/doublezero/`.

```bash
# Host
sudo -u solana doublezero address                              # expect 3Bw6v7...
sudo -u solana doublezero status                               # tunnel state
sudo -u solana doublezero latency                              # device reachability
sudo -u solana doublezero access-pass list | grep 186.233.184  # access pass
sudo -u solana doublezero balance                              # credits
ip route | grep doublezero0                                    # BGP routes

# Container (from kind node)
kubectl -n <ns> exec <pod> -c doublezerod -- doublezero address
kubectl -n <ns> exec <pod> -c doublezerod -- doublezero status
kubectl -n <ns> exec <pod> -c doublezerod -- doublezero --version

# Logs
kubectl -n <ns> logs <pod> -c doublezerod --tail=30
sudo journalctl -u doublezerod -f   # host systemd logs
```

@@ -0,0 +1,65 @@
# Feature: Use local registry for kind image loading

## Summary

`laconic-so deployment start` uses `kind load docker-image` to copy container images from the host Docker daemon into the kind node's containerd. This serializes the full image (`docker save`), pipes it through `docker exec`, and deserializes it (`ctr image import`). For biscayne's ~837MB agave image plus the doublezero image, this takes 5-10 minutes on every cluster recreate — copying between two container runtimes on the same machine.

## Current behavior

```
docker build           → host Docker daemon (image stored once)
kind load docker-image → docker save | docker exec kind-node ctr import (full copy)
```

This happens in `stack_orchestrator/deploy/k8s/deploy_k8s.py` every time `laconic-so deployment start` runs and the image isn't already present in the kind node.

## Proposed behavior

Run a persistent local registry (`registry:2`) on the host. `laconic-so` pushes images there after build. Kind's containerd is configured to pull from it.

```
docker build         → docker tag localhost:5001/image → docker push localhost:5001/image
kind node containerd → pulls from localhost:5001 (fast, no serialization)
```

The registry container persists across kind cluster deletions. Images are always available without reloading.

## Implementation

1. **Registry container**: `docker run -d --restart=always -p 5001:5000 --name kind-registry registry:2`

2. **Kind config** — add a registry mirror to `containerdConfigPatches` in kind-config.yml:
   ```yaml
   containerdConfigPatches:
     - |-
       [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5001"]
         endpoint = ["http://kind-registry:5000"]
   ```

3. **Connect the registry to the kind network**: `docker network connect kind kind-registry`

4. **laconic-so change** — in `deploy_k8s.py`, replace `kind load docker-image` with:
   ```bash
   # Tag and push to the local registry instead of kind load
   docker tag image:local localhost:5001/image:local
   docker push localhost:5001/image:local
   ```

5. **Compose files** — image references change from `laconicnetwork/agave:local` to `localhost:5001/laconicnetwork/agave:local`
|
||||||
|
|
||||||
|
Kind documents this pattern: https://kind.sigs.k8s.io/docs/user/local-registry/
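In `deploy_k8s.py` terms, step 4 might look roughly like the following. This is a sketch only: `push_to_local_registry`, the `REGISTRY` constant, and the subprocess calls are illustrative assumptions, not the actual stack-orchestrator API.

```python
import subprocess

REGISTRY = "localhost:5001"  # assumption: the registry started in step 1

def push_to_local_registry(image: str) -> str:
    """Replacement for `kind load docker-image`: retag the locally built
    image under the registry host and push it. Returns the reference the
    compose files should use (step 5)."""
    ref = f"{REGISTRY}/{image}"
    subprocess.run(["docker", "tag", image, ref], check=True)
    subprocess.run(["docker", "push", ref], check=True)
    return ref

# e.g. push_to_local_registry("laconicnetwork/agave:local")
# → "localhost:5001/laconicnetwork/agave:local"
```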

## Impact

- Eliminates the 5-10 minute image loading step on every cluster recreate
- Registry persists across `kind delete cluster` — no re-push needed unless the image itself changes
- `docker push` to a local registry is near-instant (shared filesystem, layer dedup)
- Unblocks faster iteration on redeploy cycles

## Scope

This is a `stack-orchestrator` change, specifically in `deploy_k8s.py`. The kind-config.yml also needs the registry mirror config, which `laconic-so` generates from `spec.yml`.

## Discovered

2026-03-05 — during the biscayne full wipe redeploy, `laconic-so start` spent most of its runtime on `kind load docker-image`, causing ansible timeouts and cascading failures (caddy ingress conflict bug).
@ -0,0 +1,78 @@
# Known Issues

## BUG-6: Validator logging not configured, only stdout available

**Observed:** 2026-03-03

The validator only logs to stdout. At current log volume, kubectl logs retains ~2 minutes of history before the buffer fills. When diagnosing a replay stall, the startup logs (snapshot load, initial replay, error conditions) were gone.

**Impact:** Cannot determine why the validator replay stage stalled — the startup logs that would show the root cause are not available.

**Fix:** Configure the `--log` flag in the validator start script to write to a persistent volume, so logs survive container restarts and aren't limited to the kubectl buffer.

## BUG-7: Metrics endpoint unreachable from validator pod

**Observed:** 2026-03-03

```
WARN solana_metrics::metrics submit error: error sending request for url
(http://localhost:8086/write?db=agave_metrics&u=admin&p=admin&precision=n)
```

The validator is configured with `SOLANA_METRICS_CONFIG` pointing to `http://172.20.0.1:8086` (the kind docker bridge gateway), but the logs show it trying `localhost:8086`. The InfluxDB container (`solana-monitoring-influxdb-1`) is running on the host, but the validator can't reach it.

**Impact:** No metrics collection. Cannot use Grafana dashboards to diagnose performance issues or track sync progress over time.

## BUG-8: sysctl values not visible inside kind container

**Observed:** 2026-03-03

```
ERROR solana_core::system_monitor_service Failed to query value for net.core.rmem_max: no such sysctl
WARN solana_core::system_monitor_service net.core.rmem_max: recommended=134217728, current=-1 too small
```

The host has correct sysctl values (`net.core.rmem_max = 134217728`), but `/proc/sys/net/core/` does not exist inside the kind node container. The validator reads `-1` and reports the buffer as too small.

The network buffers themselves may still be effective (they're set on the host network namespace, which the pod shares via `hostNetwork: true`), but this is unverified. If the buffers are not effective, it could limit shred ingestion throughput and contribute to slow repair.

**Fix options:**
- Set sysctls on the kind node container at creation time (`kind` supports `kubeadmConfigPatches` and sysctl configuration)
- Verify empirically whether the host sysctls apply to hostNetwork pods by checking actual socket buffer sizes from inside the pod
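The second option can be scripted. A minimal sketch, assuming plain Python is available inside the pod (illustrative, not part of the repo): request the recommended buffer size and see what the kernel actually grants, since the grant is capped by `net.core.rmem_max` in whichever network namespace the socket lives in.

```python
import socket

def effective_rcvbuf(requested: int) -> int:
    """Ask the kernel for a receive buffer of `requested` bytes and report
    what it actually granted. On Linux the returned value is doubled for
    kernel bookkeeping, and the grant is capped by net.core.rmem_max."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        s.close()

if __name__ == "__main__":
    recommended = 134_217_728  # agave's recommended net.core.rmem_max
    granted = effective_rcvbuf(recommended)
    print(f"requested={recommended} granted={granted}")
    if granted < recommended:
        print("rmem_max cap is in effect; the host sysctl does NOT apply here")
```

Run inside the pod: a grant near the recommendation means the host sysctl is effective for hostNetwork pods; a much smaller grant confirms the cap.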

## Validator replay stall (under investigation)

**Observed:** 2026-03-03

The validator root has been stuck at slot 403,892,310 for 55+ minutes. The gap to the cluster tip is ~120,000 slots and growing.

**Observed symptoms:**
- Zero `Frozen` banks in log history — replay stage is not processing slots
- All incoming slots show `bank_status: Unprocessed`
- Repair only requests tip slots and two specific old slots (403,892,310, 403,909,228) — not the ~120k slot gap
- Repair peer count is 3-12 per cycle (vs 1,000+ gossip peers)
- Startup logs have rotated out (BUG-6), so initialization context is lost

**Unknown:**
- What snapshot the validator loaded at boot
- Whether replay ever started or was blocked from the beginning
- Whether the sysctl issue (BUG-8) is limiting repair throughput
- Whether the missing metrics (BUG-7) would show what's happening internally
@ -0,0 +1,191 @@
# Shred Collector Relay

## Problem

Turbine assigns each validator a single position in the shred distribution tree per slot, determined by its pubkey. A validator in Miami with one identity receives shreds from one set of tree neighbors — typically ~60-70% of shreds for any given slot. The remaining 30-40% must come from the repair protocol, which is too slow to keep pace with chain production (see analysis below).

Commercial services (Jito ShredStream, bloXroute OFR) solve this by running many nodes with different identities across the turbine tree, aggregating shreds, and redistributing the combined set to subscribers. This works but costs $300-5,000/mo and adds a dependency on a third party.

## Concept

Run lightweight **shred collector** nodes at multiple geographic locations on the Laconic network (Ashburn, Dallas, etc.). Each collector has its own keypair, joins gossip with a unique identity, receives turbine shreds from its unique tree position, and forwards raw shred packets to the main validator in Miami. The main validator inserts these shreds into its blockstore alongside its own turbine shreds, increasing completeness toward 100% without relying on repair.

```
                   Turbine Tree
                  /      |      \
                 /       |       \
    collector-ash  collector-dfw  biscayne (main validator)
      (Ashburn)      (Dallas)       (Miami)
     identity A     identity B     identity C
    ~60% shreds    ~60% shreds    ~60% shreds
                \        |        /
                 \       |       /
           → UDP forward via DZ backbone →
                         |
                biscayne blockstore
           ~95%+ shreds (union of A∪B∪C)
```

Each collector sees a different ~60% slice of the turbine tree. The union of three independent positions yields ~94% coverage (1 - 0.4³ = 0.936). Four collectors yield ~97%. The main validator fills the remaining few percent via repair, which is fast when only 3-6% of shreds are missing.
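The union estimate treats each collector's missed fraction as independent. A quick sketch of that arithmetic (illustrative only):

```python
def union_coverage(n_collectors: int, per_node_delivery: float = 0.6) -> float:
    """Probability a shred reaches at least one of n independent tree positions."""
    miss = 1.0 - per_node_delivery
    return 1.0 - miss ** n_collectors

for n in (1, 2, 3, 4):
    print(f"{n} collector(s): {union_coverage(n):.1%}")
    # 1 → 60.0%, 2 → 84.0%, 3 → 93.6%, 4 → 97.4%
```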

## Why This Works

The math from biscayne's recovery (2026-03-06):

| Metric | Value |
|--------|-------|
| Compute-bound replay (complete blocks) | 5.2 slots/sec |
| Repair-bound replay (incomplete blocks) | 0.5 slots/sec |
| Chain production rate | 2.5 slots/sec |
| Turbine + relay delivery per identity | ~60-70% |
| Repair bandwidth | ~600 shreds/sec (estimated) |
| Repair needed to converge at 60% delivery | 5x current bandwidth |
| Repair needed to converge at 95% delivery | Easily sufficient |

At 60% shred delivery, repair must fill 40% per slot — too slow to converge. At 95% delivery (3 collectors), repair fills 5% per slot — well within capacity. The validator replays at near compute-bound speed (5+ slots/sec) and converges.
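Combining the table with the shred-rate estimate from the Infrastructure section (~3,000 shreds/slot) gives the convergence arithmetic. A sketch using the doc's own estimates (illustrative only):

```python
SHREDS_PER_SLOT = 3_000    # approximate, from the infrastructure estimate
PRODUCTION_RATE = 2.5      # slots/sec
REPAIR_BANDWIDTH = 600.0   # shreds/sec, estimated

def repair_demand(delivery: float) -> float:
    """Shreds/sec repair must supply to keep up with chain production."""
    return (1.0 - delivery) * SHREDS_PER_SLOT * PRODUCTION_RATE

for delivery in (0.60, 0.95):
    demand = repair_demand(delivery)
    print(f"delivery {delivery:.0%}: need {demand:.0f} shreds/sec "
          f"({demand / REPAIR_BANDWIDTH:.1f}x available)")
    # 60% → 3000 shreds/sec (5.0x available); 95% → 375 shreds/sec (0.6x)
```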

## Infrastructure

Laconic already has DZ-connected switches at multiple sites:

| Site | Device | Latency to Miami | Backbone |
|------|--------|------------------|----------|
| Miami | laconic-mia-sw01 | 0.24ms | local |
| Ashburn | laconic-was-sw01 | ~29ms | Et4/1 25.4ms |
| Dallas | laconic-dfw-sw01 | ~30ms | TBD |

The DZ backbone carries traffic between sites at line rate. Shred packets are ~1280 bytes each. At ~3,000 shreds/slot and 2.5 slots/sec, each collector forwards ~7,500 packets/sec (~10 MB/s) — trivial bandwidth for the backbone.
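The bandwidth figure is straightforward arithmetic (illustrative check only):

```python
SHREDS_PER_SLOT = 3_000   # approximate shreds per slot
SLOTS_PER_SEC = 2.5       # chain production rate
SHRED_BYTES = 1_280       # per-packet size

pps = SHREDS_PER_SLOT * SLOTS_PER_SEC      # packets/sec per collector
mb_per_sec = pps * SHRED_BYTES / 1e6       # MB/s per collector
print(f"{pps:.0f} pps, {mb_per_sec:.1f} MB/s")  # 7500 pps, 9.6 MB/s
```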

## Collector Architecture

The collector does NOT need to be a full validator. It needs to:

1. **Join gossip** — advertise a ContactInfo with its own pubkey and a TVU address (the site's IP)
2. **Receive turbine shreds** — UDP packets on the advertised TVU port
3. **Forward shreds** — retransmit raw UDP packets to biscayne's TVU port

It does NOT need to: replay transactions, maintain accounts state, store a ledger, load a snapshot, vote, or run RPC.
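Steps 2-3 are a plain UDP relay loop. A minimal Python sketch of that data-plane core (illustrative only; step 1, gossip participation, is the hard part and is not shown):

```python
import socket

def forward_shreds(listen_port, dst, max_packets=None):
    """Receive UDP shreds on the advertised TVU port and retransmit the raw
    bytes unchanged to `dst` (host, port). Returns the number forwarded."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("0.0.0.0", listen_port))
    forwarded = 0
    try:
        while max_packets is None or forwarded < max_packets:
            pkt, _src = rx.recvfrom(2048)  # shreds are ~1280 bytes
            tx.sendto(pkt, dst)            # raw UDP, no re-framing
            forwarded += 1
    finally:
        rx.close()
        tx.close()
    return forwarded

# e.g. forward_shreds(9000, ("186.233.184.235", 9000))  # biscayne's TVU port
```

The `max_packets` parameter exists only so the sketch can be exercised; a real collector runs the loop forever.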

### Option A: Firedancer Minimal Build

Firedancer (Apache 2, C) has a tile-based architecture where each function (net, gossip, shred, bank, store, etc.) runs as an independent Linux process. A minimal build using only the networking + gossip + shred tiles would:

- Join gossip and advertise a TVU address
- Receive turbine shreds via the shred tile
- Forward shreds to a configured destination instead of to bank/store

This requires modifying the shred tile to add a UDP forwarder output instead of (or in addition to) the normal bank handoff. The rest of the tile pipeline (bank, pack, poh, store) is simply not started.

**Estimated effort:** Moderate. Firedancer's tile architecture is designed for this kind of composition. The main work is adding a forwarder sink to the shred tile and testing gossip participation without the full validator stack.

**Source:** https://github.com/firedancer-io/firedancer

### Option B: Agave Non-Voting Minimal

Run `agave-validator --no-voting` with `--limit-ledger-size 0` and minimal config. Agave still requires a snapshot to start and runs the full process, but with no voting and a minimal ledger it would be lighter than a full node.

**Downside:** Agave is monolithic — you can't easily disable replay/accounts. It still loads a snapshot, builds the accounts index, and runs replay. This defeats the purpose of a lightweight collector.

### Option C: Custom Gossip + TVU Receiver

Write a minimal Rust binary using agave's `solana-gossip` and `solana-streamer` crates to:

1. Bootstrap into gossip via entrypoints
2. Advertise ContactInfo with a TVU socket
3. Receive shred packets on TVU
4. Forward them via UDP

**Estimated effort:** Significant. Gossip protocol participation is complex (CRDS, pull/push, protocol versioning). Using the agave crates directly is possible but poorly documented for standalone use.

### Option D: Run Collectors on Biscayne

Run the collector processes on biscayne itself, each advertising a TVU address at a remote site. The switches at each site forward inbound TVU traffic to biscayne via the DZ backbone using traffic-policy redirects (same pattern as `ashburn-validator-relay.md`).

**Advantage:** No compute needed at remote sites. Just switch config + loopback IPs. All collector processes run in Miami.

**Risk:** Gossip advertises IP + port. If the collector runs on biscayne but advertises an Ashburn IP, gossip protocol interactions (pull requests, pings) arrive at the Ashburn IP and must be forwarded back to biscayne. This adds ~58ms RTT to gossip protocol messages, which may cause timeouts or peer quality degradation. Needs testing.

## Recommendation

Option A (Firedancer minimal build) is the correct long-term approach. It produces a single binary that does exactly one thing: collect shreds from a unique turbine tree position and forward them. It runs on minimal hardware (a small VM or container at each site, or on biscayne with remote TVU addresses).

Option D (collectors on biscayne with switch forwarding) is the fastest to test since it needs no new software — just switch config and multiple agave-validator instances with `--no-voting`. The question is whether agave can start without a snapshot if we only care about gossip + TVU.

## Deployment Topology

```
biscayne (186.233.184.235)
├── agave-validator (main, identity C, TVU 186.233.184.235:9000)
├── collector-ash (identity A, TVU 137.239.194.65:9000)
│   └── shreds forwarded via was-sw01 traffic-policy
├── collector-dfw (identity B, TVU <dfw-ip>:9000)
│   └── shreds forwarded via dfw-sw01 traffic-policy
└── blockstore receives union of A∪B∪C shreds

was-sw01 (Ashburn)
└── Loopback: 137.239.194.65
    └── traffic-policy: UDP dst 137.239.194.65:9000 → nexthop mia-sw01

dfw-sw01 (Dallas)
└── Loopback: <assigned IP>
    └── traffic-policy: UDP dst <assigned IP>:9000 → nexthop mia-sw01
```

## Open Questions

1. Can agave-validator start in gossip-only mode without a snapshot?
2. Does Firedancer's shred tile work standalone without bank/replay?
3. What is the gossip protocol timeout for remote TVU addresses (Option D)?
4. How does the turbine tree handle multiple identities from the same IP (if running all collectors on biscayne)?
5. Do we need stake on collector identities to be placed in the turbine tree, or do unstaked nodes still participate?
6. What IP block is available on dfw-sw01 for a collector loopback?
@ -0,0 +1,161 @@
# TVU Shred Relay — Data-Plane Redirect

## Overview

Biscayne's agave validator advertises `64.92.84.81:20000` (laconic-was-sw01 Et1/1) as its TVU address. Turbine shreds arrive as normal UDP to the switch's front-panel IP. The 7280CR3A ASIC handles front-panel traffic without punting to Linux userspace — it sees a local interface IP with no service and drops at the hardware level.

### Previous approach (monitor + socat)

An EOS monitor session mirrored matched packets to the CPU (mirror0 interface). socat read from mirror0 and relayed to biscayne; shred-unwrap.py on biscayne stripped the encapsulation headers.

Fragile: socat ran as a foreground process and died on disconnect.

### New approach (traffic-policy redirect)

EOS `traffic-policy` with `set nexthop` and `system-rule overriding-action redirect` overrides the ASIC's "local IP, handle myself" decision. The ASIC forwards matched packets to the specified next-hop at line rate. Pure data plane, no CPU involvement, persists in startup-config.

Available since EOS 4.28.0F on R3 platforms. Confirmed on 4.34.0F.

## Architecture

```
Turbine peers (hundreds of validators)
        |
        v  UDP shreds to 64.92.84.81:20000
laconic-was-sw01 Et1/1 (Ashburn)
        |  ASIC matches traffic-policy SHRED-RELAY
        |  Redirects to nexthop 172.16.1.189 (data plane, line rate)
        v  Et4/1 backbone (25.4ms)
laconic-mia-sw01 Et4/1 (Miami)
        |  forwards via default route (same metro)
        v  0.13ms
biscayne (186.233.184.235, Miami)
        |  iptables DNAT: dst 64.92.84.81:20000 -> 127.0.0.1:9000
        v
agave-validator TVU port (localhost:9000)
```

## Production Config: laconic-was-sw01

### Pre-change safety

```
configure checkpoint save pre-shred-relay
```

Rollback: `rollback running-config checkpoint pre-shred-relay` then `write memory`.

### Config session with auto-revert

```
configure session shred-relay

! ACL for traffic-policy match
ip access-list SHRED-RELAY-ACL
10 permit udp any any eq 20000

! Traffic policy: redirect matched packets to backbone next-hop
traffic-policy SHRED-RELAY
match SHRED-RELAY-ACL
set nexthop 172.16.1.189

! Override ASIC punt-to-CPU for redirected traffic
system-rule overriding-action redirect

! Apply to Et1/1 ingress
interface Ethernet1/1
traffic-policy input SHRED-RELAY

! Remove old monitor session and its ACL
no monitor session 1
no ip access-list SHRED-RELAY

! Review before committing
show session-config diffs

! Commit with 5-minute auto-revert safety net
commit timer 00:05:00
```

After verification: `configure session shred-relay commit` then `write memory`.

### Linux cleanup on was-sw01

```bash
# Kill socat relay (PID 27743)
kill 27743
# Remove Linux kernel route
ip route del 186.233.184.235/32
```

The EOS static route `ip route 186.233.184.235/32 172.16.1.189` stays (general reachability).

## Production Config: biscayne

### iptables DNAT

Traffic-policy sends normal L3-forwarded UDP packets (no mirror encapsulation). Packets arrive with dst `64.92.84.81:20000` containing clean shred payloads directly in the UDP body.

```bash
sudo iptables -t nat -A PREROUTING -p udp -d 64.92.84.81 --dport 20000 \
  -j DNAT --to-destination 127.0.0.1:9000

# Persist across reboot
sudo apt install -y iptables-persistent
sudo netfilter-persistent save
```

### Cleanup

```bash
# Kill shred-unwrap.py (PID 2497694)
kill 2497694
rm /tmp/shred-unwrap.py
```

## Verification

1. `show traffic-policy interface Ethernet1/1` — policy applied
2. `show traffic-policy counters` — packets matching and redirected
3. `sudo iptables -t nat -L PREROUTING -v -n` — DNAT rule with packet counts
4. Validator logs: slot replay rate should maintain ~3.3 slots/sec
5. `ss -unp | grep 9000` — validator receiving on TVU port

## What was removed

| Component | Host |
|-----------|------|
| monitor session 1 | was-sw01 |
| SHRED-RELAY ACL (old) | was-sw01 |
| socat relay process | was-sw01 |
| Linux kernel static route | was-sw01 |
| shred-unwrap.py | biscayne |

## What was added

| Component | Host | Persistent? |
|-----------|------|-------------|
| traffic-policy SHRED-RELAY | was-sw01 | Yes (startup-config) |
| SHRED-RELAY-ACL | was-sw01 | Yes (startup-config) |
| system-rule overriding-action redirect | was-sw01 | Yes (startup-config) |
| iptables DNAT rule | biscayne | Yes (iptables-persistent) |

## Key Details

| Item | Value |
|------|-------|
| Biscayne validator identity | `4WeLUxfQghbhsLEuwaAzjZiHg2VBw87vqHc4iZrGvKPr` |
| Biscayne IP | `186.233.184.235` |
| laconic-was-sw01 public IP | `64.92.84.81` (Et1/1) |
| laconic-was-sw01 backbone IP | `172.16.1.188` (Et4/1) |
| laconic-was-sw01 SSH | `install@137.239.200.198` |
| laconic-mia-sw01 backbone IP | `172.16.1.189` (Et4/1) |
| Backbone RTT (WAS-MIA) | 25.4ms |
| EOS version | 4.34.0F |
@ -0,0 +1,14 @@
all:
  hosts:
    biscayne:
      ansible_host: biscayne.vaasl.io
      ansible_user: rix
      ansible_become: true

      # DoubleZero identities
      dz_identity: 3Bw6v7EruQvTwoY79h2QjQCs2KBQFzSneBdYUbcXK1Tr
      validator_identity: 4WeLUxfQghbhsLEuwaAzjZiHg2VBw87vqHc4iZrGvKPr
      client_ip: 186.233.184.235
      dz_device: laconic-mia-sw01
      dz_tenant: laconic
      dz_environment: mainnet-beta
@ -0,0 +1,23 @@
all:
  children:
    switches:
      vars:
        ansible_connection: ansible.netcommon.network_cli
        ansible_network_os: arista.eos.eos
        ansible_user: install
        ansible_become: true
        ansible_become_method: enable
      hosts:
        was-sw01:
          ansible_host: 137.239.200.198
          # Et1/1: 64.92.84.81 (Ashburn uplink)
          # Et4/1: 172.16.1.188 (backbone to mia-sw01)
          # Loopback100: 137.239.194.64/32
          backbone_ip: 172.16.1.188
          backbone_peer: 172.16.1.189
          uplink_gateway: 64.92.84.80
        mia-sw01:
          ansible_host: 209.42.167.133
          # Et4/1: 172.16.1.189 (backbone to was-sw01)
          backbone_ip: 172.16.1.189
          backbone_peer: 172.16.1.188
@ -156,73 +156,62 @@
   failed_when: "add_ip.rc != 0 and 'RTNETLINK answers: File exists' not in add_ip.stderr"
   tags: [inbound]

-- name: Add DNAT for gossip UDP
-  ansible.builtin.iptables:
-    table: nat
-    chain: PREROUTING
-    protocol: udp
-    destination: "{{ ashburn_ip }}"
-    destination_port: "{{ gossip_port }}"
-    jump: DNAT
-    to_destination: "{{ kind_node_ip }}:{{ gossip_port }}"
+- name: Add DNAT rules (inserted before DOCKER chain)
+  ansible.builtin.shell:
+    cmd: |
+      set -o pipefail
+      # DNAT rules must be before Docker's ADDRTYPE LOCAL rule, otherwise
+      # Docker's PREROUTING chain swallows traffic to 137.239.194.65 (which
+      # is on loopback and therefore type LOCAL).
+      for rule in \
+        "-p udp -d {{ ashburn_ip }} --dport {{ gossip_port }} -j DNAT --to-destination {{ kind_node_ip }}:{{ gossip_port }}" \
+        "-p tcp -d {{ ashburn_ip }} --dport {{ gossip_port }} -j DNAT --to-destination {{ kind_node_ip }}:{{ gossip_port }}" \
+        "-p udp -d {{ ashburn_ip }} --dport {{ dynamic_port_range_start }}:{{ dynamic_port_range_end }} -j DNAT --to-destination {{ kind_node_ip }}" \
+      ; do
+        if ! iptables -t nat -C PREROUTING $rule 2>/dev/null; then
+          iptables -t nat -I PREROUTING 1 $rule
+          echo "added: $rule"
+        else
+          echo "exists: $rule"
+        fi
+      done
+    executable: /bin/bash
+  register: dnat_result
+  changed_when: "'added' in dnat_result.stdout"
   tags: [inbound]

-- name: Add DNAT for gossip TCP
-  ansible.builtin.iptables:
-    table: nat
-    chain: PREROUTING
-    protocol: tcp
-    destination: "{{ ashburn_ip }}"
-    destination_port: "{{ gossip_port }}"
-    jump: DNAT
-    to_destination: "{{ kind_node_ip }}:{{ gossip_port }}"
-  tags: [inbound]
-
-- name: Add DNAT for dynamic ports (UDP 9000-9025)
-  ansible.builtin.iptables:
-    table: nat
-    chain: PREROUTING
-    protocol: udp
-    destination: "{{ ashburn_ip }}"
-    destination_port: "{{ dynamic_port_range_start }}:{{ dynamic_port_range_end }}"
-    jump: DNAT
-    to_destination: "{{ kind_node_ip }}"
+- name: Show DNAT result
+  ansible.builtin.debug:
+    var: dnat_result.stdout_lines
   tags: [inbound]

 # ------------------------------------------------------------------
 # Outbound: fwmark + SNAT + policy routing
 # ------------------------------------------------------------------
-- name: Mark outbound validator UDP gossip traffic
-  ansible.builtin.iptables:
-    table: mangle
-    chain: PREROUTING
-    protocol: udp
-    source: "{{ kind_network }}"
-    source_port: "{{ gossip_port }}"
-    jump: MARK
-    set_mark: "{{ fwmark }}"
+- name: Mark outbound validator traffic (mangle PREROUTING)
+  ansible.builtin.shell:
+    cmd: |
+      set -o pipefail
+      for rule in \
+        "-p udp -s {{ kind_network }} --sport {{ gossip_port }} -j MARK --set-mark {{ fwmark }}" \
+        "-p udp -s {{ kind_network }} --sport {{ dynamic_port_range_start }}:{{ dynamic_port_range_end }} -j MARK --set-mark {{ fwmark }}" \
+        "-p tcp -s {{ kind_network }} --sport {{ gossip_port }} -j MARK --set-mark {{ fwmark }}" \
+      ; do
+        if ! iptables -t mangle -C PREROUTING $rule 2>/dev/null; then
+          iptables -t mangle -A PREROUTING $rule
+          echo "added: $rule"
+        else
+          echo "exists: $rule"
+        fi
+      done
+    executable: /bin/bash
+  register: mangle_result
+  changed_when: "'added' in mangle_result.stdout"
   tags: [outbound]

-- name: Mark outbound validator UDP dynamic port traffic
-  ansible.builtin.iptables:
-    table: mangle
-    chain: PREROUTING
-    protocol: udp
-    source: "{{ kind_network }}"
-    source_port: "{{ dynamic_port_range_start }}:{{ dynamic_port_range_end }}"
-    jump: MARK
-    set_mark: "{{ fwmark }}"
-  tags: [outbound]
-
-- name: Mark outbound validator TCP gossip traffic
-  ansible.builtin.iptables:
-    table: mangle
-    chain: PREROUTING
-    protocol: tcp
-    source: "{{ kind_network }}"
-    source_port: "{{ gossip_port }}"
-    jump: MARK
-    set_mark: "{{ fwmark }}"
+- name: Show mangle result
+  ansible.builtin.debug:
+    var: mangle_result.stdout_lines
   tags: [outbound]

 - name: SNAT marked traffic to Ashburn IP (before Docker MASQUERADE)

@ -337,7 +326,7 @@
       nat_rules: "{{ nat_rules.stdout_lines }}"
       mangle_rules: "{{ mangle_rules.stdout_lines | default([]) }}"
       routing: "{{ routing_info.stdout_lines | default([]) }}"
-      loopback: "{{ lo_addrs.stdout_lines }}"
+      loopback: "{{ lo_addrs.stdout_lines | default([]) }}"
   tags: [inbound, outbound]

 - name: Summary

@ -1,14 +1,19 @@
 ---
-# Configure laconic-mia-sw01 for outbound validator traffic redirect
+# Configure laconic-mia-sw01 for validator traffic relay (inbound + outbound)
 #
-# Redirects outbound traffic from biscayne (src 137.239.194.65) arriving
-# via the doublezero0 GRE tunnel to was-sw01 via the backbone, preventing
-# BCP38 drops at mia-sw01's ISP uplink.
+# Outbound: Redirects outbound traffic from biscayne (src 137.239.194.65)
+# arriving via the doublezero0 GRE tunnel to was-sw01 via the backbone,
+# preventing BCP38 drops at mia-sw01's ISP uplink.
+#
+# Inbound: Routes traffic destined to 137.239.194.65 from the default VRF
+# to biscayne via Tunnel500 in vrf1. Without this, mia-sw01 sends
+# 137.239.194.65 out the ISP uplink back to was-sw01 (routing loop).
 #
 # Approach: The existing per-tunnel ACL (SEC-USER-500-IN) controls what
 # traffic enters vrf1 from Tunnel500. We add 137.239.194.65 to the ACL
 # and add a default route in vrf1 via egress-vrf default pointing to
-# was-sw01's backbone IP. No PBR needed — the ACL is the filter.
+# was-sw01's backbone IP. For inbound, an inter-VRF static route in the
+# default VRF forwards 137.239.194.65/32 to biscayne via Tunnel500.
 #
 # The other vrf1 tunnels (502, 504, 505) have their own ACLs that only
 # permit their specific source IPs, so the default route won't affect them.

@ -39,6 +44,7 @@
     tunnel_interface: Tunnel500
     tunnel_vrf: vrf1
     tunnel_acl: SEC-USER-500-IN
+    tunnel_nexthop: 169.254.7.7  # biscayne's end of the Tunnel500 /31
     backbone_interface: Ethernet4/1
     session_name: validator-outbound
     checkpoint_name: pre-validator-outbound

@ -117,6 +123,7 @@
         - "show ip route vrf {{ tunnel_vrf }} 0.0.0.0/0"
         - "show ip route vrf {{ tunnel_vrf }} {{ backbone_peer }}"
         - "show ip route {{ backbone_peer }}"
+        - "show ip route {{ ashburn_ip }}"
     register: vrf_routing
     tags: [preflight]

@ -163,6 +170,11 @@
       # Default route in vrf1 via backbone to was-sw01 (egress-vrf default)
       # Safe because per-tunnel ACLs already restrict what enters vrf1
       - command: "ip route vrf {{ tunnel_vrf }} 0.0.0.0/0 egress-vrf default {{ backbone_interface }} {{ backbone_peer }}"
+      # Inbound: route traffic for ashburn IP from default VRF to biscayne via tunnel.
+      # Without this, mia-sw01 sends 137.239.194.65 out the ISP uplink → routing loop.
+      # NOTE: nexthop only, no interface — EOS silently drops cross-VRF routes that
+      # specify a tunnel interface (accepts in config but never installs in RIB).
+      - command: "ip route {{ ashburn_ip }}/32 egress-vrf {{ tunnel_vrf }} {{ tunnel_nexthop }}"
|
||||||
|
|
||||||
- name: Show session diff
|
- name: Show session diff
|
||||||
arista.eos.eos_command:
|
arista.eos.eos_command:
|
||||||
|
|
@ -189,6 +201,7 @@
|
||||||
commands:
|
commands:
|
||||||
- "show running-config | section ip access-list {{ tunnel_acl }}"
|
- "show running-config | section ip access-list {{ tunnel_acl }}"
|
||||||
- "show ip route vrf {{ tunnel_vrf }} 0.0.0.0/0"
|
- "show ip route vrf {{ tunnel_vrf }} 0.0.0.0/0"
|
||||||
|
- "show ip route {{ ashburn_ip }}"
|
||||||
register: verify
|
register: verify
|
||||||
|
|
||||||
- name: Display verification
|
- name: Display verification
|
||||||
|
|
@ -205,6 +218,7 @@
|
||||||
Changes applied:
|
Changes applied:
|
||||||
1. ACL {{ tunnel_acl }}: added "45 permit ip host {{ ashburn_ip }} any"
|
1. ACL {{ tunnel_acl }}: added "45 permit ip host {{ ashburn_ip }} any"
|
||||||
2. Default route in {{ tunnel_vrf }}: 0.0.0.0/0 egress-vrf default {{ backbone_interface }} {{ backbone_peer }}
|
2. Default route in {{ tunnel_vrf }}: 0.0.0.0/0 egress-vrf default {{ backbone_interface }} {{ backbone_peer }}
|
||||||
|
3. Inbound route: {{ ashburn_ip }}/32 egress-vrf {{ tunnel_vrf }} {{ tunnel_nexthop }}
|
||||||
|
|
||||||
The config will auto-revert in 5 minutes unless committed.
|
The config will auto-revert in 5 minutes unless committed.
|
||||||
Verify on the switch, then commit:
|
Verify on the switch, then commit:
|
||||||
|
|
|
||||||
|
|
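Taken together, the session commands in this playbook leave mia-sw01 with configuration equivalent to the following fragment (a sketch with the playbook's variables substituted where their values are known; `{{ backbone_peer }}` is left symbolic because its value is defined elsewhere in the playbook):

```
ip access-list SEC-USER-500-IN
   45 permit ip host 137.239.194.65 any
!
! Outbound: vrf1 default route leaks into the default VRF toward was-sw01
ip route vrf vrf1 0.0.0.0/0 egress-vrf default Ethernet4/1 {{ backbone_peer }}
!
! Inbound: nexthop only. Per the playbook's note, adding "Tunnel500" here
! makes EOS silently drop the route (accepted in config, never in the RIB).
ip route 137.239.194.65/32 egress-vrf vrf1 169.254.7.7
```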
@ -1,15 +1,20 @@
---
# Configure laconic-was-sw01 for inbound validator traffic relay
#
# Routes all traffic destined to 137.239.194.65 to mia-sw01 via backbone.
# A single static route replaces the previous Loopback101 + PBR approach.
#
# 137.239.194.65 is already routed to was-sw01 by its covering prefix
# (advertised via IS-IS on Loopback100). No loopback needed — the static
# route forwards traffic before the switch tries to deliver it locally.
#
# This playbook also removes the old PBR config if present (Loopback101,
# VALIDATOR-RELAY-ACL, VALIDATOR-RELAY-CLASS, VALIDATOR-RELAY policy-map,
# service-policy on Et1/1).
#
# Usage:
#   ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-was-sw01.yml
#   ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-was-sw01.yml -e apply=true
#   ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-was-sw01.yml -e commit=true
#   ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-was-sw01.yml -e rollback=true

@ -19,10 +24,11 @@
  vars:
    ashburn_ip: 137.239.194.65
    apply: false
    commit: false
    rollback: false
    session_name: validator-relay-v2
    checkpoint_name: pre-validator-relay-v2

  tasks:
    # ------------------------------------------------------------------

@ -66,77 +72,78 @@
      ansible.builtin.meta: end_play

    # ------------------------------------------------------------------
    # Pre-flight checks
    # ------------------------------------------------------------------
    - name: Show current Et1/1 config
      arista.eos.eos_command:
        commands:
          - show running-config interfaces Ethernet1/1
      register: et1_config
      tags: [preflight]

    - name: Display Et1/1 config
      ansible.builtin.debug:
        var: et1_config.stdout_lines
      tags: [preflight]

    - name: Check for existing Loopback101 and PBR
      arista.eos.eos_command:
        commands:
          - "show running-config interfaces Loopback101"
          - "show running-config | include service-policy"
          - "show running-config section policy-map type pbr"
          - "show ip route {{ ashburn_ip }}"
      register: existing_config
      tags: [preflight]

    - name: Display existing config
      ansible.builtin.debug:
        var: existing_config.stdout_lines
      tags: [preflight]

    - name: Pre-flight summary
      when: not (apply | bool)
      ansible.builtin.debug:
        msg: |
          === Pre-flight complete ===
          Review the output above:
          1. Does Loopback101 exist with {{ ashburn_ip }}? (will be removed)
          2. Is service-policy VALIDATOR-RELAY on Et1/1? (will be removed)
          3. Current route for {{ ashburn_ip }}

          To apply config:
            ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-was-sw01.yml \
              -e apply=true
      tags: [preflight]

    - name: End play if not applying
      when: not (apply | bool)
      ansible.builtin.meta: end_play

    # ------------------------------------------------------------------
    # Apply config via session with 5-minute auto-revert
    # ------------------------------------------------------------------
    - name: Save checkpoint
      arista.eos.eos_command:
        commands:
          - "configure checkpoint save {{ checkpoint_name }}"

    - name: Apply config session
      arista.eos.eos_command:
        commands:
          - command: "configure session {{ session_name }}"
          # Remove old PBR service-policy from Et1/1
          - command: interface Ethernet1/1
          - command: no service-policy type pbr input VALIDATOR-RELAY
          - command: exit
          # Remove old PBR policy-map, class-map, ACL
          - command: no policy-map type pbr VALIDATOR-RELAY
          - command: no class-map type pbr match-any VALIDATOR-RELAY-CLASS
          - command: no ip access-list VALIDATOR-RELAY-ACL
          # Remove Loopback101
          - command: no interface Loopback101
          # Add static route to forward all traffic for ashburn IP to mia-sw01
          - command: "ip route {{ ashburn_ip }}/32 {{ backbone_peer }}"

    - name: Show session diff
      arista.eos.eos_command:

@ -154,32 +161,20 @@
      arista.eos.eos_command:
        commands:
          - "configure session {{ session_name }} commit timer 00:05:00"

    # ------------------------------------------------------------------
    # Verify
    # ------------------------------------------------------------------
    - name: Verify config
      arista.eos.eos_command:
        commands:
          - "show ip route {{ ashburn_ip }}"
          - show running-config interfaces Ethernet1/1
      register: verify

    - name: Display verification
      ansible.builtin.debug:
        var: verify.stdout_lines

    - name: Reminder
      ansible.builtin.debug:

@ -188,8 +183,12 @@
          Session: {{ session_name }}
          Checkpoint: {{ checkpoint_name }}

          Changes applied:
          1. Removed: Loopback101, VALIDATOR-RELAY PBR (ACL, class-map, policy-map, service-policy)
          2. Added: ip route {{ ashburn_ip }}/32 {{ backbone_peer }}

          The config will auto-revert in 5 minutes unless committed.
          Verify on the switch, then commit:
            configure session {{ session_name }} commit
            write memory
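The same PBR removal plus single static route can be exercised by hand in an EOS config session; this sketch strings together the exact commands the playbook issues, with the commit-timer safety net it relies on (`{{ backbone_peer }}` left symbolic, as above):

```
configure session validator-relay-v2
   interface Ethernet1/1
      no service-policy type pbr input VALIDATOR-RELAY
   no policy-map type pbr VALIDATOR-RELAY
   no class-map type pbr match-any VALIDATOR-RELAY-CLASS
   no ip access-list VALIDATOR-RELAY-ACL
   no interface Loopback101
   ip route 137.239.194.65/32 {{ backbone_peer }}
   abort
configure session validator-relay-v2 commit timer 00:05:00
! verify "show ip route 137.239.194.65", then finalize:
configure session validator-relay-v2 commit
write memory
```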
@ -0,0 +1,107 @@
---
# Configure biscayne OS-level services for agave validator
#
# Installs a systemd unit that formats and mounts the ramdisk on boot.
# /dev/ram0 loses its filesystem on reboot, so mkfs.xfs must run before
# the fstab mount. This unit runs before docker, ensuring the kind node's
# bind mounts always see the ramdisk.
#
# This playbook is idempotent — safe to run multiple times.
#
# Usage:
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-boot.yml
#
- name: Configure OS-level services for agave
  hosts: all
  gather_facts: false
  become: true
  vars:
    ramdisk_device: /dev/ram0
    ramdisk_mount: /srv/solana/ramdisk
    accounts_dir: /srv/solana/ramdisk/accounts

  tasks:
    - name: Install ramdisk format service
      copy:
        dest: /etc/systemd/system/format-ramdisk.service
        mode: "0644"
        content: |
          [Unit]
          Description=Format /dev/ram0 as XFS for Solana accounts
          DefaultDependencies=no
          Before=local-fs.target
          After=systemd-modules-load.service
          ConditionPathExists={{ ramdisk_device }}

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/sbin/mkfs.xfs -f {{ ramdisk_device }}

          [Install]
          WantedBy=local-fs.target
      register: unit_file

    - name: Install ramdisk post-mount service
      copy:
        dest: /etc/systemd/system/ramdisk-accounts.service
        mode: "0644"
        content: |
          [Unit]
          Description=Create Solana accounts directory on ramdisk
          After=srv-solana-ramdisk.mount
          Requires=srv-solana-ramdisk.mount

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/bin/bash -c 'mkdir -p {{ accounts_dir }} && chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}'

          [Install]
          WantedBy=multi-user.target
      register: accounts_unit

    - name: Ensure fstab entry uses nofail
      lineinfile:
        path: /etc/fstab
        regexp: '^{{ ramdisk_device }}\s+{{ ramdisk_mount }}'
        line: '{{ ramdisk_device }} {{ ramdisk_mount }} xfs noatime,nodiratime,nofail,x-systemd.requires=format-ramdisk.service 0 0'
      register: fstab_entry

    - name: Reload systemd
      systemd:
        daemon_reload: true
      when: unit_file.changed or accounts_unit.changed or fstab_entry.changed

    - name: Enable ramdisk services
      systemd:
        name: "{{ item }}"
        enabled: true
      loop:
        - format-ramdisk.service
        - ramdisk-accounts.service

    # ---- apply now if ramdisk not mounted ------------------------------------
    - name: Check if ramdisk is mounted
      command: mountpoint -q {{ ramdisk_mount }}
      register: ramdisk_mounted
      failed_when: false
      changed_when: false

    - name: Format and mount ramdisk now
      shell: |
        mkfs.xfs -f {{ ramdisk_device }}
        mount {{ ramdisk_mount }}
        mkdir -p {{ accounts_dir }}
        chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}
      when: ramdisk_mounted.rc != 0

    # ---- verify --------------------------------------------------------------
    - name: Verify ramdisk
      command: df -hT {{ ramdisk_mount }}
      register: ramdisk_df
      changed_when: false

    - name: Show ramdisk status
      debug:
        msg: "{{ ramdisk_df.stdout_lines }}"
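The `lineinfile` regexp in the fstab task anchors on device and mount point, so reruns update the mount options in place instead of appending duplicate entries. A quick sketch of the match semantics (the sample fstab lines are hypothetical):

```python
import re

# Same pattern the lineinfile task uses, with the playbook variables substituted.
pattern = re.compile(r'^/dev/ram0\s+/srv/solana/ramdisk')

old_entry = "/dev/ram0 /srv/solana/ramdisk xfs noatime 0 0"
other = "/dev/sda1 / ext4 defaults 0 1"

# An existing ramdisk entry matches and would be rewritten in place...
assert pattern.match(old_entry)
# ...while unrelated fstab lines are left alone.
assert not pattern.match(other)
```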
@ -0,0 +1,220 @@
---
# Recover agave validator from any state to healthy
#
# This playbook is idempotent — it assesses current state and picks up
# from wherever the system is. Each step checks its precondition and
# skips if already satisfied.
#
# Steps:
#   1. Scale deployment to 0
#   2. Wait for pods to terminate
#   3. Wipe accounts ramdisk
#   4. Clean old snapshots
#   5. Download fresh snapshot via aria2c
#   6. Verify snapshot accessible via PV (kubectl)
#   7. Scale deployment to 1
#   8. Wait for pod Running
#   9. Verify validator log shows snapshot unpacking
#   10. Check RPC health
#
# Usage:
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-recover.yml
#
#   # Pass extra args to snapshot-download.py
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-recover.yml \
#     -e 'snapshot_args=--version 2.2'
#
- name: Recover agave validator
  hosts: all
  gather_facts: false
  environment:
    KUBECONFIG: /home/rix/.kube/config
  vars:
    kind_cluster: laconic-70ce4c4b47e23b85
    k8s_namespace: "laconic-{{ kind_cluster }}"
    deployment_name: "{{ kind_cluster }}-deployment"
    snapshot_dir: /srv/solana/snapshots
    accounts_dir: /srv/solana/ramdisk/accounts
    ramdisk_mount: /srv/solana/ramdisk
    ramdisk_device: /dev/ram0
    snapshot_script_local: "{{ playbook_dir }}/../scripts/snapshot-download.py"
    snapshot_script: /tmp/snapshot-download.py
    snapshot_args: ""
    # Mainnet RPC for slot comparison
    mainnet_rpc: https://api.mainnet-beta.solana.com
    # Maximum slots behind before snapshot is considered stale
    max_slot_lag: 20000

  tasks:
    # ---- step 1: scale to 0 ---------------------------------------------------
    - name: Get current replica count
      command: >
        kubectl get deployment {{ deployment_name }}
        -n {{ k8s_namespace }}
        -o jsonpath='{.spec.replicas}'
      register: current_replicas
      failed_when: false
      changed_when: false

    - name: Scale deployment to 0
      command: >
        kubectl scale deployment {{ deployment_name }}
        -n {{ k8s_namespace }} --replicas=0
      when: current_replicas.stdout | default('0') | int > 0
      changed_when: true

    # ---- step 2: wait for pods to terminate ------------------------------------
    - name: Wait for pods to terminate
      command: >
        kubectl get pods -n {{ k8s_namespace }}
        -l app={{ deployment_name }}
        -o jsonpath='{.items}'
      register: pods_remaining
      retries: 60
      delay: 5
      until: pods_remaining.stdout == "[]" or pods_remaining.stdout == ""
      changed_when: false
      when: current_replicas.stdout | default('0') | int > 0

    - name: Verify no agave processes in kind node (io_uring safety check)
      command: >
        docker exec {{ kind_cluster }}-control-plane
        pgrep -c agave-validator
      register: agave_procs
      failed_when: false
      changed_when: false

    - name: Fail if agave zombie detected
      ansible.builtin.fail:
        msg: >-
          agave-validator process still running inside kind node after pod
          termination. This is the io_uring/ZFS deadlock. Do NOT proceed —
          host reboot required. See CLAUDE.md.
      when: agave_procs.rc == 0

    # ---- step 3: wipe accounts ramdisk -----------------------------------------
    # Cannot umount+mkfs because the kind node's bind mount holds it open.
    # Instead, delete contents. This is sufficient — agave starts clean.
    - name: Wipe accounts data
      ansible.builtin.shell: |
        rm -rf {{ accounts_dir }}/*
        chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}
      become: true
      changed_when: true

    # ---- step 4: clean old snapshots -------------------------------------------
    - name: Remove all old snapshots
      ansible.builtin.shell: rm -f {{ snapshot_dir }}/*.tar.* {{ snapshot_dir }}/*.tar
      become: true
      changed_when: true

    # ---- step 5: download fresh snapshot ---------------------------------------
    - name: Verify aria2c installed
      command: which aria2c
      changed_when: false

    - name: Copy snapshot script to remote
      ansible.builtin.copy:
        src: "{{ snapshot_script_local }}"
        dest: "{{ snapshot_script }}"
        mode: "0755"

    - name: Download snapshot and scale to 1
      ansible.builtin.shell: |
        python3 {{ snapshot_script }} \
          -o {{ snapshot_dir }} \
          --max-snapshot-age {{ max_slot_lag }} \
          --max-latency 500 \
          {{ snapshot_args }} \
        && KUBECONFIG=/home/rix/.kube/config kubectl scale deployment \
          {{ deployment_name }} -n {{ k8s_namespace }} --replicas=1
      become: true
      register: snapshot_result
      timeout: 3600
      changed_when: true

    # ---- step 6: verify snapshot accessible via PV -----------------------------
    - name: Get snapshot filename
      ansible.builtin.shell: ls -1 {{ snapshot_dir }}/snapshot-*.tar.* | head -1 | xargs basename
      register: snapshot_filename
      changed_when: false

    - name: Extract snapshot slot from filename
      ansible.builtin.set_fact:
        snapshot_slot: "{{ snapshot_filename.stdout | regex_search('snapshot-([0-9]+)-', '\\1') | first }}"

    - name: Get current mainnet slot
      ansible.builtin.uri:
        url: "{{ mainnet_rpc }}"
        method: POST
        body_format: json
        body:
          jsonrpc: "2.0"
          id: 1
          method: getSlot
          params:
            - commitment: finalized
        return_content: true
      register: mainnet_slot_response

    - name: Check snapshot freshness
      ansible.builtin.fail:
        msg: >-
          Snapshot too old: slot {{ snapshot_slot }}, mainnet at
          {{ mainnet_slot_response.json.result }},
          {{ mainnet_slot_response.json.result | int - snapshot_slot | int }} slots behind
          (max {{ max_slot_lag }}).
      when: (mainnet_slot_response.json.result | int - snapshot_slot | int) > max_slot_lag

    - name: Report snapshot freshness
      ansible.builtin.debug:
        msg: >-
          Snapshot slot {{ snapshot_slot }}, mainnet {{ mainnet_slot_response.json.result }},
          {{ mainnet_slot_response.json.result | int - snapshot_slot | int }} slots behind.

    # ---- step 7: scale already done in download step above ----------------------

    # ---- step 8: wait for pod running ------------------------------------------
    - name: Wait for pod to be running
      command: >
        kubectl get pods -n {{ k8s_namespace }}
        -l app={{ deployment_name }}
        -o jsonpath='{.items[0].status.phase}'
      register: pod_status
      retries: 60
      delay: 10
      until: pod_status.stdout == "Running"
      changed_when: false

    # ---- step 9: verify validator log ------------------------------------------
    - name: Wait for validator log file
      command: >
        kubectl exec -n {{ k8s_namespace }}
        deployment/{{ deployment_name }}
        -c agave-validator -- test -f /data/log/validator.log
      register: log_file_check
      retries: 12
      delay: 10
      until: log_file_check.rc == 0
      changed_when: false

    # ---- step 10: check RPC health ---------------------------------------------
    - name: Check RPC health (non-blocking)
      ansible.builtin.uri:
        url: http://{{ inventory_hostname }}:8899/health
        return_content: true
      register: rpc_health
      retries: 6
      delay: 30
      until: rpc_health.status == 200
      failed_when: false

    - name: Report final status
      ansible.builtin.debug:
        msg: >-
          Recovery complete.
          Snapshot: slot {{ snapshot_slot }}
          ({{ mainnet_slot_response.json.result | int - snapshot_slot | int }} slots behind).
          Pod: {{ pod_status.stdout }}.
          Log: {{ 'writing' if log_file_check.rc == 0 else 'not yet' }}.
          RPC: {{ rpc_health.content | default('not yet responding — still catching up') }}.
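The freshness gate in this playbook boils down to a regex capture on the snapshot filename plus a slot subtraction against mainnet. A minimal sketch of the same logic (the filename and slot numbers below are made up for illustration):

```python
import re

def snapshot_slot(filename: str) -> int:
    # Mirrors the playbook's regex_search('snapshot-([0-9]+)-', '\\1')
    return int(re.search(r"snapshot-([0-9]+)-", filename).group(1))

def is_fresh(filename: str, mainnet_slot: int, max_slot_lag: int = 20000) -> bool:
    # The run fails when the snapshot trails mainnet by more than max_slot_lag.
    return mainnet_slot - snapshot_slot(filename) <= max_slot_lag

# Hypothetical example: snapshot at slot 250000000
name = "snapshot-250000000-AbCdEf.tar.zst"
assert snapshot_slot(name) == 250000000
assert is_fresh(name, 250015000)      # 15000 slots behind: within the limit
assert not is_fresh(name, 250025000)  # 25000 slots behind: stale
```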
@ -0,0 +1,321 @@
---
# Redeploy agave-stack on biscayne with aria2c snapshot pre-download
#
# The validator's built-in downloader fetches snapshots at ~18 MB/s (single
# connection). snapshot-download.py uses aria2c with 16 parallel connections to
# saturate available bandwidth, cutting 90+ min downloads to ~10 min.
#
# Flow:
#   1. [teardown] Delete k8s namespace (preserve kind cluster)
#   2. [wipe]     Conditionally clear ledger / accounts / old snapshots
#   3. [deploy]   laconic-so deployment start, then immediately scale to 0
#   4. [snapshot] Download snapshot via aria2c to host bind mount
#   5. [snapshot] Verify snapshot visible inside kind node
#   6. [deploy]   Scale validator back to 1
#   7. [verify]   Wait for pod Running, check logs + RPC health
#
# The validator cannot run during snapshot download — it would lock/use the
# snapshot files. laconic-so creates the cluster AND deploys the pod in one
# shot, so we scale to 0 immediately after deploy, download, then scale to 1.
#
# Usage:
#   # Standard redeploy (download snapshot, preserve accounts + ledger)
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-redeploy.yml
#
#   # Full wipe (accounts + ledger) — slow rebuild
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-redeploy.yml \
#     -e wipe_accounts=true -e wipe_ledger=true
#
#   # Skip snapshot download (use existing)
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-redeploy.yml \
#     -e skip_snapshot=true
#
#   # Pass extra args to snapshot-download.py
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-redeploy.yml \
#     -e 'snapshot_args=--version 2.2 --min-download-speed 50'
#
#   # Snapshot only (no teardown/deploy)
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-redeploy.yml \
#     --tags snapshot
#
- name: Redeploy agave validator on biscayne
  hosts: all
  gather_facts: false
  environment:
    KUBECONFIG: /home/rix/.kube/config
  vars:
    deployment_dir: /srv/deployments/agave
    laconic_so: /home/rix/.local/bin/laconic-so
    kind_cluster: laconic-70ce4c4b47e23b85
    k8s_namespace: "laconic-{{ kind_cluster }}"
    deployment_name: "{{ kind_cluster }}-deployment"
    snapshot_dir: /srv/solana/snapshots
    ledger_dir: /srv/solana/ledger
    accounts_dir: /srv/solana/ramdisk/accounts
    ramdisk_mount: /srv/solana/ramdisk
    ramdisk_device: /dev/ram0
    snapshot_script_local: "{{ playbook_dir }}/../scripts/snapshot-download.py"
    snapshot_script: /tmp/snapshot-download.py
    # Flags — non-destructive by default
    wipe_accounts: false
    wipe_ledger: false
    skip_snapshot: false
    snapshot_args: ""

  tasks:
    # ---- teardown: graceful stop, then delete namespace ----------------------
    #
    # IMPORTANT: Scale to 0 first, wait for agave to exit cleanly.
    # Deleting the namespace while agave is running causes io_uring/ZFS
    # deadlock (unkillable D-state threads). See CLAUDE.md.
    - name: Scale deployment to 0 (graceful stop)
      command: >
        kubectl scale deployment {{ deployment_name }}
        -n {{ k8s_namespace }} --replicas=0
      register: pre_teardown_scale
      failed_when: false
      tags: [teardown]

    - name: Wait for agave to exit
      command: >
        kubectl get pods -n {{ k8s_namespace }}
        -l app={{ deployment_name }}
        -o jsonpath='{.items}'
      register: pre_teardown_pods
      retries: 60
      delay: 5
      until: pre_teardown_pods.stdout == "[]" or pre_teardown_pods.stdout == "" or pre_teardown_pods.rc != 0
      failed_when: false
      when: pre_teardown_scale.rc == 0
      tags: [teardown]

    - name: Delete deployment namespace
      command: >
        kubectl delete namespace {{ k8s_namespace }} --timeout=120s
      register: ns_delete
      failed_when: false
      tags: [teardown]

    - name: Wait for namespace to terminate
      command: >
        kubectl get namespace {{ k8s_namespace }}
        -o jsonpath='{.status.phase}'
      register: ns_status
      retries: 30
      delay: 5
      until: ns_status.rc != 0
      failed_when: false
      when: ns_delete.rc == 0
      tags: [teardown]

    # ---- wipe: opt-in data cleanup ------------------------------------------
    - name: Wipe ledger data
      shell: rm -rf {{ ledger_dir }}/*
      become: true
      when: wipe_ledger | bool
      tags: [wipe]

    - name: Wipe accounts ramdisk (umount + mkfs.xfs + mount)
      shell: |
        mountpoint -q {{ ramdisk_mount }} && umount {{ ramdisk_mount }} || true
        mkfs.xfs -f {{ ramdisk_device }}
        mount {{ ramdisk_mount }}
        mkdir -p {{ accounts_dir }}
        chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}
      become: true
      when: wipe_accounts | bool
      tags: [wipe]

    - name: Clean old snapshots (keep newest full + incremental)
      shell: |
        cd {{ snapshot_dir }} || exit 0
        newest=$(ls -t snapshot-*.tar.* 2>/dev/null | head -1)
        if [ -n "$newest" ]; then
          newest_inc=$(ls -t incremental-snapshot-*.tar.* 2>/dev/null | head -1)
          find . -maxdepth 1 -name '*.tar.*' \
            ! -name "$newest" \
            ! -name "${newest_inc:-__none__}" \
            -delete
        fi
      become: true
      when: not skip_snapshot | bool
|
||||||
|
tags: [wipe]
|
||||||
|
|
||||||
|
# ---- preflight: verify ramdisk and mounts before deploy ------------------
|
||||||
|
- name: Verify ramdisk is mounted
|
||||||
|
command: mountpoint -q {{ ramdisk_mount }}
|
||||||
|
register: ramdisk_check
|
||||||
|
failed_when: ramdisk_check.rc != 0
|
||||||
|
changed_when: false
|
||||||
|
tags: [deploy, preflight]
|
||||||
|
|
||||||
|
- name: Verify ramdisk is xfs (not the underlying ZFS)
|
||||||
|
shell: df -T {{ ramdisk_mount }} | grep -q xfs
|
||||||
|
register: ramdisk_type
|
||||||
|
failed_when: ramdisk_type.rc != 0
|
||||||
|
changed_when: false
|
||||||
|
tags: [deploy, preflight]
|
||||||
|
|
||||||
|
- name: Verify ramdisk visible inside kind node
|
||||||
|
shell: >
|
||||||
|
docker exec {{ kind_cluster }}-control-plane
|
||||||
|
df -T /mnt/solana/ramdisk 2>/dev/null | grep -q xfs
|
||||||
|
register: kind_ramdisk_check
|
||||||
|
failed_when: kind_ramdisk_check.rc != 0
|
||||||
|
changed_when: false
|
||||||
|
tags: [deploy, preflight]
|
||||||
|
|
||||||
|
# ---- deploy: bring up cluster, scale to 0 immediately -------------------
|
||||||
|
- name: Verify kind-config.yml has unified mount root
|
||||||
|
command: "grep -c 'containerPath: /mnt$' {{ deployment_dir }}/kind-config.yml"
|
||||||
|
register: mount_root_check
|
||||||
|
failed_when: mount_root_check.stdout | int < 1
|
||||||
|
tags: [deploy]
|
||||||
|
|
||||||
|
- name: Start deployment (creates kind cluster + deploys pod)
|
||||||
|
command: "{{ laconic_so }} deployment --dir {{ deployment_dir }} start"
|
||||||
|
timeout: 1200
|
||||||
|
tags: [deploy]
|
||||||
|
|
||||||
|
- name: Wait for deployment to exist
|
||||||
|
command: >
|
||||||
|
kubectl get deployment {{ deployment_name }}
|
||||||
|
-n {{ k8s_namespace }}
|
||||||
|
-o jsonpath='{.metadata.name}'
|
||||||
|
register: deploy_exists
|
||||||
|
retries: 30
|
||||||
|
delay: 10
|
||||||
|
until: deploy_exists.rc == 0
|
||||||
|
tags: [deploy]
|
||||||
|
|
||||||
|
- name: Scale validator to 0 (stop before snapshot download)
|
||||||
|
command: >
|
||||||
|
kubectl scale deployment {{ deployment_name }}
|
||||||
|
-n {{ k8s_namespace }} --replicas=0
|
||||||
|
tags: [deploy]
|
||||||
|
|
||||||
|
- name: Wait for pods to terminate
|
||||||
|
command: >
|
||||||
|
kubectl get pods -n {{ k8s_namespace }}
|
||||||
|
-l app={{ deployment_name }}
|
||||||
|
-o jsonpath='{.items}'
|
||||||
|
register: pods_gone
|
||||||
|
retries: 30
|
||||||
|
delay: 5
|
||||||
|
until: pods_gone.stdout == "[]" or pods_gone.stdout == ""
|
||||||
|
failed_when: false
|
||||||
|
tags: [deploy]
|
||||||
|
|
||||||
|
# ---- snapshot: download via aria2c, verify in kind node ------------------
|
||||||
|
- name: Verify aria2c installed
|
||||||
|
command: which aria2c
|
||||||
|
changed_when: false
|
||||||
|
when: not skip_snapshot | bool
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
- name: Copy snapshot script to remote
|
||||||
|
copy:
|
||||||
|
src: "{{ snapshot_script_local }}"
|
||||||
|
dest: "{{ snapshot_script }}"
|
||||||
|
mode: "0755"
|
||||||
|
when: not skip_snapshot | bool
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
- name: Verify kind node mounts
|
||||||
|
command: >
|
||||||
|
docker exec {{ kind_cluster }}-control-plane
|
||||||
|
ls /mnt/solana/snapshots/
|
||||||
|
register: kind_mount_check
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
- name: Download snapshot via aria2c
|
||||||
|
shell: >
|
||||||
|
python3 {{ snapshot_script }}
|
||||||
|
-o {{ snapshot_dir }}
|
||||||
|
{{ snapshot_args }}
|
||||||
|
become: true
|
||||||
|
register: snapshot_result
|
||||||
|
when: not skip_snapshot | bool
|
||||||
|
timeout: 3600
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
- name: Show snapshot download result
|
||||||
|
debug:
|
||||||
|
msg: "{{ snapshot_result.stdout_lines | default(['skipped']) }}"
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
- name: Verify snapshot visible inside kind node
|
||||||
|
shell: >
|
||||||
|
docker exec {{ kind_cluster }}-control-plane
|
||||||
|
ls -lhS /mnt/solana/snapshots/*.tar.* 2>/dev/null | head -5
|
||||||
|
register: kind_snapshot_check
|
||||||
|
failed_when: kind_snapshot_check.stdout == ""
|
||||||
|
when: not skip_snapshot | bool
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
- name: Show snapshot files in kind node
|
||||||
|
debug:
|
||||||
|
msg: "{{ kind_snapshot_check.stdout_lines | default(['skipped']) }}"
|
||||||
|
when: not skip_snapshot | bool
|
||||||
|
tags: [snapshot]
|
||||||
|
|
||||||
|
# ---- deploy (cont): scale validator back up with snapshot ----------------
|
||||||
|
- name: Scale validator to 1 (start with downloaded snapshot)
|
||||||
|
command: >
|
||||||
|
kubectl scale deployment {{ deployment_name }}
|
||||||
|
-n {{ k8s_namespace }} --replicas=1
|
||||||
|
tags: [deploy]
|
||||||
|
|
||||||
|
# ---- verify: confirm validator is running --------------------------------
|
||||||
|
- name: Wait for pod to be running
|
||||||
|
command: >
|
||||||
|
kubectl get pods -n {{ k8s_namespace }}
|
||||||
|
-o jsonpath='{.items[0].status.phase}'
|
||||||
|
register: pod_status
|
||||||
|
retries: 60
|
||||||
|
delay: 10
|
||||||
|
until: pod_status.stdout == "Running"
|
||||||
|
tags: [verify]
|
||||||
|
|
||||||
|
- name: Verify unified mount inside kind node
|
||||||
|
command: "docker exec {{ kind_cluster }}-control-plane ls /mnt/solana/"
|
||||||
|
register: mount_check
|
||||||
|
tags: [verify]
|
||||||
|
|
||||||
|
- name: Show mount contents
|
||||||
|
debug:
|
||||||
|
msg: "{{ mount_check.stdout_lines }}"
|
||||||
|
tags: [verify]
|
||||||
|
|
||||||
|
- name: Check validator log file is being written
|
||||||
|
command: >
|
||||||
|
kubectl exec -n {{ k8s_namespace }}
|
||||||
|
deployment/{{ deployment_name }}
|
||||||
|
-c agave-validator -- test -f /data/log/validator.log
|
||||||
|
retries: 12
|
||||||
|
delay: 10
|
||||||
|
until: log_file_check.rc == 0
|
||||||
|
register: log_file_check
|
||||||
|
failed_when: false
|
||||||
|
tags: [verify]
|
||||||
|
|
||||||
|
- name: Check RPC health
|
||||||
|
uri:
|
||||||
|
url: http://127.0.0.1:8899/health
|
||||||
|
return_content: true
|
||||||
|
register: rpc_health
|
||||||
|
retries: 6
|
||||||
|
delay: 10
|
||||||
|
until: rpc_health.status == 200
|
||||||
|
failed_when: false
|
||||||
|
delegate_to: "{{ inventory_hostname }}"
|
||||||
|
tags: [verify]
|
||||||
|
|
||||||
|
- name: Report status
|
||||||
|
debug:
|
||||||
|
msg: >-
|
||||||
|
Deployment complete.
|
||||||
|
Log: {{ 'writing' if log_file_check.rc == 0 else 'not yet created' }}.
|
||||||
|
RPC: {{ rpc_health.content | default('not responding') }}.
|
||||||
|
Wiped: ledger={{ wipe_ledger }}, accounts={{ wipe_accounts }}.
|
||||||
|
tags: [verify]
|
||||||
|
|
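The retention rule in the "Clean old snapshots" task above (keep the newest full snapshot plus the newest incremental, delete every other archive) can be exercised locally against scratch files. A minimal sketch; the filenames and timestamps are made up, only the `ls -t` / `find ! -name` pattern comes from the playbook:

```shell
# Stand-in archives in a scratch dir (GNU touch -d sets fake mtimes).
dir=$(mktemp -d) && cd "$dir"
touch -d '2 hours ago' snapshot-100.tar.zst
touch -d '1 hour ago' snapshot-200.tar.zst
touch -d '30 minutes ago' incremental-snapshot-200-210.tar.zst

# Same logic as the task: newest full, newest incremental, drop the rest.
# Note the glob snapshot-* does not match incremental-snapshot-*.
newest=$(ls -t snapshot-*.tar.* 2>/dev/null | head -1)
newest_inc=$(ls -t incremental-snapshot-*.tar.* 2>/dev/null | head -1)
find . -maxdepth 1 -name '*.tar.*' \
  ! -name "$newest" \
  ! -name "${newest_inc:-__none__}" \
  -delete
ls -1 | sort   # snapshot-100 is gone; the newer full + incremental remain
```

The `${newest_inc:-__none__}` fallback matters: with no incremental present, an empty `! -name ""` pattern would otherwise stop `find` from deleting anything that should go.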
@@ -0,0 +1,106 @@
---
# Graceful shutdown of agave validator on biscayne
#
# Scales the deployment to 0 and waits for the pod to terminate.
# This MUST be done before any kind node restart, host reboot,
# or docker operations.
#
# The agave validator uses io_uring for async I/O. On ZFS, killing
# the process ungracefully (SIGKILL, docker kill, etc.) can produce
# unkillable kernel threads stuck in io_wq_put_and_exit, deadlocking
# the container's PID namespace. A graceful SIGTERM via k8s scale-down
# allows agave to flush and close its io_uring contexts cleanly.
#
# Usage:
#   # Stop the validator
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-stop.yml
#
#   # Stop and restart kind node (LAST RESORT — e.g., broken namespace)
#   # Normally unnecessary: mount propagation means ramdisk/ZFS changes
#   # are visible in the kind node without restarting it.
#   ansible-playbook -i biscayne.vaasl.io, playbooks/biscayne-stop.yml \
#     -e restart_kind=true
#
- name: Graceful validator shutdown
  hosts: all
  gather_facts: false
  environment:
    KUBECONFIG: /home/rix/.kube/config
  vars:
    kind_cluster: laconic-70ce4c4b47e23b85
    k8s_namespace: "laconic-{{ kind_cluster }}"
    deployment_name: "{{ kind_cluster }}-deployment"
    restart_kind: false

  tasks:
    - name: Get current replica count
      command: >
        kubectl get deployment {{ deployment_name }}
        -n {{ k8s_namespace }}
        -o jsonpath='{.spec.replicas}'
      register: current_replicas
      failed_when: false
      changed_when: false

    - name: Scale deployment to 0
      command: >
        kubectl scale deployment {{ deployment_name }}
        -n {{ k8s_namespace }} --replicas=0
      when: current_replicas.stdout | default('0') | int > 0

    - name: Wait for pods to terminate
      command: >
        kubectl get pods -n {{ k8s_namespace }}
        -l app={{ deployment_name }}
        -o jsonpath='{.items}'
      register: pods_gone
      retries: 60
      delay: 5
      until: pods_gone.stdout == "[]" or pods_gone.stdout == ""
      when: current_replicas.stdout | default('0') | int > 0

    - name: Verify no agave processes in kind node
      command: >
        docker exec {{ kind_cluster }}-control-plane
        pgrep -c agave-validator
      register: agave_procs
      failed_when: false
      changed_when: false

    - name: Fail if agave still running
      fail:
        msg: >-
          agave-validator process still running inside kind node after
          pod termination. Do NOT restart the kind node — investigate
          first to avoid io_uring/ZFS deadlock.
      when: agave_procs.rc == 0

    - name: Report stopped
      debug:
        msg: >-
          Validator stopped. Replicas: {{ current_replicas.stdout | default('0') }} -> 0.
          No agave processes detected in kind node.
      when: not restart_kind | bool

    # ---- optional: restart kind node -----------------------------------------
    - name: Restart kind node
      command: docker restart {{ kind_cluster }}-control-plane
      timeout: 120
      when: restart_kind | bool

    - name: Wait for kind node ready
      command: >
        kubectl get node {{ kind_cluster }}-control-plane
        -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
      register: node_ready
      retries: 30
      delay: 10
      until: node_ready.stdout == "True"
      when: restart_kind | bool

    - name: Report restarted
      debug:
        msg: >-
          Kind node restarted and ready.
          Deployment at 0 replicas — scale up when ready.
      when: restart_kind | bool
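The deadlock described in the shutdown playbook's header shows up as processes stuck in state `D` (uninterruptible sleep). A quick way to look for them before touching the kind node is to scan the process table; this helper is illustrative (the `count_dstate` name is not part of the repo), but the `agave-validator` comm name is the one the playbook checks with `pgrep`:

```shell
# Count processes in uninterruptible sleep (state D) whose comm matches $1.
# A non-zero count for agave-validator means the io_uring/ZFS hazard is live:
# do NOT restart the kind node container.
count_dstate() {
  # ps -eo state=,comm= prints "<state> <comm>" for every process
  ps -eo state=,comm= | awk -v name="$1" '$1 ~ /^D/ && $2 == name { n++ } END { print n+0 }'
}
count_dstate agave-validator
```

On a healthy host this prints `0`; any other value is the signal to investigate rather than reboot.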
@@ -0,0 +1,134 @@
---
# Connect biscayne to DoubleZero multicast via laconic-mia-sw01
#
# Establishes a GRE tunnel to the nearest DZ hybrid device and subscribes
# to jito-shredstream and bebop multicast groups.
#
# Usage:
#   ansible-playbook playbooks/connect-doublezero-multicast.yml
#   ansible-playbook playbooks/connect-doublezero-multicast.yml --check  # dry-run

- name: Connect biscayne to DoubleZero multicast
  hosts: biscayne
  gather_facts: false

  vars:
    dz_multicast_groups:
      - jito-shredstream
      - bebop

  tasks:
    # ------------------------------------------------------------------
    # Pre-checks
    # ------------------------------------------------------------------
    - name: Verify doublezerod service is running
      ansible.builtin.systemd:
        name: doublezerod
        state: started
      check_mode: true
      register: dz_service
      failed_when: dz_service.status.ActiveState != "active"

    - name: Get doublezero identity address
      ansible.builtin.command:
        cmd: doublezero address
      register: dz_address
      changed_when: false

    - name: Verify doublezero identity matches expected pubkey
      ansible.builtin.assert:
        that:
          - dz_address.stdout | trim == dz_identity
        fail_msg: >-
          DZ identity mismatch: got '{{ dz_address.stdout | trim }}',
          expected '{{ dz_identity }}'

    - name: Check current DZ connection status
      ansible.builtin.command:
        cmd: "doublezero -e {{ dz_environment }} status"
      register: dz_status
      changed_when: false
      failed_when: false

    - name: Fail if already connected (tunnel is up)
      ansible.builtin.fail:
        msg: >-
          DoubleZero tunnel is already connected. To reconnect, first
          disconnect manually with: doublezero -e {{ dz_environment }} disconnect
      when: "'connected' in dz_status.stdout | lower"

    # ------------------------------------------------------------------
    # Create access pass
    # ------------------------------------------------------------------
    - name: Create DZ access pass for multicast subscriber
      ansible.builtin.command:
        cmd: >-
          doublezero -e {{ dz_environment }} access-pass set
          --accesspass-type solana-multicast-subscriber
          --client-ip {{ client_ip }}
          --user-payer {{ dz_identity }}
          --solana-validator {{ validator_identity }}
          --tenant {{ dz_tenant }}
      register: dz_access_pass
      changed_when: "'created' in dz_access_pass.stdout | lower or 'updated' in dz_access_pass.stdout | lower"

    - name: Show access pass result
      ansible.builtin.debug:
        var: dz_access_pass.stdout_lines

    # ------------------------------------------------------------------
    # Connect to DZ multicast
    # ------------------------------------------------------------------
    - name: Connect to DoubleZero multicast via {{ dz_device }}
      ansible.builtin.command:
        cmd: >-
          doublezero -e {{ dz_environment }} connect multicast
          {% for group in dz_multicast_groups %}
          --subscribe {{ group }}
          {% endfor %}
          --device {{ dz_device }}
          --client-ip {{ client_ip }}
      register: dz_connect
      changed_when: true

    - name: Show connect result
      ansible.builtin.debug:
        var: dz_connect.stdout_lines

    # ------------------------------------------------------------------
    # Post-checks
    # ------------------------------------------------------------------
    - name: Verify tunnel status is connected
      ansible.builtin.command:
        cmd: "doublezero -e {{ dz_environment }} status"
      register: dz_post_status
      changed_when: false
      failed_when: "'connected' not in dz_post_status.stdout | lower"

    - name: Show tunnel status
      ansible.builtin.debug:
        var: dz_post_status.stdout_lines

    - name: Verify routes are installed
      ansible.builtin.command:
        cmd: "doublezero -e {{ dz_environment }} routes"
      register: dz_routes
      changed_when: false

    - name: Show installed routes
      ansible.builtin.debug:
        var: dz_routes.stdout_lines

    - name: Check multicast group membership
      ansible.builtin.command:
        cmd: "doublezero -e {{ dz_environment }} status"
      register: dz_multicast_status
      changed_when: false

    - name: Connection summary
      ansible.builtin.debug:
        msg: >-
          DoubleZero multicast connected via {{ dz_device }}.
          Subscribed groups: {{ dz_multicast_groups | join(', ') }}.
          Next step: request allowlist access from group owners
          (see docs/doublezero-multicast-access.md).
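The `{% for %}` block inside the connect task's `cmd:` expands to one `--subscribe` flag per entry in `dz_multicast_groups` before the command runs. A sketch of the same expansion in plain shell (the `<dz_device>` / `<client_ip>` placeholders stand in for inventory variables):

```shell
# Mirror the Jinja loop: one --subscribe flag per multicast group.
groups="jito-shredstream bebop"
flags=""
for g in $groups; do
  flags="$flags --subscribe $g"
done
echo "doublezero connect multicast$flags --device <dz_device> --client-ip <client_ip>"
```

So with the default vars the CLI receives `--subscribe jito-shredstream --subscribe bebop`, one flag pair per group rather than a single comma-joined list.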
@@ -0,0 +1,18 @@
#!/bin/bash
# /etc/network/if-up.d/ashburn-routing
# Restore policy routing for Ashburn validator relay after reboot/interface up.
# Only act when doublezero0 comes up.

[ "$IFACE" = "doublezero0" ] || exit 0

# Ensure rt_tables entry exists
grep -q '^100 ashburn$' /etc/iproute2/rt_tables || echo "100 ashburn" >> /etc/iproute2/rt_tables

# Add policy rule (idempotent — ip rule skips duplicates silently on some kernels)
ip rule show | grep -q 'fwmark 0x64 lookup ashburn' || ip rule add fwmark 100 table ashburn

# Add default route via mia-sw01 through doublezero0 tunnel
ip route replace default via 169.254.7.6 dev doublezero0 table ashburn

# Add Ashburn IP to loopback (idempotent)
ip addr show lo | grep -q '137.239.194.65' || ip addr add 137.239.194.65/32 dev lo
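One subtlety in the script above: the rule is added with a decimal mark (`fwmark 100`), but `ip rule show` prints marks in hex, which is why the idempotency grep looks for `fwmark 0x64` (100 == 0x64). A self-contained sketch of that check; the `sample` rule listing is fabricated for illustration, not captured from mia-sw01:

```shell
# Decimal 100 renders as 0x64 in `ip rule show` output.
printf 'mark=0x%x\n' 100    # prints mark=0x64

# The same grep the if-up.d script uses, run against sample output
# instead of the live kernel, so it can be tested anywhere.
sample='0:	from all lookup local
32765:	from all fwmark 0x64 lookup ashburn
32766:	from all lookup main'
echo "$sample" | grep -q 'fwmark 0x64 lookup ashburn' && echo "rule present"
```

Grepping for the decimal form would always miss, so the script would re-add the rule on every interface bounce and accumulate duplicates on kernels that allow them.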
@@ -0,0 +1,166 @@
---
# Verify PV hostPaths match expected kind-node paths, fix if wrong.
#
# Checks each PV's hostPath against the expected path derived from the
# spec.yml volume mapping through the kind extraMounts. If any PV has a
# wrong path, fails unless -e fix=true is passed.
#
# Does NOT touch the deployment.
#
# Usage:
#   # Check only (fails if mounts are bad)
#   ansible-playbook -i biscayne.vaasl.io, playbooks/fix-pv-mounts.yml
#
#   # Fix stale PVs
#   ansible-playbook -i biscayne.vaasl.io, playbooks/fix-pv-mounts.yml -e fix=true
#
- name: Verify and fix PV mount paths
  hosts: all
  gather_facts: false
  environment:
    KUBECONFIG: /home/rix/.kube/config
  vars:
    kind_cluster: laconic-70ce4c4b47e23b85
    k8s_namespace: "laconic-{{ kind_cluster }}"
    fix: false
    volumes:
      - name: validator-snapshots
        host_path: /mnt/solana/snapshots
        capacity: 200Gi
      - name: validator-ledger
        host_path: /mnt/solana/ledger
        capacity: 2Ti
      - name: validator-accounts
        host_path: /mnt/solana/ramdisk/accounts
        capacity: 800Gi
      - name: validator-log
        host_path: /mnt/solana/log
        capacity: 10Gi

  tasks:
    - name: Read current PV hostPaths
      command: >
        kubectl get pv {{ kind_cluster }}-{{ item.name }}
        -o jsonpath='{.spec.hostPath.path}'
      register: current_paths
      loop: "{{ volumes }}"
      failed_when: false
      changed_when: false

    # Accumulate across iterations — a plain assignment here would be
    # overwritten on every loop pass and only reflect the last volume.
    - name: Build path comparison
      set_fact:
        path_mismatches: "{{ (path_mismatches | default([])) + ([item.item.name] if item.stdout != '' and item.stdout != item.item.host_path else []) }}"
        path_missing: "{{ (path_missing | default([])) + ([item.item.name] if item.stdout == '' else []) }}"
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"

    - name: Show current vs expected paths
      debug:
        msg: >-
          {{ item.item.name }}:
          current={{ item.stdout if item.stdout else 'NOT FOUND' }}
          expected={{ item.item.host_path }}
          {{ 'OK' if item.stdout == item.item.host_path else 'NEEDS FIX' }}
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"

    - name: Check for mismatched PVs
      fail:
        msg: >-
          PV {{ item.item.name }} has wrong hostPath:
          {{ item.stdout if item.stdout else 'NOT FOUND' }}
          (expected {{ item.item.host_path }}).
          Run with -e fix=true to delete and recreate.
      when: item.stdout != item.item.host_path and not fix | bool
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"

    # ---- Fix mode ---------------------------------------------------------
    - name: Delete stale PVCs
      command: >
        kubectl delete pvc {{ kind_cluster }}-{{ item.item.name }}
        -n {{ k8s_namespace }} --timeout=60s
      when: fix | bool and item.stdout != item.item.host_path
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"
      failed_when: false

    - name: Delete stale PVs
      command: >
        kubectl delete pv {{ kind_cluster }}-{{ item.item.name }}
        --timeout=60s
      when: fix | bool and item.stdout != item.item.host_path
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"
      failed_when: false

    - name: Create PVs with correct hostPaths
      command: >
        kubectl apply -f -
      args:
        stdin: |
          apiVersion: v1
          kind: PersistentVolume
          metadata:
            name: {{ kind_cluster }}-{{ item.item.name }}
          spec:
            capacity:
              storage: {{ item.item.capacity }}
            accessModes:
              - ReadWriteOnce
            persistentVolumeReclaimPolicy: Retain
            storageClassName: manual
            hostPath:
              path: {{ item.item.host_path }}
      when: fix | bool and item.stdout != item.item.host_path
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"

    - name: Create PVCs
      command: >
        kubectl apply -f -
      args:
        stdin: |
          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: {{ kind_cluster }}-{{ item.item.name }}
            namespace: {{ k8s_namespace }}
          spec:
            accessModes:
              - ReadWriteOnce
            storageClassName: manual
            volumeName: {{ kind_cluster }}-{{ item.item.name }}
            resources:
              requests:
                storage: {{ item.item.capacity }}
      when: fix | bool and item.stdout != item.item.host_path
      loop: "{{ current_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"

    # ---- Final verify -----------------------------------------------------
    - name: Verify PV paths
      command: >
        kubectl get pv {{ kind_cluster }}-{{ item.name }}
        -o jsonpath='{.spec.hostPath.path}'
      register: final_paths
      loop: "{{ volumes }}"
      changed_when: false
      when: fix | bool

    - name: Assert all PV paths correct
      assert:
        that: item.stdout == item.item.host_path
        fail_msg: "{{ item.item.name }}: {{ item.stdout }} != {{ item.item.host_path }}"
        success_msg: "{{ item.item.name }}: {{ item.stdout }} OK"
      loop: "{{ final_paths.results }}"
      loop_control:
        label: "{{ item.item.name }}"
      when: fix | bool
@@ -0,0 +1,340 @@
---
# Health check for biscayne agave-stack deployment
#
# Gathers system, validator, DoubleZero, and network status in a single run.
# All tasks are read-only — safe to run at any time.
#
# Usage:
#   ansible-playbook playbooks/health-check.yml
#   ansible-playbook playbooks/health-check.yml -t validator   # just validator checks
#   ansible-playbook playbooks/health-check.yml -t doublezero  # just DZ checks
#   ansible-playbook playbooks/health-check.yml -t network     # just network checks

- name: Biscayne agave-stack health check
  hosts: biscayne
  gather_facts: false

  tasks:
    # ------------------------------------------------------------------
    # Discover kind cluster and namespace
    # ------------------------------------------------------------------
    - name: Get kind cluster name
      ansible.builtin.command:
        cmd: kind get clusters
      register: kind_clusters
      changed_when: false
      failed_when: kind_clusters.rc != 0 or kind_clusters.stdout_lines | length == 0

    - name: Set cluster name fact
      ansible.builtin.set_fact:
        kind_cluster: "{{ kind_clusters.stdout_lines[0] }}"

    - name: Discover agave namespace
      ansible.builtin.shell:
        cmd: >-
          set -o pipefail &&
          kubectl get namespaces --no-headers -o custom-columns=':metadata.name'
          | grep '^laconic-'
        executable: /bin/bash
      register: ns_result
      changed_when: false
      failed_when: ns_result.stdout_lines | length == 0

    - name: Set namespace fact
      ansible.builtin.set_fact:
        agave_ns: "{{ ns_result.stdout_lines[0] }}"

    - name: Get pod name
      ansible.builtin.shell:
        cmd: >-
          set -o pipefail &&
          kubectl get pods -n {{ agave_ns }} --no-headers
          -o custom-columns=':metadata.name' | head -1
        executable: /bin/bash
      register: pod_result
      changed_when: false
      failed_when: pod_result.stdout | trim == ''

    - name: Set pod fact
      ansible.builtin.set_fact:
        agave_pod: "{{ pod_result.stdout | trim }}"

    - name: Show discovered resources
      ansible.builtin.debug:
        msg: "cluster={{ kind_cluster }} ns={{ agave_ns }} pod={{ agave_pod }}"

    # ------------------------------------------------------------------
    # Pod status
    # ------------------------------------------------------------------
    - name: Get pod status
      ansible.builtin.command:
        cmd: kubectl get pods -n {{ agave_ns }} -o wide
      register: pod_status
      changed_when: false
      tags: [validator]

    - name: Show pod status
      ansible.builtin.debug:
        var: pod_status.stdout_lines
      tags: [validator]

    - name: Get container restart counts
      ansible.builtin.shell:
        cmd: >-
          kubectl get pod {{ agave_pod }} -n {{ agave_ns }}
          -o jsonpath='{range .status.containerStatuses[*]}{.name}{" restarts="}{.restartCount}{" ready="}{.ready}{"\n"}{end}'
      register: restart_counts
      changed_when: false
      tags: [validator]

    - name: Show restart counts
      ansible.builtin.debug:
        var: restart_counts.stdout_lines
      tags: [validator]

    # ------------------------------------------------------------------
    # Validator sync status
    # ------------------------------------------------------------------
    - name: Get validator recent logs (replay progress)
      ansible.builtin.command:
        cmd: >-
          kubectl logs -n {{ agave_ns }} {{ agave_pod }}
          -c agave-validator --tail=30
      register: validator_logs
      changed_when: false
      tags: [validator]

    - name: Show validator logs
      ansible.builtin.debug:
        var: validator_logs.stdout_lines
      tags: [validator]

    - name: Check RPC health endpoint
      ansible.builtin.uri:
        url: http://127.0.0.1:8899/health
        method: GET
        return_content: true
        timeout: 5
      register: rpc_health
      failed_when: false
      tags: [validator]

    - name: Show RPC health
      ansible.builtin.debug:
        msg: "RPC health: {{ rpc_health.status | default('unreachable') }} — {{ rpc_health.content | default('no response') }}"
      tags: [validator]

    - name: Get validator version
      ansible.builtin.shell:
        cmd: >-
          kubectl exec -n {{ agave_ns }} {{ agave_pod }}
          -c agave-validator -- agave-validator --version 2>&1 || true
      register: validator_version
      changed_when: false
      tags: [validator]

    - name: Show validator version
      ansible.builtin.debug:
        var: validator_version.stdout
      tags: [validator]

    # ------------------------------------------------------------------
    # DoubleZero status
    # ------------------------------------------------------------------
    - name: Get host DZ identity
      ansible.builtin.command:
        cmd: sudo -u solana doublezero address
      register: dz_address
      changed_when: false
      failed_when: false
      tags: [doublezero]

    - name: Get host DZ tunnel status
      ansible.builtin.command:
        cmd: sudo -u solana doublezero -e {{ dz_environment }} status
      register: dz_status
      changed_when: false
      failed_when: false
      tags: [doublezero]

    - name: Get DZ routes
      ansible.builtin.shell:
        cmd: set -o pipefail && ip route | grep doublezero0 || echo "no doublezero0 routes"
        executable: /bin/bash
      register: dz_routes
      changed_when: false
      tags: [doublezero]

    - name: Get host doublezerod service state
      ansible.builtin.systemd:
        name: doublezerod
      register: dz_systemd_info
      failed_when: false
      check_mode: true
      tags: [doublezero]

    - name: Set DZ systemd state
      ansible.builtin.set_fact:
        dz_systemd_state: "{{ dz_systemd_info.status.ActiveState | default('unknown') }}"
      tags: [doublezero]

    - name: Get container DZ status
      ansible.builtin.shell:
        cmd: >-
          kubectl exec -n {{ agave_ns }} {{ agave_pod }}
          -c doublezerod -- doublezero status 2>&1 || echo "container DZ unavailable"
      register: dz_container_status
      changed_when: false
      tags: [doublezero]

    - name: Show DoubleZero status
      ansible.builtin.debug:
        msg:
          identity: "{{ dz_address.stdout | default('unknown') }}"
          host_tunnel: "{{ dz_status.stdout_lines | default(['unknown']) }}"
          host_systemd: "{{ dz_systemd_state }}"
          container: "{{ dz_container_status.stdout_lines | default(['unknown']) }}"
          routes: "{{ dz_routes.stdout_lines | default([]) }}"
      tags: [doublezero]

    # ------------------------------------------------------------------
    # Storage
    # ------------------------------------------------------------------
    - name: Check ramdisk usage
      ansible.builtin.command:
        cmd: df -h /srv/solana/ramdisk
|
||||||
|
register: ramdisk_df
|
||||||
|
changed_when: false
|
||||||
|
failed_when: false
|
||||||
|
tags: [storage]
|
||||||
|
|
||||||
|
- name: Check ZFS dataset usage
|
||||||
|
ansible.builtin.command:
|
||||||
|
cmd: zfs list -o name,used,avail,mountpoint -r biscayne/DATA
|
||||||
|
register: zfs_list
|
||||||
|
changed_when: false
|
||||||
|
tags: [storage]
|
||||||
|
|
||||||
|
- name: Check ZFS zvol I/O
|
||||||
|
ansible.builtin.shell:
|
||||||
|
cmd: set -o pipefail && iostat -x zd0 1 2 | tail -3
|
||||||
|
executable: /bin/bash
|
||||||
|
register: zvol_io
|
||||||
|
changed_when: false
|
||||||
|
failed_when: false
|
||||||
|
tags: [storage]
|
||||||
|
|
||||||
|
- name: Show storage status
|
||||||
|
ansible.builtin.debug:
|
||||||
|
msg:
|
||||||
|
ramdisk: "{{ ramdisk_df.stdout_lines | default(['not mounted']) }}"
|
||||||
|
zfs: "{{ zfs_list.stdout_lines | default([]) }}"
|
||||||
|
zvol_io: "{{ zvol_io.stdout_lines | default([]) }}"
|
||||||
|
tags: [storage]
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# System resources
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
- name: Check memory
|
||||||
|
ansible.builtin.command:
|
||||||
|
cmd: free -h
|
||||||
|
register: mem
|
||||||
|
changed_when: false
|
||||||
|
tags: [system]
|
||||||
|
|
||||||
|
- name: Check load average
|
||||||
|
ansible.builtin.command:
|
||||||
|
cmd: cat /proc/loadavg
|
||||||
|
register: loadavg
|
||||||
|
changed_when: false
|
||||||
|
tags: [system]
|
||||||
|
|
||||||
|
- name: Check swap
|
||||||
|
ansible.builtin.command:
|
||||||
|
cmd: swapon --show
|
||||||
|
register: swap
|
||||||
|
changed_when: false
|
||||||
|
failed_when: false
|
||||||
|
tags: [system]
|
||||||
|
|
||||||
|
- name: Show system resources
|
||||||
|
ansible.builtin.debug:
|
||||||
|
msg:
|
||||||
|
memory: "{{ mem.stdout_lines }}"
|
||||||
|
load: "{{ loadavg.stdout }}"
|
||||||
|
swap: "{{ swap.stdout | default('none') }}"
|
||||||
|
tags: [system]
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Network / shred throughput
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
- name: Count shred packets per interface (5 sec sample)
|
||||||
|
ansible.builtin.shell:
|
||||||
|
cmd: |
|
||||||
|
set -o pipefail
|
||||||
|
for iface in eno1 doublezero0; do
|
||||||
|
count=$(timeout 5 tcpdump -i "$iface" -nn 'udp dst portrange 9000-10000' -q 2>&1 | grep -oP '\d+(?= packets captured)' || echo 0)
|
||||||
|
echo "$iface: $count packets/5s"
|
||||||
|
done
|
||||||
|
executable: /bin/bash
|
||||||
|
register: shred_counts
|
||||||
|
changed_when: false
|
||||||
|
failed_when: false
|
||||||
|
tags: [network]
|
||||||
|
|
||||||
|
- name: Check interface throughput
|
||||||
|
ansible.builtin.shell:
|
||||||
|
cmd: >-
|
||||||
|
set -o pipefail &&
|
||||||
|
grep -E 'eno1|doublezero0' /proc/net/dev
|
||||||
|
| awk '{printf "%s rx=%s tx=%s\n", $1, $2, $10}'
|
||||||
|
executable: /bin/bash
|
||||||
|
register: iface_stats
|
||||||
|
changed_when: false
|
||||||
|
tags: [network]
|
||||||
|
|
||||||
|
- name: Check gossip/repair port connections
|
||||||
|
ansible.builtin.shell:
|
||||||
|
cmd: >-
|
||||||
|
set -o pipefail &&
|
||||||
|
ss -tupn | grep -E ':8001|:900[0-9]' | head -20 || echo "no connections"
|
||||||
|
executable: /bin/bash
|
||||||
|
register: gossip_ports
|
||||||
|
changed_when: false
|
||||||
|
tags: [network]
|
||||||
|
|
||||||
|
- name: Check iptables DNAT rule (TVU shred relay)
|
||||||
|
ansible.builtin.shell:
|
||||||
|
cmd: >-
|
||||||
|
set -o pipefail &&
|
||||||
|
iptables -t nat -L PREROUTING -v -n | grep -E '64.92.84.81|20000' || echo "no DNAT rule"
|
||||||
|
executable: /bin/bash
|
||||||
|
register: dnat_rule
|
||||||
|
changed_when: false
|
||||||
|
tags: [network]
|
||||||
|
|
||||||
|
- name: Show network status
|
||||||
|
ansible.builtin.debug:
|
||||||
|
msg:
|
||||||
|
shred_counts: "{{ shred_counts.stdout_lines | default([]) }}"
|
||||||
|
interfaces: "{{ iface_stats.stdout_lines | default([]) }}"
|
||||||
|
gossip_ports: "{{ gossip_ports.stdout_lines | default([]) }}"
|
||||||
|
tvu_dnat: "{{ dnat_rule.stdout_lines | default([]) }}"
|
||||||
|
tags: [network]
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Summary
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
- name: Health check summary
|
||||||
|
ansible.builtin.debug:
|
||||||
|
msg: |
|
||||||
|
=== Biscayne Health Check ===
|
||||||
|
Cluster: {{ kind_cluster }}
|
||||||
|
Namespace: {{ agave_ns }}
|
||||||
|
Pod: {{ agave_pod }}
|
||||||
|
RPC: {{ rpc_health.status | default('unreachable') }}
|
||||||
|
DZ identity: {{ dz_address.stdout | default('unknown') | trim }}
|
||||||
|
DZ tunnel: {{ 'UP' if dz_status.rc | default(1) == 0 else 'DOWN' }}
|
||||||
|
DZ systemd: {{ dz_systemd_state }}
|
||||||
|
Ramdisk: {{ ramdisk_df.stdout_lines[-1] | default('unknown') }}
|
||||||
|
Load: {{ loadavg.stdout | default('unknown') }}
|
||||||
|
|
@ -0,0 +1,98 @@
#!/bin/bash
# Check shred completeness at the tip of the blockstore.
#
# Samples the most recent N slots and reports how many are full.
# Use this to determine when enough complete blocks have accumulated
# to safely download a new snapshot that lands within the complete range.
#
# Usage: kubectl exec ... -- bash -c "$(cat check-shred-completeness.sh)"
# Or: ssh biscayne ... 'KUBECONFIG=... kubectl exec ... -- agave-ledger-tool ...'

set -euo pipefail

KUBECONFIG="${KUBECONFIG:-/home/rix/.kube/config}"
NS="laconic-laconic-70ce4c4b47e23b85"
DEPLOY="laconic-70ce4c4b47e23b85-deployment"
SAMPLE_SIZE="${1:-200}"

# Get blockstore bounds
BOUNDS=$(kubectl exec -n "$NS" deployment/"$DEPLOY" -c agave-validator -- \
  agave-ledger-tool -l /data/ledger blockstore bounds 2>&1 | grep "^Ledger")

HIGHEST=$(echo "$BOUNDS" | grep -oP 'to \K[0-9]+')
START=$((HIGHEST - SAMPLE_SIZE))

echo "Blockstore highest slot: $HIGHEST"
echo "Sampling slots $START to $HIGHEST ($SAMPLE_SIZE slots)"
echo ""

# Get slot metadata
OUTPUT=$(kubectl exec -n "$NS" deployment/"$DEPLOY" -c agave-validator -- \
  agave-ledger-tool -l /data/ledger blockstore print \
  --starting-slot "$START" --ending-slot "$HIGHEST" 2>&1 \
  | grep -E "^Slot|is_full")

TOTAL=$(echo "$OUTPUT" | grep -c "^Slot" || true)
FULL=$(echo "$OUTPUT" | grep -c "is_full: true" || true)
INCOMPLETE=$(echo "$OUTPUT" | grep -c "is_full: false" || true)

echo "Total slots with data: $TOTAL / $SAMPLE_SIZE"
echo "Complete (is_full: true): $FULL"
echo "Incomplete (is_full: false): $INCOMPLETE"

if [ "$TOTAL" -gt 0 ]; then
  PCT=$((FULL * 100 / TOTAL))
  echo "Completeness: ${PCT}%"
else
  echo "Completeness: N/A (no data)"
fi

echo ""

# Find the first full slot counting backward from the tip
# This tells us where the contiguous complete run starts
echo "--- Contiguous complete run from tip ---"

# Get just the slot numbers and is_full in reverse order
REVERSED=$(echo "$OUTPUT" | paste - - | awk '{
  slot = $2;
  full = ($NF == "true") ? 1 : 0;
  print slot, full
}' | sort -rn)

CONTIGUOUS=0
FIRST_FULL=""
while IFS=' ' read -r slot full; do
  if [ "$full" -eq 1 ]; then
    CONTIGUOUS=$((CONTIGUOUS + 1))
    FIRST_FULL="$slot"
  else
    break
  fi
done <<< "$REVERSED"

if [ -n "$FIRST_FULL" ]; then
  echo "Contiguous complete slots from tip: $CONTIGUOUS"
  echo "Run starts at slot: $FIRST_FULL"
  echo "Run ends at slot: $HIGHEST"
  echo ""
  echo "A snapshot with slot >= $FIRST_FULL would replay from local blockstore."

  # Check against mainnet
  MAINNET_SLOT=$(curl -s -X POST -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","id":1,"method":"getSlot","params":[{"commitment":"finalized"}]}' \
    https://api.mainnet-beta.solana.com | grep -oP '"result":\K[0-9]+')

  GAP=$((MAINNET_SLOT - HIGHEST))
  echo "Mainnet tip: $MAINNET_SLOT (blockstore is $GAP slots behind tip)"

  if [ "$CONTIGUOUS" -gt 100 ]; then
    echo ""
    echo ">>> READY: $CONTIGUOUS contiguous complete slots. Safe to download a snapshot."
  else
    echo ""
    echo ">>> NOT READY: Only $CONTIGUOUS contiguous complete slots. Wait for more."
  fi
else
  echo "No contiguous complete run from tip found."
fi

@ -0,0 +1,38 @@
#!/bin/bash
# Run a command in a tmux pane and capture its output.
# User sees it streaming in the pane; caller gets stdout back.
#
# Usage: pane-exec.sh <pane-id> <command...>
# Example: pane-exec.sh %6565 ansible-playbook -i inventory/switches.yml playbooks/foo.yml

set -euo pipefail

PANE="$1"
shift
CMD="$*"

TMPFILE=$(mktemp /tmp/pane-output.XXXXXX)
MARKER="__PANE_EXEC_DONE_${RANDOM}_$$__"

cleanup() {
  tmux pipe-pane -t "$PANE" 2>/dev/null || true
  rm -f "$TMPFILE"
}
trap cleanup EXIT

# Start capturing pane output
tmux pipe-pane -o -t "$PANE" "cat >> $TMPFILE"

# Send the command, then echo a marker so we know when it's done
tmux send-keys -t "$PANE" "$CMD; echo $MARKER" Enter

# Wait for the marker
while ! grep -q "$MARKER" "$TMPFILE" 2>/dev/null; do
  sleep 0.5
done

# Stop capturing
tmux pipe-pane -t "$PANE"

# Strip ANSI escape codes, remove the marker line, output the rest
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g; s/\x1b\[[?][0-9]*[a-zA-Z]//g' "$TMPFILE" | grep -v "$MARKER"

@ -0,0 +1,151 @@
import { chromium } from 'playwright';
import { writeFileSync, mkdirSync } from 'fs';
import { join } from 'path';

const OUT_DIR = join(import.meta.dirname, '..', 'docs', 'arista-scraped');
mkdirSync(OUT_DIR, { recursive: true });

const pages = [
  { url: 'https://www.arista.com/en/um-eos/eos-static-inter-vrf-route', file: 'static-inter-vrf-route.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-inter-vrf-local-route-leaking', file: 'inter-vrf-local-route-leaking.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-policy-based-routing', file: 'policy-based-routing.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-traffic-management', file: 'traffic-management.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-policy-based-routing-pbr', file: 'pbr.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-configuring-vrf-instances', file: 'configuring-vrf.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-gre-tunnels', file: 'gre-tunnels.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-access-control-lists', file: 'access-control-lists.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-static-routes', file: 'static-routes.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-configuration-sessions', file: 'configuration-sessions.md' },
  { url: 'https://www.arista.com/en/um-eos/eos-checkpoint-and-rollback', file: 'checkpoint-rollback.md' },
  { url: 'https://www.arista.com/en/um-eos', file: '_index.md' },
];

async function scrapePage(page, url, filename) {
  console.log(`Scraping: ${url}`);
  try {
    const resp = await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30000 });
    console.log(` Status: ${resp.status()}`);

    // Wait for JS to render
    await page.waitForTimeout(8000);

    // Check for CAPTCHA
    const bodyText = await page.evaluate(() => document.body.innerText.substring(0, 200));
    if (bodyText.includes('CAPTCHA') || bodyText.includes("couldn't load")) {
      console.log(` BLOCKED by CAPTCHA/anti-bot on ${url}`);
      writeFileSync(join(OUT_DIR, filename), `# BLOCKED BY CAPTCHA\n\nURL: ${url}\nThe Arista docs site requires CAPTCHA verification for headless browsers.\n`);
      return false;
    }

    // Extract content
    const content = await page.evaluate(() => {
      const selectors = [
        '#content', '.article-content', '.content-area', '#main-content',
        'article', '.item-page', '#sp-component', '.com-content-article',
        'main', '#sp-main-body',
      ];

      let el = null;
      for (const sel of selectors) {
        el = document.querySelector(sel);
        if (el && el.textContent.trim().length > 100) break;
      }
      if (!el) el = document.body;

      function nodeToMd(node) {
        if (node.nodeType === Node.TEXT_NODE) return node.textContent;
        if (node.nodeType !== Node.ELEMENT_NODE) return '';
        const tag = node.tagName.toLowerCase();
        if (['nav', 'footer', 'script', 'style', 'noscript', 'iframe'].includes(tag)) return '';
        if (node.classList && (node.classList.contains('nav') || node.classList.contains('sidebar') ||
            node.classList.contains('menu') || node.classList.contains('footer') ||
            node.classList.contains('header'))) return '';
        let children = Array.from(node.childNodes).map(c => nodeToMd(c)).join('');
        switch (tag) {
          case 'h1': return `\n# ${children.trim()}\n\n`;
          case 'h2': return `\n## ${children.trim()}\n\n`;
          case 'h3': return `\n### ${children.trim()}\n\n`;
          case 'h4': return `\n#### ${children.trim()}\n\n`;
          case 'p': return `\n${children.trim()}\n\n`;
          case 'br': return '\n';
          case 'li': return `- ${children.trim()}\n`;
          case 'ul': case 'ol': return `\n${children}\n`;
          case 'pre': return `\n\`\`\`\n${children.trim()}\n\`\`\`\n\n`;
          case 'code': return `\`${children.trim()}\``;
          case 'strong': case 'b': return `**${children.trim()}**`;
          case 'em': case 'i': return `*${children.trim()}*`;
          case 'table': return `\n${children}\n`;
          case 'tr': return `${children}|\n`;
          case 'th': case 'td': return `| ${children.trim()} `;
          case 'a': {
            const href = node.getAttribute('href');
            if (href && !href.startsWith('#') && !href.startsWith('javascript'))
              return `[${children.trim()}](${href})`;
            return children;
          }
          default: return children;
        }
      }
      return nodeToMd(el);
    });

    const cleaned = content.replace(/\n{4,}/g, '\n\n\n').replace(/[ \t]+$/gm, '').trim();
    const header = `<!-- Source: ${url} -->\n<!-- Scraped: ${new Date().toISOString()} -->\n\n`;
    writeFileSync(join(OUT_DIR, filename), header + cleaned + '\n');
    console.log(` Saved ${filename} (${cleaned.length} chars)`);
    return true;
  } catch (e) {
    console.error(` FAILED: ${e.message}`);
    writeFileSync(join(OUT_DIR, filename), `# FAILED TO LOAD\n\nURL: ${url}\nError: ${e.message}\n`);
    return false;
  }
}

async function main() {
  // Launch with stealth-like settings
  const browser = await chromium.launch({
    headless: false, // Use headed mode via Xvfb if available, else new headless
    args: [
      '--headless=new', // New headless mode (less detectable)
      '--disable-blink-features=AutomationControlled',
      '--no-sandbox',
    ],
  });

  const context = await browser.newContext({
    userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    locale: 'en-US',
    timezoneId: 'America/New_York',
    viewport: { width: 1920, height: 1080 },
  });

  // Remove webdriver property
  await context.addInitScript(() => {
    Object.defineProperty(navigator, 'webdriver', { get: () => false });
    // Override permissions
    const originalQuery = window.navigator.permissions.query;
    window.navigator.permissions.query = (parameters) =>
      parameters.name === 'notifications'
        ? Promise.resolve({ state: Notification.permission })
        : originalQuery(parameters);
  });

  const page = await context.newPage();

  let anySuccess = false;
  for (const { url, file } of pages) {
    const ok = await scrapePage(page, url, file);
    if (ok) anySuccess = true;
    // Add delay between requests
    await page.waitForTimeout(2000);
  }

  if (!anySuccess) {
    console.log('\nAll pages blocked by CAPTCHA. Arista docs require human verification.');
  }

  await browser.close();
  console.log('\nDone!');
}

main().catch(e => { console.error(e); process.exit(1); });

@ -0,0 +1,34 @@
#!/usr/bin/env python3
"""Strip IP+UDP headers from mirrored packets and forward raw UDP payload."""
import socket
import sys

LISTEN_PORT = int(sys.argv[1]) if len(sys.argv) > 1 else 9100
FORWARD_HOST = sys.argv[2] if len(sys.argv) > 2 else "127.0.0.1"
FORWARD_PORT = int(sys.argv[3]) if len(sys.argv) > 3 else 9000

sock_in = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock_in.bind(("0.0.0.0", LISTEN_PORT))

sock_out = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

count = 0
while True:
    data, addr = sock_in.recvfrom(65535)
    if len(data) < 28:
        continue
    # IP header: first nibble is version (4), second nibble is IHL (words)
    if (data[0] >> 4) != 4:
        continue
    ihl = (data[0] & 0x0F) * 4
    # Protocol should be UDP (17)
    if data[9] != 17:
        continue
    # Payload starts after IP header + 8-byte UDP header
    offset = ihl + 8
    payload = data[offset:]
    if payload:
        sock_out.sendto(payload, (FORWARD_HOST, FORWARD_PORT))
        count += 1
        if count % 10000 == 0:
            print(f"Forwarded {count} shreds", flush=True)

@ -0,0 +1,546 @@
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
"""Download Solana snapshots using aria2c for parallel multi-connection downloads.
|
||||||
|
|
||||||
|
Discovers snapshot sources by querying getClusterNodes for all RPCs in the
|
||||||
|
cluster, probing each for available snapshots, benchmarking download speed,
|
||||||
|
and downloading from the fastest source using aria2c (16 connections by default).
|
||||||
|
|
||||||
|
Based on the discovery approach from etcusr/solana-snapshot-finder but replaces
|
||||||
|
the single-connection wget download with aria2c parallel chunked downloads.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
# Download to /srv/solana/snapshots (mainnet, 16 connections)
|
||||||
|
./snapshot-download.py -o /srv/solana/snapshots
|
||||||
|
|
||||||
|
# Dry run — find best source, print URL
|
||||||
|
./snapshot-download.py --dry-run
|
||||||
|
|
||||||
|
# Custom RPC for cluster node discovery + 32 connections
|
||||||
|
./snapshot-download.py -r https://api.mainnet-beta.solana.com -n 32
|
||||||
|
|
||||||
|
# Testnet
|
||||||
|
./snapshot-download.py -c testnet -o /data/snapshots
|
||||||
|
|
||||||
|
Requirements:
|
||||||
|
- aria2c (apt install aria2)
|
||||||
|
- python3 >= 3.10 (stdlib only, no pip dependencies)
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import concurrent.futures
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import shutil
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
import urllib.error
|
||||||
|
import urllib.request
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
from http.client import HTTPResponse
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import NoReturn
|
||||||
|
from urllib.request import Request
|
||||||
|
|
||||||
|
log: logging.Logger = logging.getLogger("snapshot-download")
|
||||||
|
|
||||||
|
CLUSTER_RPC: dict[str, str] = {
|
||||||
|
"mainnet-beta": "https://api.mainnet-beta.solana.com",
|
||||||
|
"testnet": "https://api.testnet.solana.com",
|
||||||
|
"devnet": "https://api.devnet.solana.com",
|
||||||
|
}
|
||||||
|
|
||||||
|
# Snapshot filenames:
|
||||||
|
# snapshot-<slot>-<hash>.tar.zst
|
||||||
|
# incremental-snapshot-<base_slot>-<slot>-<hash>.tar.zst
|
||||||
|
FULL_SNAP_RE: re.Pattern[str] = re.compile(
|
||||||
|
r"^snapshot-(\d+)-([A-Za-z0-9]+)\.tar\.(zst|bz2)$"
|
||||||
|
)
|
||||||
|
INCR_SNAP_RE: re.Pattern[str] = re.compile(
|
||||||
|
r"^incremental-snapshot-(\d+)-(\d+)-([A-Za-z0-9]+)\.tar\.(zst|bz2)$"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class SnapshotSource:
|
||||||
|
"""A snapshot file available from a specific RPC node."""
|
||||||
|
|
||||||
|
rpc_address: str
|
||||||
|
# Full redirect paths as returned by the server (e.g. /snapshot-123-hash.tar.zst)
|
||||||
|
file_paths: list[str] = field(default_factory=list)
|
||||||
|
slots_diff: int = 0
|
||||||
|
latency_ms: float = 0.0
|
||||||
|
download_speed: float = 0.0 # bytes/sec
|
||||||
|
|
||||||
|
|
||||||
|
# -- JSON-RPC helpers ----------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class _NoRedirectHandler(urllib.request.HTTPRedirectHandler):
|
||||||
|
"""Handler that captures redirect Location instead of following it."""
|
||||||
|
|
||||||
|
def redirect_request(
|
||||||
|
self,
|
||||||
|
req: Request,
|
||||||
|
fp: HTTPResponse,
|
||||||
|
code: int,
|
||||||
|
msg: str,
|
||||||
|
headers: dict[str, str], # type: ignore[override]
|
||||||
|
newurl: str,
|
||||||
|
) -> None:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def rpc_post(url: str, method: str, params: list[object] | None = None,
|
||||||
|
timeout: int = 25) -> object | None:
|
||||||
|
"""JSON-RPC POST. Returns parsed 'result' field or None on error."""
|
||||||
|
payload: bytes = json.dumps({
|
||||||
|
"jsonrpc": "2.0", "id": 1,
|
||||||
|
"method": method, "params": params or [],
|
||||||
|
}).encode()
|
||||||
|
req = Request(url, data=payload,
|
||||||
|
headers={"Content-Type": "application/json"})
|
||||||
|
try:
|
||||||
|
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||||
|
data: dict[str, object] = json.loads(resp.read())
|
||||||
|
return data.get("result")
|
||||||
|
except (urllib.error.URLError, json.JSONDecodeError, OSError, TimeoutError) as e:
|
||||||
|
log.debug("rpc_post %s %s failed: %s", url, method, e)
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def head_no_follow(url: str, timeout: float = 3) -> tuple[str | None, float]:
|
||||||
|
"""HEAD request without following redirects.
|
||||||
|
|
||||||
|
Returns (Location header value, latency_sec) if the server returned a
|
||||||
|
3xx redirect. Returns (None, 0.0) on any error or non-redirect response.
|
||||||
|
"""
|
||||||
|
opener: urllib.request.OpenerDirector = urllib.request.build_opener(_NoRedirectHandler)
|
||||||
|
req = Request(url, method="HEAD")
|
||||||
|
try:
|
||||||
|
start: float = time.monotonic()
|
||||||
|
resp: HTTPResponse = opener.open(req, timeout=timeout) # type: ignore[assignment]
|
||||||
|
latency: float = time.monotonic() - start
|
||||||
|
# Non-redirect (2xx) — server didn't redirect, not useful for discovery
|
||||||
|
location: str | None = resp.headers.get("Location")
|
||||||
|
resp.close()
|
||||||
|
return location, latency
|
||||||
|
except urllib.error.HTTPError as e:
|
||||||
|
# 3xx redirects raise HTTPError with the redirect info
|
||||||
|
latency = time.monotonic() - start # type: ignore[possibly-undefined]
|
||||||
|
location = e.headers.get("Location")
|
||||||
|
if location and 300 <= e.code < 400:
|
||||||
|
return location, latency
|
||||||
|
return None, 0.0
|
||||||
|
except (urllib.error.URLError, OSError, TimeoutError):
|
||||||
|
return None, 0.0
|
||||||
|
|
||||||
|
|
||||||
|
# -- Discovery -----------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def get_current_slot(rpc_url: str) -> int | None:
|
||||||
|
"""Get current slot from RPC."""
|
||||||
|
result: object | None = rpc_post(rpc_url, "getSlot")
|
||||||
|
if isinstance(result, int):
|
||||||
|
return result
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def get_cluster_rpc_nodes(rpc_url: str, version_filter: str | None = None) -> list[str]:
|
||||||
|
"""Get all RPC node addresses from getClusterNodes."""
|
||||||
|
result: object | None = rpc_post(rpc_url, "getClusterNodes")
|
||||||
|
if not isinstance(result, list):
|
||||||
|
return []
|
||||||
|
|
||||||
|
rpc_addrs: list[str] = []
|
||||||
|
for node in result:
|
||||||
|
if not isinstance(node, dict):
|
||||||
|
continue
|
||||||
|
if version_filter is not None:
|
||||||
|
node_version: str | None = node.get("version")
|
||||||
|
if node_version and not node_version.startswith(version_filter):
|
||||||
|
continue
|
||||||
|
rpc: str | None = node.get("rpc")
|
||||||
|
if rpc:
|
||||||
|
rpc_addrs.append(rpc)
|
||||||
|
return list(set(rpc_addrs))
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_snapshot_filename(location: str) -> tuple[str, str | None]:
|
||||||
|
"""Extract filename and full redirect path from Location header.
|
||||||
|
|
||||||
|
Returns (filename, full_path). full_path includes any path prefix
|
||||||
|
the server returned (e.g. '/snapshots/snapshot-123-hash.tar.zst').
|
||||||
|
"""
|
||||||
|
# Location may be absolute URL or relative path
|
||||||
|
if location.startswith("http://") or location.startswith("https://"):
|
||||||
|
# Absolute URL — extract path
|
||||||
|
from urllib.parse import urlparse
|
||||||
|
path: str = urlparse(location).path
|
||||||
|
else:
|
||||||
|
path = location
|
||||||
|
|
||||||
|
filename: str = path.rsplit("/", 1)[-1]
|
||||||
|
return filename, path
|
||||||
|
|
||||||
|
|
||||||
|
def probe_rpc_snapshot(
|
||||||
|
rpc_address: str,
|
||||||
|
current_slot: int,
|
||||||
|
max_age_slots: int,
|
||||||
|
max_latency_ms: float,
|
||||||
|
) -> SnapshotSource | None:
|
||||||
|
"""Probe a single RPC node for available snapshots.
|
||||||
|
|
    Probes for full snapshot first (required), then incremental. Records all
    available files. Which files to actually download is decided at download
    time based on what already exists locally — not here.

    Based on the discovery approach from etcusr/solana-snapshot-finder.
    """
    full_url: str = f"http://{rpc_address}/snapshot.tar.bz2"

    # Full snapshot is required — every source must have one
    full_location, full_latency = head_no_follow(full_url, timeout=2)
    if not full_location:
        return None

    latency_ms: float = full_latency * 1000
    if latency_ms > max_latency_ms:
        return None

    full_filename, full_path = _parse_snapshot_filename(full_location)
    fm: re.Match[str] | None = FULL_SNAP_RE.match(full_filename)
    if not fm:
        return None

    full_snap_slot: int = int(fm.group(1))
    slots_diff: int = current_slot - full_snap_slot

    if slots_diff > max_age_slots or slots_diff < -100:
        return None

    file_paths: list[str] = [full_path]

    # Also check for incremental snapshot
    inc_url: str = f"http://{rpc_address}/incremental-snapshot.tar.bz2"
    inc_location, _ = head_no_follow(inc_url, timeout=2)
    if inc_location:
        inc_filename, inc_path = _parse_snapshot_filename(inc_location)
        m: re.Match[str] | None = INCR_SNAP_RE.match(inc_filename)
        if m:
            inc_base_slot: int = int(m.group(1))
            # Incremental must be based on this source's full snapshot
            if inc_base_slot == full_snap_slot:
                file_paths.append(inc_path)

    return SnapshotSource(
        rpc_address=rpc_address,
        file_paths=file_paths,
        slots_diff=slots_diff,
        latency_ms=latency_ms,
    )

def discover_sources(
    rpc_url: str,
    current_slot: int,
    max_age_slots: int,
    max_latency_ms: float,
    threads: int,
    version_filter: str | None,
) -> list[SnapshotSource]:
    """Discover all snapshot sources from the cluster."""
    rpc_nodes: list[str] = get_cluster_rpc_nodes(rpc_url, version_filter)
    if not rpc_nodes:
        log.error("No RPC nodes found via getClusterNodes")
        return []

    log.info("Found %d RPC nodes, probing for snapshots...", len(rpc_nodes))

    sources: list[SnapshotSource] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=threads) as pool:
        futures: dict[concurrent.futures.Future[SnapshotSource | None], str] = {
            pool.submit(
                probe_rpc_snapshot, addr, current_slot,
                max_age_slots, max_latency_ms,
            ): addr
            for addr in rpc_nodes
        }
        done: int = 0
        for future in concurrent.futures.as_completed(futures):
            done += 1
            if done % 200 == 0:
                log.info(" probed %d/%d nodes, %d sources found",
                         done, len(rpc_nodes), len(sources))
            try:
                result: SnapshotSource | None = future.result()
            except (urllib.error.URLError, OSError, TimeoutError) as e:
                log.debug("Probe failed for %s: %s", futures[future], e)
                continue
            if result:
                sources.append(result)

    log.info("Found %d RPC nodes with suitable snapshots", len(sources))
    return sources

# -- Speed benchmark -----------------------------------------------------------


def measure_speed(rpc_address: str, measure_time: int = 7) -> float:
    """Measure download speed from an RPC node. Returns bytes/sec."""
    url: str = f"http://{rpc_address}/snapshot.tar.bz2"
    req = Request(url)
    try:
        with urllib.request.urlopen(req, timeout=measure_time + 5) as resp:
            start: float = time.monotonic()
            total: int = 0
            while True:
                elapsed: float = time.monotonic() - start
                if elapsed >= measure_time:
                    break
                chunk: bytes = resp.read(81920)
                if not chunk:
                    break
                total += len(chunk)
            elapsed = time.monotonic() - start
            if elapsed <= 0:
                return 0.0
            return total / elapsed
    except (urllib.error.URLError, OSError, TimeoutError):
        return 0.0

# -- Download ------------------------------------------------------------------


def download_aria2c(
    urls: list[str],
    output_dir: str,
    filename: str,
    connections: int = 16,
) -> bool:
    """Download a file using aria2c with parallel connections.

    When multiple URLs are provided, aria2c treats them as mirrors of the
    same file and distributes chunks across all of them.
    """
    num_mirrors: int = len(urls)
    total_splits: int = max(connections, connections * num_mirrors)
    cmd: list[str] = [
        "aria2c",
        "--file-allocation=none",
        "--continue=true",
        f"--max-connection-per-server={connections}",
        f"--split={total_splits}",
        "--min-split-size=50M",
        # aria2c retries individual chunk connections on transient network
        # errors (TCP reset, timeout). This is transport-level retry analogous
        # to TCP retransmit, not application-level retry of a failed operation.
        "--max-tries=5",
        "--retry-wait=5",
        "--timeout=60",
        "--connect-timeout=10",
        "--summary-interval=10",
        "--console-log-level=notice",
        f"--dir={output_dir}",
        f"--out={filename}",
        "--auto-file-renaming=false",
        "--allow-overwrite=true",
        *urls,
    ]

    log.info("Downloading %s", filename)
    log.info(" aria2c: %d connections × %d mirrors (%d splits)",
             connections, num_mirrors, total_splits)

    start: float = time.monotonic()
    result: subprocess.CompletedProcess[bytes] = subprocess.run(cmd)
    elapsed: float = time.monotonic() - start

    if result.returncode != 0:
        log.error("aria2c failed with exit code %d", result.returncode)
        return False

    filepath: Path = Path(output_dir) / filename
    if not filepath.exists():
        log.error("aria2c reported success but %s does not exist", filepath)
        return False

    size_bytes: int = filepath.stat().st_size
    size_gb: float = size_bytes / (1024 ** 3)
    avg_mb: float = size_bytes / elapsed / (1024 ** 2) if elapsed > 0 else 0
    log.info(" Done: %.1f GB in %.0fs (%.1f MiB/s avg)", size_gb, elapsed, avg_mb)
    return True

# -- Main ----------------------------------------------------------------------


def main() -> int:
    p: argparse.ArgumentParser = argparse.ArgumentParser(
        description="Download Solana snapshots with aria2c parallel downloads",
    )
    p.add_argument("-o", "--output", default="/srv/solana/snapshots",
                   help="Snapshot output directory (default: /srv/solana/snapshots)")
    p.add_argument("-c", "--cluster", default="mainnet-beta",
                   choices=list(CLUSTER_RPC),
                   help="Solana cluster (default: mainnet-beta)")
    p.add_argument("-r", "--rpc", default=None,
                   help="RPC URL for cluster discovery (default: public RPC)")
    p.add_argument("-n", "--connections", type=int, default=16,
                   help="aria2c connections per download (default: 16)")
    p.add_argument("-t", "--threads", type=int, default=500,
                   help="Threads for parallel RPC probing (default: 500)")
    p.add_argument("--max-snapshot-age", type=int, default=1300,
                   help="Max snapshot age in slots (default: 1300)")
    p.add_argument("--max-latency", type=float, default=100,
                   help="Max RPC probe latency in ms (default: 100)")
    p.add_argument("--min-download-speed", type=int, default=20,
                   help="Min download speed in MiB/s (default: 20)")
    p.add_argument("--measurement-time", type=int, default=7,
                   help="Speed measurement duration in seconds (default: 7)")
    p.add_argument("--max-speed-checks", type=int, default=15,
                   help="Max nodes to benchmark before giving up (default: 15)")
    p.add_argument("--version", default=None,
                   help="Filter nodes by version prefix (e.g. '2.2')")
    p.add_argument("--full-only", action="store_true",
                   help="Download only full snapshot, skip incremental")
    p.add_argument("--dry-run", action="store_true",
                   help="Find best source and print URL, don't download")
    p.add_argument("-v", "--verbose", action="store_true")
    args: argparse.Namespace = p.parse_args()

    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        datefmt="%H:%M:%S",
    )

    rpc_url: str = args.rpc or CLUSTER_RPC[args.cluster]

    # aria2c is required for actual downloads (not dry-run)
    if not args.dry_run and not shutil.which("aria2c"):
        log.error("aria2c not found. Install with: apt install aria2")
        return 1

    # Get current slot
    log.info("Cluster: %s | RPC: %s", args.cluster, rpc_url)
    current_slot: int | None = get_current_slot(rpc_url)
    if current_slot is None:
        log.error("Cannot get current slot from %s", rpc_url)
        return 1
    log.info("Current slot: %d", current_slot)

    # Discover sources
    sources: list[SnapshotSource] = discover_sources(
        rpc_url, current_slot,
        max_age_slots=args.max_snapshot_age,
        max_latency_ms=args.max_latency,
        threads=args.threads,
        version_filter=args.version,
    )
    if not sources:
        log.error("No snapshot sources found")
        return 1

    # Sort by latency (lowest first) for speed benchmarking
    sources.sort(key=lambda s: s.latency_ms)

    # Benchmark top candidates — all speeds in MiB/s (binary, 1 MiB = 1048576 bytes)
    log.info("Benchmarking download speed on top %d sources...", args.max_speed_checks)
    fast_sources: list[SnapshotSource] = []
    checked: int = 0
    min_speed_bytes: int = args.min_download_speed * 1024 * 1024  # MiB to bytes

    for source in sources:
        if checked >= args.max_speed_checks:
            break
        checked += 1

        speed: float = measure_speed(source.rpc_address, args.measurement_time)
        source.download_speed = speed
        speed_mib: float = speed / (1024 ** 2)

        if speed < min_speed_bytes:
            log.info(" %s: %.1f MiB/s (too slow, need >=%d MiB/s)",
                     source.rpc_address, speed_mib, args.min_download_speed)
            continue

        log.info(" %s: %.1f MiB/s (latency: %.0fms, age: %d slots)",
                 source.rpc_address, speed_mib,
                 source.latency_ms, source.slots_diff)
        fast_sources.append(source)

    if not fast_sources:
        log.error("No source met minimum speed requirement (%d MiB/s)",
                  args.min_download_speed)
        log.info("Try: --min-download-speed 10")
        return 1

    # Use the fastest source as primary, collect mirrors for each file
    best: SnapshotSource = fast_sources[0]
    file_paths: list[str] = best.file_paths
    if args.full_only:
        file_paths = [fp for fp in file_paths
                      if fp.rsplit("/", 1)[-1].startswith("snapshot-")]

    # Build mirror URL lists: for each file, collect URLs from all fast sources
    # that serve the same filename
    download_plan: list[tuple[str, list[str]]] = []
    for fp in file_paths:
        filename: str = fp.rsplit("/", 1)[-1]
        mirror_urls: list[str] = [f"http://{best.rpc_address}{fp}"]
        for other in fast_sources[1:]:
            for other_fp in other.file_paths:
                if other_fp.rsplit("/", 1)[-1] == filename:
                    mirror_urls.append(f"http://{other.rpc_address}{other_fp}")
                    break
        download_plan.append((filename, mirror_urls))

    speed_mib = best.download_speed / (1024 ** 2)
    log.info("Best source: %s (%.1f MiB/s), %d mirrors total",
             best.rpc_address, speed_mib, len(fast_sources))
    for filename, mirror_urls in download_plan:
        log.info(" %s (%d mirrors)", filename, len(mirror_urls))
        for url in mirror_urls:
            log.info(" %s", url)

    if args.dry_run:
        for _, mirror_urls in download_plan:
            for url in mirror_urls:
                print(url)
        return 0

    # Download — skip files that already exist locally
    os.makedirs(args.output, exist_ok=True)
    total_start: float = time.monotonic()

    for filename, mirror_urls in download_plan:
        filepath: Path = Path(args.output) / filename
        if filepath.exists() and filepath.stat().st_size > 0:
            log.info("Skipping %s (already exists: %.1f GB)",
                     filename, filepath.stat().st_size / (1024 ** 3))
            continue
        if not download_aria2c(mirror_urls, args.output, filename, args.connections):
            log.error("Failed to download %s", filename)
            return 1

    total_elapsed: float = time.monotonic() - total_start
    log.info("All downloads complete in %.0fs", total_elapsed)
    for filename, _ in download_plan:
        fp: Path = Path(args.output) / filename
        if fp.exists():
            log.info(" %s (%.1f GB)", fp.name, fp.stat().st_size / (1024 ** 3))

    return 0


if __name__ == "__main__":
    sys.exit(main())