stack-orchestrator/docs/ashburn-validator-relay.md

9.7 KiB

Ashburn Validator Relay — Full Traffic Redirect

Overview

All validator traffic (gossip, repair, TVU, TPU) enters and exits from 137.239.194.65 (laconic-was-sw01, Ashburn). Peers see the validator as an Ashburn node. This improves repair peer count and slot catchup rate by reducing RTT to the TeraSwitch/Pittsburgh cluster from ~30ms (direct Miami) to ~5ms (Ashburn).

Supersedes the previous TVU-only shred relay (see tvu-shred-relay.md).

Architecture

                 OUTBOUND (validator → peers)
agave-validator (kind pod, ports 8001, 9000-9025)
       ↓ Docker bridge → host FORWARD chain
biscayne host (186.233.184.235)
       ↓ mangle PREROUTING: fwmark 100 on sport 8001,9000-9025 from 172.20.0.0/16
       ↓ nat POSTROUTING: SNAT → src 137.239.194.65
       ↓ policy route: fwmark 100 → table ashburn → via 169.254.7.6 dev doublezero0
laconic-mia-sw01 (209.42.167.133, Miami)
       ↓ traffic-policy VALIDATOR-OUTBOUND: src 137.239.194.65 → nexthop 172.16.1.188
       ↓ backbone Et4/1 (25.4ms)
laconic-was-sw01 Et4/1 (Ashburn)
       ↓ default route via 64.92.84.80 out Et1/1
Internet (peers see src 137.239.194.65)

                 INBOUND (peers → validator)
Solana peers → 137.239.194.65:8001,9000-9025
       ↓ internet routing to was-sw01
laconic-was-sw01 Et1/1 (Ashburn)
       ↓ traffic-policy VALIDATOR-RELAY: ASIC redirect, line rate
       ↓ nexthop 172.16.1.189 via Et4/1 backbone (25.4ms)
laconic-mia-sw01 Et4/1 (Miami)
       ↓ L3 forward → biscayne via doublezero0 GRE or ISP routing
biscayne (186.233.184.235)
       ↓ nat PREROUTING: DNAT dst 137.239.194.65:* → 172.20.0.2:* (kind node)
       ↓ Docker bridge → validator pod
agave-validator

RPC traffic (port 8899) is NOT relayed — clients connect directly to biscayne.

Switch Config: laconic-was-sw01

SSH: install@137.239.200.198

Pre-change

configure checkpoint save pre-validator-relay

Rollback: rollback running-config checkpoint pre-validator-relay then write memory.

Config session with auto-revert

configure session validator-relay

! Loopback for 137.239.194.65 (do NOT touch Loopback100 which has .64)
interface Loopback101
   ip address 137.239.194.65/32

! ACL covering all validator ports
ip access-list VALIDATOR-RELAY-ACL
   10 permit udp any any eq 8001
   20 permit udp any any range 9000 9025
   30 permit tcp any any eq 8001

! Traffic-policy: ASIC redirect to backbone (mia-sw01)
traffic-policy VALIDATOR-RELAY
   match VALIDATOR-RELAY-ACL
      set nexthop 172.16.1.189

! Replace old SHRED-RELAY on Et1/1
interface Ethernet1/1
   no traffic-policy input SHRED-RELAY
   traffic-policy input VALIDATOR-RELAY

! system-rule overriding-action redirect (already present from SHRED-RELAY)

show session-config diffs
commit timer 00:05:00

After verification: configure session validator-relay commit then write memory.

Cleanup (after stable)

Old SHRED-RELAY policy and ACL can be removed once VALIDATOR-RELAY is confirmed:

configure session cleanup-shred-relay
no traffic-policy SHRED-RELAY
no ip access-list SHRED-RELAY-ACL
show session-config diffs
commit
write memory

Switch Config: laconic-mia-sw01

Pre-flight checks

Before applying config, verify:

  1. Which EOS interface terminates the doublezero0 GRE from biscayne (endpoint 209.42.167.133). Check with show interfaces tunnel or show ip interface brief | include Tunnel.

  2. Whether system-rule overriding-action redirect is already configured. Check with show running-config | include system-rule.

  3. Whether EOS traffic-policy works on tunnel interfaces. If not, apply on the physical interface where GRE packets arrive (likely Et facing biscayne's ISP network or the DZ infrastructure).

Config session

configure checkpoint save pre-validator-outbound

configure session validator-outbound

! ACL matching outbound validator traffic (source = Ashburn IP)
ip access-list VALIDATOR-OUTBOUND-ACL
   10 permit ip 137.239.194.65/32 any

! Redirect to was-sw01 via backbone
traffic-policy VALIDATOR-OUTBOUND
   match VALIDATOR-OUTBOUND-ACL
      set nexthop 172.16.1.188

! Apply on the interface where biscayne GRE traffic arrives
! Replace Tunnel<X> with the actual interface from pre-flight check #1
interface Tunnel<X>
   traffic-policy input VALIDATOR-OUTBOUND

! Add system-rule if not already present (pre-flight check #2)
system-rule overriding-action redirect

show session-config diffs
commit timer 00:05:00

After verification: commit + write memory.

Host Config: biscayne

Automated via ansible playbook playbooks/ashburn-validator-relay.yml.

Manual equivalent

# 1. Accept packets destined for 137.239.194.65
sudo ip addr add 137.239.194.65/32 dev lo

# 2. Inbound DNAT to kind node (172.20.0.2)
sudo iptables -t nat -A PREROUTING -p udp -d 137.239.194.65 --dport 8001 \
  -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -A PREROUTING -p tcp -d 137.239.194.65 --dport 8001 \
  -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -A PREROUTING -p udp -d 137.239.194.65 --dport 9000:9025 \
  -j DNAT --to-destination 172.20.0.2

# 3. Outbound: mark validator traffic
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p udp --sport 8001 \
  -j MARK --set-mark 100
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p udp --sport 9000:9025 \
  -j MARK --set-mark 100
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p tcp --sport 8001 \
  -j MARK --set-mark 100

# 4. Outbound: SNAT to Ashburn IP (INSERT before Docker MASQUERADE)
sudo iptables -t nat -I POSTROUTING 1 -m mark --mark 100 \
  -j SNAT --to-source 137.239.194.65

# 5. Policy routing table
echo "100 ashburn" | sudo tee -a /etc/iproute2/rt_tables
sudo ip rule add fwmark 100 table ashburn
sudo ip route add default via 169.254.7.6 dev doublezero0 table ashburn

# 6. Persist
sudo netfilter-persistent save
# ip rule + ip route persist via /etc/network/if-up.d/ashburn-routing

Docker NAT port preservation

Must verify before going live: Docker masquerade must preserve source ports for kind's hostNetwork pods. If Docker rewrites the source port, the mangle PREROUTING match on --sport 8001,9000-9025 will miss traffic.

Test: tcpdump -i br-cf46a62ab5b2 -nn 'udp src port 8001' — if you see packets with sport 8001 from 172.20.0.2, port preservation works.

If Docker does NOT preserve ports, the mark must be set inside the kind node container (on the pod's veth) rather than on the host.

Execution Order

  1. was-sw01: checkpoint → config session with 5min auto-revert → verify counters → commit
  2. biscayne: add 137.239.194.65/32 to lo, add inbound DNAT rules
  3. Verify inbound: ping 137.239.194.65 from external host, check DNAT counters
  4. mia-sw01: pre-flight checks → config session with 5min auto-revert → commit
  5. biscayne: add outbound fwmark + policy routing + SNAT rules
  6. Test outbound: from biscayne, send UDP from port 8001, verify src 137.239.194.65 on was-sw01
  7. Verify: traffic-policy counters on both switches, iptables hit counts on biscayne
  8. Restart validator if needed (gossip should auto-refresh, but restart ensures clean state)
  9. was-sw01 + mia-sw01: write memory to persist
  10. Cleanup: remove old SHRED-RELAY and 64.92.84.81:20000 DNAT after stable

Verification

  1. show traffic-policy counters on was-sw01 — VALIDATOR-RELAY-ACL matches
  2. show traffic-policy counters on mia-sw01 — VALIDATOR-OUTBOUND-ACL matches
  3. sudo iptables -t nat -L -v -n on biscayne — DNAT and SNAT hit counts
  4. sudo iptables -t mangle -L -v -n on biscayne — fwmark hit counts
  5. ip rule show on biscayne — fwmark 100 lookup ashburn
  6. Validator gossip ContactInfo shows 137.239.194.65 for ALL addresses (gossip, repair, TVU, TPU)
  7. Repair peer count increases (target: 20+ peers)
  8. Slot catchup rate improves from ~0.9 toward ~2.5 slots/sec
  9. traceroute --sport=8001 <remote_peer> from biscayne routes via doublezero0/was-sw01

Rollback

biscayne

sudo ip addr del 137.239.194.65/32 dev lo
sudo iptables -t nat -D PREROUTING -p udp -d 137.239.194.65 --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -D PREROUTING -p tcp -d 137.239.194.65 --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
sudo iptables -t nat -D PREROUTING -p udp -d 137.239.194.65 --dport 9000:9025 -j DNAT --to-destination 172.20.0.2
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p udp --sport 8001 -j MARK --set-mark 100
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p udp --sport 9000:9025 -j MARK --set-mark 100
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p tcp --sport 8001 -j MARK --set-mark 100
sudo iptables -t nat -D POSTROUTING -m mark --mark 100 -j SNAT --to-source 137.239.194.65
sudo ip rule del fwmark 100 table ashburn
sudo ip route del default table ashburn
sudo netfilter-persistent save

was-sw01

rollback running-config checkpoint pre-validator-relay
write memory

mia-sw01

rollback running-config checkpoint pre-validator-outbound
write memory

Key Details

Item Value
Ashburn relay IP 137.239.194.65 (Loopback101 on was-sw01)
Ashburn LAN block 137.239.194.64/29 on was-sw01 Et1/1
Biscayne IP 186.233.184.235
Kind node IP 172.20.0.2 (Docker bridge br-cf46a62ab5b2)
Validator ports 8001 (gossip), 9000-9025 (TVU/repair/TPU)
Excluded ports 8899 (RPC), 8900 (WebSocket) — direct to biscayne
GRE tunnel doublezero0: 169.254.7.7 ↔ 169.254.7.6, remote 209.42.167.133
Backbone was-sw01 Et4/1 172.16.1.188/31 ↔ mia-sw01 Et4/1 172.16.1.189/31
Policy routing table 100 ashburn
Fwmark 100
was-sw01 SSH install@137.239.200.198
EOS version 4.34.0F