276 lines
9.7 KiB
Markdown
276 lines
9.7 KiB
Markdown
|
|
# Ashburn Validator Relay — Full Traffic Redirect
|
||
|
|
|
||
|
|
## Overview
|
||
|
|
|
||
|
|
All validator traffic (gossip, repair, TVU, TPU) enters and exits from
|
||
|
|
`137.239.194.65` (laconic-was-sw01, Ashburn). Peers see the validator as an
|
||
|
|
Ashburn node. This improves repair peer count and slot catchup rate by reducing
|
||
|
|
RTT to the TeraSwitch/Pittsburgh cluster from ~30ms (direct Miami) to ~5ms
|
||
|
|
(Ashburn).
|
||
|
|
|
||
|
|
Supersedes the previous TVU-only shred relay (see `tvu-shred-relay.md`).
|
||
|
|
|
||
|
|
## Architecture
|
||
|
|
|
||
|
|
```
|
||
|
|
OUTBOUND (validator → peers)
|
||
|
|
agave-validator (kind pod, ports 8001, 9000-9025)
|
||
|
|
↓ Docker bridge → host FORWARD chain
|
||
|
|
biscayne host (186.233.184.235)
|
||
|
|
↓ mangle PREROUTING: fwmark 100 on sport 8001,9000-9025 from 172.20.0.0/16
|
||
|
|
↓ nat POSTROUTING: SNAT → src 137.239.194.65
|
||
|
|
↓ policy route: fwmark 100 → table ashburn → via 169.254.7.6 dev doublezero0
|
||
|
|
laconic-mia-sw01 (209.42.167.133, Miami)
|
||
|
|
↓ traffic-policy VALIDATOR-OUTBOUND: src 137.239.194.65 → nexthop 172.16.1.188
|
||
|
|
↓ backbone Et4/1 (25.4ms)
|
||
|
|
laconic-was-sw01 Et4/1 (Ashburn)
|
||
|
|
↓ default route via 64.92.84.80 out Et1/1
|
||
|
|
Internet (peers see src 137.239.194.65)
|
||
|
|
|
||
|
|
INBOUND (peers → validator)
|
||
|
|
Solana peers → 137.239.194.65:8001,9000-9025
|
||
|
|
↓ internet routing to was-sw01
|
||
|
|
laconic-was-sw01 Et1/1 (Ashburn)
|
||
|
|
↓ traffic-policy VALIDATOR-RELAY: ASIC redirect, line rate
|
||
|
|
↓ nexthop 172.16.1.189 via Et4/1 backbone (25.4ms)
|
||
|
|
laconic-mia-sw01 Et4/1 (Miami)
|
||
|
|
↓ L3 forward → biscayne via doublezero0 GRE or ISP routing
|
||
|
|
biscayne (186.233.184.235)
|
||
|
|
↓ nat PREROUTING: DNAT dst 137.239.194.65:* → 172.20.0.2:* (kind node)
|
||
|
|
↓ Docker bridge → validator pod
|
||
|
|
agave-validator
|
||
|
|
```
|
||
|
|
|
||
|
|
RPC traffic (port 8899) is NOT relayed — clients connect directly to biscayne.
|
||
|
|
|
||
|
|
## Switch Config: laconic-was-sw01
|
||
|
|
|
||
|
|
SSH: `install@137.239.200.198`
|
||
|
|
|
||
|
|
### Pre-change
|
||
|
|
|
||
|
|
```
|
||
|
|
configure checkpoint save pre-validator-relay
|
||
|
|
```
|
||
|
|
|
||
|
|
Rollback: `rollback running-config checkpoint pre-validator-relay` then `write memory`.
|
||
|
|
|
||
|
|
### Config session with auto-revert
|
||
|
|
|
||
|
|
```
|
||
|
|
configure session validator-relay
|
||
|
|
|
||
|
|
! Loopback for 137.239.194.65 (do NOT touch Loopback100 which has .64)
|
||
|
|
interface Loopback101
|
||
|
|
ip address 137.239.194.65/32
|
||
|
|
|
||
|
|
! ACL covering all validator ports
|
||
|
|
ip access-list VALIDATOR-RELAY-ACL
|
||
|
|
10 permit udp any any eq 8001
|
||
|
|
20 permit udp any any range 9000 9025
|
||
|
|
30 permit tcp any any eq 8001
|
||
|
|
|
||
|
|
! Traffic-policy: ASIC redirect to backbone (mia-sw01)
|
||
|
|
traffic-policy VALIDATOR-RELAY
|
||
|
|
match VALIDATOR-RELAY-ACL
|
||
|
|
set nexthop 172.16.1.189
|
||
|
|
|
||
|
|
! Replace old SHRED-RELAY on Et1/1
|
||
|
|
interface Ethernet1/1
|
||
|
|
no traffic-policy input SHRED-RELAY
|
||
|
|
traffic-policy input VALIDATOR-RELAY
|
||
|
|
|
||
|
|
! system-rule overriding-action redirect (already present from SHRED-RELAY)
|
||
|
|
|
||
|
|
show session-config diffs
|
||
|
|
commit timer 00:05:00
|
||
|
|
```
|
||
|
|
|
||
|
|
After verification: `configure session validator-relay commit` then `write memory`.
|
||
|
|
|
||
|
|
### Cleanup (after stable)
|
||
|
|
|
||
|
|
Old SHRED-RELAY policy and ACL can be removed once VALIDATOR-RELAY is confirmed:
|
||
|
|
|
||
|
|
```
|
||
|
|
configure session cleanup-shred-relay
|
||
|
|
no traffic-policy SHRED-RELAY
|
||
|
|
no ip access-list SHRED-RELAY-ACL
|
||
|
|
show session-config diffs
|
||
|
|
commit
|
||
|
|
write memory
|
||
|
|
```
|
||
|
|
|
||
|
|
## Switch Config: laconic-mia-sw01
|
||
|
|
|
||
|
|
### Pre-flight checks
|
||
|
|
|
||
|
|
Before applying config, verify:
|
||
|
|
|
||
|
|
1. Which EOS interface terminates the doublezero0 GRE from biscayne
|
||
|
|
(endpoint 209.42.167.133). Check with `show interfaces tunnel` or
|
||
|
|
`show ip interface brief | include Tunnel`.
|
||
|
|
|
||
|
|
2. Whether `system-rule overriding-action redirect` is already configured.
|
||
|
|
Check with `show running-config | include system-rule`.
|
||
|
|
|
||
|
|
3. Whether EOS traffic-policy works on tunnel interfaces. If not, apply on
|
||
|
|
the physical interface where GRE packets arrive (likely Et<X> facing
|
||
|
|
biscayne's ISP network or the DZ infrastructure).
|
||
|
|
|
||
|
|
### Config session
|
||
|
|
|
||
|
|
```
|
||
|
|
configure checkpoint save pre-validator-outbound
|
||
|
|
|
||
|
|
configure session validator-outbound
|
||
|
|
|
||
|
|
! ACL matching outbound validator traffic (source = Ashburn IP)
|
||
|
|
ip access-list VALIDATOR-OUTBOUND-ACL
|
||
|
|
10 permit ip 137.239.194.65/32 any
|
||
|
|
|
||
|
|
! Redirect to was-sw01 via backbone
|
||
|
|
traffic-policy VALIDATOR-OUTBOUND
|
||
|
|
match VALIDATOR-OUTBOUND-ACL
|
||
|
|
set nexthop 172.16.1.188
|
||
|
|
|
||
|
|
! Apply on the interface where biscayne GRE traffic arrives
|
||
|
|
! Replace Tunnel<X> with the actual interface from pre-flight check #1
|
||
|
|
interface Tunnel<X>
|
||
|
|
traffic-policy input VALIDATOR-OUTBOUND
|
||
|
|
|
||
|
|
! Add system-rule if not already present (pre-flight check #2)
|
||
|
|
system-rule overriding-action redirect
|
||
|
|
|
||
|
|
show session-config diffs
|
||
|
|
commit timer 00:05:00
|
||
|
|
```
|
||
|
|
|
||
|
|
After verification: commit + `write memory`.
|
||
|
|
|
||
|
|
## Host Config: biscayne
|
||
|
|
|
||
|
|
Automated via ansible playbook `playbooks/ashburn-validator-relay.yml`.
|
||
|
|
|
||
|
|
### Manual equivalent
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# 1. Accept packets destined for 137.239.194.65
|
||
|
|
sudo ip addr add 137.239.194.65/32 dev lo
|
||
|
|
|
||
|
|
# 2. Inbound DNAT to kind node (172.20.0.2)
|
||
|
|
sudo iptables -t nat -A PREROUTING -p udp -d 137.239.194.65 --dport 8001 \
|
||
|
|
-j DNAT --to-destination 172.20.0.2:8001
|
||
|
|
sudo iptables -t nat -A PREROUTING -p tcp -d 137.239.194.65 --dport 8001 \
|
||
|
|
-j DNAT --to-destination 172.20.0.2:8001
|
||
|
|
sudo iptables -t nat -A PREROUTING -p udp -d 137.239.194.65 --dport 9000:9025 \
|
||
|
|
-j DNAT --to-destination 172.20.0.2
|
||
|
|
|
||
|
|
# 3. Outbound: mark validator traffic
|
||
|
|
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p udp --sport 8001 \
|
||
|
|
-j MARK --set-mark 100
|
||
|
|
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p udp --sport 9000:9025 \
|
||
|
|
-j MARK --set-mark 100
|
||
|
|
sudo iptables -t mangle -A PREROUTING -s 172.20.0.0/16 -p tcp --sport 8001 \
|
||
|
|
-j MARK --set-mark 100
|
||
|
|
|
||
|
|
# 4. Outbound: SNAT to Ashburn IP (INSERT before Docker MASQUERADE)
|
||
|
|
sudo iptables -t nat -I POSTROUTING 1 -m mark --mark 100 \
|
||
|
|
-j SNAT --to-source 137.239.194.65
|
||
|
|
|
||
|
|
# 5. Policy routing table
|
||
|
|
echo "100 ashburn" | sudo tee -a /etc/iproute2/rt_tables
|
||
|
|
sudo ip rule add fwmark 100 table ashburn
|
||
|
|
sudo ip route add default via 169.254.7.6 dev doublezero0 table ashburn
|
||
|
|
|
||
|
|
# 6. Persist
|
||
|
|
sudo netfilter-persistent save
|
||
|
|
# ip rule + ip route persist via /etc/network/if-up.d/ashburn-routing
|
||
|
|
```
|
||
|
|
|
||
|
|
### Docker NAT port preservation
|
||
|
|
|
||
|
|
**Must verify before going live:** Docker masquerade must preserve source ports
|
||
|
|
for kind's hostNetwork pods. If Docker rewrites the source port, the mangle
|
||
|
|
PREROUTING match on `--sport 8001,9000-9025` will miss traffic.
|
||
|
|
|
||
|
|
Test: `tcpdump -i br-cf46a62ab5b2 -nn 'udp src port 8001'` — if you see
|
||
|
|
packets with sport 8001 from 172.20.0.2, port preservation works.
|
||
|
|
|
||
|
|
If Docker does NOT preserve ports, the mark must be set inside the kind node
|
||
|
|
container (on the pod's veth) rather than on the host.
|
||
|
|
|
||
|
|
## Execution Order
|
||
|
|
|
||
|
|
1. **was-sw01**: checkpoint → config session with 5min auto-revert → verify counters → commit
|
||
|
|
2. **biscayne**: add 137.239.194.65/32 to lo, add inbound DNAT rules
|
||
|
|
3. **Verify inbound**: `ping 137.239.194.65` from external host, check DNAT counters
|
||
|
|
4. **mia-sw01**: pre-flight checks → config session with 5min auto-revert → commit
|
||
|
|
5. **biscayne**: add outbound fwmark + policy routing + SNAT rules
|
||
|
|
6. **Test outbound**: from biscayne, send UDP from port 8001, verify src 137.239.194.65 on was-sw01
|
||
|
|
7. **Verify**: traffic-policy counters on both switches, iptables hit counts on biscayne
|
||
|
|
8. **Restart validator** if needed (gossip should auto-refresh, but restart ensures clean state)
|
||
|
|
9. **was-sw01 + mia-sw01**: `write memory` to persist
|
||
|
|
10. **Cleanup**: remove old SHRED-RELAY and 64.92.84.81:20000 DNAT after stable
|
||
|
|
|
||
|
|
## Verification
|
||
|
|
|
||
|
|
1. `show traffic-policy counters` on was-sw01 — VALIDATOR-RELAY-ACL matches
|
||
|
|
2. `show traffic-policy counters` on mia-sw01 — VALIDATOR-OUTBOUND-ACL matches
|
||
|
|
3. `sudo iptables -t nat -L -v -n` on biscayne — DNAT and SNAT hit counts
|
||
|
|
4. `sudo iptables -t mangle -L -v -n` on biscayne — fwmark hit counts
|
||
|
|
5. `ip rule show` on biscayne — fwmark 100 lookup ashburn
|
||
|
|
6. Validator gossip ContactInfo shows 137.239.194.65 for ALL addresses (gossip, repair, TVU, TPU)
|
||
|
|
7. Repair peer count increases (target: 20+ peers)
|
||
|
|
8. Slot catchup rate improves from ~0.9 toward ~2.5 slots/sec
|
||
|
|
9. `traceroute --sport=8001 <remote_peer>` from biscayne routes via doublezero0/was-sw01
|
||
|
|
|
||
|
|
## Rollback
|
||
|
|
|
||
|
|
### biscayne
|
||
|
|
|
||
|
|
```bash
|
||
|
|
sudo ip addr del 137.239.194.65/32 dev lo
|
||
|
|
sudo iptables -t nat -D PREROUTING -p udp -d 137.239.194.65 --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
|
||
|
|
sudo iptables -t nat -D PREROUTING -p tcp -d 137.239.194.65 --dport 8001 -j DNAT --to-destination 172.20.0.2:8001
|
||
|
|
sudo iptables -t nat -D PREROUTING -p udp -d 137.239.194.65 --dport 9000:9025 -j DNAT --to-destination 172.20.0.2
|
||
|
|
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p udp --sport 8001 -j MARK --set-mark 100
|
||
|
|
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p udp --sport 9000:9025 -j MARK --set-mark 100
|
||
|
|
sudo iptables -t mangle -D PREROUTING -s 172.20.0.0/16 -p tcp --sport 8001 -j MARK --set-mark 100
|
||
|
|
sudo iptables -t nat -D POSTROUTING -m mark --mark 100 -j SNAT --to-source 137.239.194.65
|
||
|
|
sudo ip rule del fwmark 100 table ashburn
|
||
|
|
sudo ip route del default table ashburn
|
||
|
|
sudo netfilter-persistent save
|
||
|
|
```
|
||
|
|
|
||
|
|
### was-sw01
|
||
|
|
|
||
|
|
```
|
||
|
|
rollback running-config checkpoint pre-validator-relay
|
||
|
|
write memory
|
||
|
|
```
|
||
|
|
|
||
|
|
### mia-sw01
|
||
|
|
|
||
|
|
```
|
||
|
|
rollback running-config checkpoint pre-validator-outbound
|
||
|
|
write memory
|
||
|
|
```
|
||
|
|
|
||
|
|
## Key Details
|
||
|
|
|
||
|
|
| Item | Value |
|
||
|
|
|------|-------|
|
||
|
|
| Ashburn relay IP | `137.239.194.65` (Loopback101 on was-sw01) |
|
||
|
|
| Ashburn LAN block | `137.239.194.64/29` on was-sw01 Et1/1 |
|
||
|
|
| Biscayne IP | `186.233.184.235` |
|
||
|
|
| Kind node IP | `172.20.0.2` (Docker bridge br-cf46a62ab5b2) |
|
||
|
|
| Validator ports | 8001 (gossip), 9000-9025 (TVU/repair/TPU) |
|
||
|
|
| Excluded ports | 8899 (RPC), 8900 (WebSocket) — direct to biscayne |
|
||
|
|
| GRE tunnel | doublezero0: 169.254.7.7 ↔ 169.254.7.6, remote 209.42.167.133 |
|
||
|
|
| Backbone | was-sw01 Et4/1 172.16.1.188/31 ↔ mia-sw01 Et4/1 172.16.1.189/31 |
|
||
|
|
| Policy routing table | 100 ashburn |
|
||
|
|
| Fwmark | 100 |
|
||
|
|
| was-sw01 SSH | `install@137.239.200.198` |
|
||
|
|
| EOS version | 4.34.0F |
|