162 lines
4.7 KiB
Markdown
162 lines
4.7 KiB
Markdown
|
|
# TVU Shred Relay — Data-Plane Redirect
|
||
|
|
|
||
|
|
## Overview
|
||
|
|
|
||
|
|
Biscayne's agave validator advertises `64.92.84.81:20000` (laconic-was-sw01 Et1/1) as its TVU
|
||
|
|
address. Turbine shreds arrive as normal UDP to the switch's front-panel IP. The 7280CR3A ASIC
|
||
|
|
handles front-panel traffic without punting to Linux userspace — it sees a local interface IP
|
||
|
|
with no service and drops at the hardware level.
|
||
|
|
|
||
|
|
### Previous approach (monitor + socat)
|
||
|
|
|
||
|
|
EOS monitor session mirrored matched packets to CPU (mirror0 interface). socat read from mirror0
|
||
|
|
and relayed to biscayne. shred-unwrap.py on biscayne stripped encapsulation headers.
|
||
|
|
|
||
|
|
Fragile: socat ran as a foreground process, died on disconnect.
|
||
|
|
|
||
|
|
### New approach (traffic-policy redirect)
|
||
|
|
|
||
|
|
EOS `traffic-policy` with `set nexthop` and `system-rule overriding-action redirect` overrides
|
||
|
|
the ASIC's "local IP, handle myself" decision. The ASIC forwards matched packets to the
|
||
|
|
specified next-hop at line rate. Pure data plane, no CPU involvement, persists in startup-config.
|
||
|
|
|
||
|
|
Available since EOS 4.28.0F on R3 platforms. Confirmed on 4.34.0F.
|
||
|
|
|
||
|
|
## Architecture
|
||
|
|
|
||
|
|
```
|
||
|
|
Turbine peers (hundreds of validators)
|
||
|
|
|
|
||
|
|
v UDP shreds to 64.92.84.81:20000
|
||
|
|
laconic-was-sw01 Et1/1 (Ashburn)
|
||
|
|
| ASIC matches traffic-policy SHRED-RELAY
|
||
|
|
| Redirects to nexthop 172.16.1.189 (data plane, line rate)
|
||
|
|
v Et4/1 backbone (25.4ms)
|
||
|
|
laconic-mia-sw01 Et4/1 (Miami)
|
||
|
|
| forwards via default route (same metro)
|
||
|
|
v 0.13ms
|
||
|
|
biscayne (186.233.184.235, Miami)
|
||
|
|
| iptables DNAT: dst 64.92.84.81:20000 -> 127.0.0.1:9000
|
||
|
|
v
|
||
|
|
agave-validator TVU port (localhost:9000)
|
||
|
|
```
|
||
|
|
|
||
|
|
## Production Config: laconic-was-sw01
|
||
|
|
|
||
|
|
### Pre-change safety
|
||
|
|
|
||
|
|
```
|
||
|
|
configure checkpoint save pre-shred-relay
|
||
|
|
```
|
||
|
|
|
||
|
|
Rollback: `rollback running-config checkpoint pre-shred-relay` then `write memory`.
|
||
|
|
|
||
|
|
### Config session with auto-revert
|
||
|
|
|
||
|
|
```
|
||
|
|
configure session shred-relay
|
||
|
|
|
||
|
|
! ACL for traffic-policy match
|
||
|
|
ip access-list SHRED-RELAY-ACL
|
||
|
|
10 permit udp any any eq 20000
|
||
|
|
|
||
|
|
! Traffic policy: redirect matched packets to backbone next-hop
|
||
|
|
traffic-policy SHRED-RELAY
|
||
|
|
match SHRED-RELAY-ACL
|
||
|
|
set nexthop 172.16.1.189
|
||
|
|
|
||
|
|
! Override ASIC punt-to-CPU for redirected traffic
|
||
|
|
system-rule overriding-action redirect
|
||
|
|
|
||
|
|
! Apply to Et1/1 ingress
|
||
|
|
interface Ethernet1/1
|
||
|
|
traffic-policy input SHRED-RELAY
|
||
|
|
|
||
|
|
! Remove old monitor session and its ACL
|
||
|
|
no monitor session 1
|
||
|
|
no ip access-list SHRED-RELAY
|
||
|
|
|
||
|
|
! Review before committing
|
||
|
|
show session-config diffs
|
||
|
|
|
||
|
|
! Commit with 5-minute auto-revert safety net
|
||
|
|
commit timer 00:05:00
|
||
|
|
```
|
||
|
|
|
||
|
|
After verification: `configure session shred-relay commit` then `write memory`.
|
||
|
|
|
||
|
|
### Linux cleanup on was-sw01
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# Kill socat relay (PID 27743)
|
||
|
|
kill 27743
|
||
|
|
# Remove Linux kernel route
|
||
|
|
ip route del 186.233.184.235/32
|
||
|
|
```
|
||
|
|
|
||
|
|
The EOS static route `ip route 186.233.184.235/32 172.16.1.189` stays (general reachability).
|
||
|
|
|
||
|
|
## Production Config: biscayne
|
||
|
|
|
||
|
|
### iptables DNAT
|
||
|
|
|
||
|
|
Traffic-policy sends normal L3-forwarded UDP packets (no mirror encapsulation). Packets arrive
|
||
|
|
with dst `64.92.84.81:20000` containing clean shred payloads directly in the UDP body.
|
||
|
|
|
||
|
|
```bash
|
||
|
|
sudo iptables -t nat -A PREROUTING -p udp -d 64.92.84.81 --dport 20000 \
|
||
|
|
-j DNAT --to-destination 127.0.0.1:9000
|
||
|
|
|
||
|
|
# Persist across reboot
|
||
|
|
sudo apt install -y iptables-persistent
|
||
|
|
sudo netfilter-persistent save
|
||
|
|
```
|
||
|
|
|
||
|
|
### Cleanup
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# Kill shred-unwrap.py (PID 2497694)
|
||
|
|
kill 2497694
|
||
|
|
rm /tmp/shred-unwrap.py
|
||
|
|
```
|
||
|
|
|
||
|
|
## Verification
|
||
|
|
|
||
|
|
1. `show traffic-policy interface Ethernet1/1` — policy applied
|
||
|
|
2. `show traffic-policy counters` — packets matching and redirected
|
||
|
|
3. `sudo iptables -t nat -L PREROUTING -v -n` — DNAT rule with packet counts
|
||
|
|
4. Validator logs: slot replay rate should maintain ~3.3 slots/sec
|
||
|
|
5. `ss -unp | grep 9000` — validator receiving on TVU port
|
||
|
|
|
||
|
|
## What was removed
|
||
|
|
|
||
|
|
| Component | Host |
|
||
|
|
|-----------|------|
|
||
|
|
| monitor session 1 | was-sw01 |
|
||
|
|
| SHRED-RELAY ACL (old) | was-sw01 |
|
||
|
|
| socat relay process | was-sw01 |
|
||
|
|
| Linux kernel static route | was-sw01 |
|
||
|
|
| shred-unwrap.py | biscayne |
|
||
|
|
|
||
|
|
## What was added
|
||
|
|
|
||
|
|
| Component | Host | Persistent? |
|
||
|
|
|-----------|------|-------------|
|
||
|
|
| traffic-policy SHRED-RELAY | was-sw01 | Yes (startup-config) |
|
||
|
|
| SHRED-RELAY-ACL | was-sw01 | Yes (startup-config) |
|
||
|
|
| system-rule overriding-action redirect | was-sw01 | Yes (startup-config) |
|
||
|
|
| iptables DNAT rule | biscayne | Yes (iptables-persistent) |
|
||
|
|
|
||
|
|
## Key Details
|
||
|
|
|
||
|
|
| Item | Value |
|
||
|
|
|------|-------|
|
||
|
|
| Biscayne validator identity | `4WeLUxfQghbhsLEuwaAzjZiHg2VBw87vqHc4iZrGvKPr` |
|
||
|
|
| Biscayne IP | `186.233.184.235` |
|
||
|
|
| laconic-was-sw01 public IP | `64.92.84.81` (Et1/1) |
|
||
|
|
| laconic-was-sw01 backbone IP | `172.16.1.188` (Et4/1) |
|
||
|
|
| laconic-was-sw01 SSH | `install@137.239.200.198` |
|
||
|
|
| laconic-mia-sw01 backbone IP | `172.16.1.189` (Et4/1) |
|
||
|
|
| Backbone RTT (WAS-MIA) | 25.4ms |
|
||
|
|
| EOS version | 4.34.0F |
|