stack-orchestrator/docs/ashburn-relay-checklist.md

5.4 KiB

Ashburn Relay / ip_echo Port Reachability Checklist

The validator exits when it can't verify UDP ports (8001, 9000, 9002, 9003) are reachable from entrypoint servers. The ip_echo protocol: validator TCP-connects to entrypoint on port 8001, entrypoint sees source IP, sends UDP probes back to that IP on the validator's ports. If probes don't arrive, validator crashes.

Layer 1: Biscayne outbound path

Validator's outbound ip_echo TCP (dport 8001) must exit via GRE tunnel so entrypoints see 137.239.194.65, not biscayne's real IP via Docker MASQUERADE.

[ ] 1.1  Mangle rules (4 rules in mangle PREROUTING):
         - udp sport 8001        (gossip outbound)
         - udp sport 9000:9025   (TVU/repair outbound)
         - tcp sport 8001        (gossip TCP outbound)
         - tcp dport 8001        (ip_echo outbound — THE CRITICAL ONE)

[ ] 1.2  SNAT rule at position 1 (before Docker MASQUERADE):
         POSTROUTING -m mark --mark 100 -j SNAT --to-source 137.239.194.65

[ ] 1.3  Policy routing rule:
         fwmark 0x64 lookup ashburn

[ ] 1.4  Ashburn routing table default route:
         default via 169.254.100.0 dev gre-ashburn

[ ] 1.5  Mangle counters incrementing (pkts/bytes on tcp dport 8001 rule)

Layer 2: GRE tunnel (biscayne ↔ mia-sw01)

[ ] 2.1  Tunnel exists and UP:
         gre-ashburn with 169.254.100.1/31

[ ] 2.2  Tunnel peer reachable:
         ping 169.254.100.0

[ ] 2.3  Ashburn IP on loopback:
         137.239.194.65/32 dev lo

Layer 3: Biscayne inbound path (DNAT + DOCKER-USER)

Entrypoint UDP probes arrive at 137.239.194.65 and must reach kind node 172.20.0.2.

[ ] 3.1  DNAT rules at position 1 in nat PREROUTING
         (before Docker's ADDRTYPE LOCAL rule):
         - udp dport 8001       → 172.20.0.2:8001
         - tcp dport 8001       → 172.20.0.2:8001
         - udp dport 9000:9025  → 172.20.0.2

[ ] 3.2  DOCKER-USER ACCEPT rules (3 rules):
         - udp dport 8001       → ACCEPT
         - tcp dport 8001       → ACCEPT
         - udp dport 9000:9025  → ACCEPT

[ ] 3.3  DNAT counters incrementing

Layer 4: mia-sw01

[ ] 4.1  Tunnel100 UP in VRF relay
         src 209.42.167.137, dst 186.233.184.235, link 169.254.100.0/31

[ ] 4.2  VRF relay default route:
         0.0.0.0/0 egress-vrf default 172.16.1.188

[ ] 4.3  Default VRF route to relay IP:
         137.239.194.65/32 egress-vrf relay 169.254.100.1

[ ] 4.4  ACL SEC-VALIDATOR-100-IN permits all needed traffic

[ ] 4.5  Backbone Et4/1 UP (172.16.1.189/31)

Layer 5: was-sw01

[ ] 5.1  Static route: 137.239.194.65/32 via 172.16.1.189

[ ] 5.2  Backbone Et4/1 UP (172.16.1.188/31)

[ ] 5.3  No Loopback101 (removed to avoid absorbing traffic locally)

Layer 6: Persistence

[ ] 6.1  ashburn-relay.service enabled and active (runs After=docker.service)

[ ] 6.2  /usr/local/sbin/ashburn-relay-setup.sh exists

Layer 7: End-to-end tests

All tests run via Ansible playbooks. The test scripts in scripts/ are utilities invoked by the playbooks — never run them manually via SSH.

[ ] 7.1  relay-test-tcp-dport.py (via ashburn-relay-check.yml or ad-hoc play)
         Tests: outbound tcp dport 8001 mangle → SNAT → tunnel
         Pass:  entrypoint sees 137.239.194.65
         Fail:  entrypoint sees 186.233.184.235 (Docker MASQUERADE)

[ ] 7.2  relay-test-ip-echo.py (via ashburn-relay-check.yml or ad-hoc play)
         Tests: FULL END-TO-END (outbound SNAT + inbound DNAT + DOCKER-USER)
         Pass:  UDP probe received from entrypoint
         Fail:  no UDP probes — inbound path broken

[ ] 7.3  relay-inbound-udp-test.yml (cross-inventory: biscayne + kelce)
         Tests: inbound UDP from external host → DNAT → kind node
         Pass:  UDP arrives in kind netns

Playbooks

# Read-only check of all relay state (biscayne + both switches):
ansible-playbook -i inventory-switches/switches.yml \
  -i inventory/biscayne.yml playbooks/ashburn-relay-check.yml

# Apply all biscayne relay rules (idempotent):
ansible-playbook -i inventory/biscayne.yml playbooks/ashburn-relay-biscayne.yml

# Apply outbound only (the ip_echo fix):
ansible-playbook -i inventory/biscayne.yml \
  playbooks/ashburn-relay-biscayne.yml -t outbound

# Apply inbound only (DNAT + DOCKER-USER):
ansible-playbook -i inventory/biscayne.yml \
  playbooks/ashburn-relay-biscayne.yml -t inbound

# Apply mia-sw01 config:
ansible-playbook -i inventory-switches/switches.yml \
  playbooks/ashburn-relay-mia-sw01.yml

# Apply was-sw01 config:
ansible-playbook -i inventory-switches/switches.yml \
  playbooks/ashburn-relay-was-sw01.yml

# Cross-inventory inbound UDP test (biscayne + kelce):
ansible-playbook -i inventory/biscayne.yml -i inventory/kelce.yml \
  playbooks/relay-inbound-udp-test.yml

Historical root causes

  1. TCP dport 8001 mangle rule missing — ip_echo TCP exits via Docker MASQUERADE, entrypoint sees wrong IP, UDP probes go to wrong address.

  2. DOCKER-USER ACCEPT rules missing — DNAT'd traffic hits Docker's FORWARD DROP policy, never reaches kind node.

  3. DNAT rule position wrong — Docker's ADDRTYPE LOCAL rule in PREROUTING catches traffic to loopback IPs before our DNAT rules. Must use -I PREROUTING 1.

  4. mia-sw01 egress-vrf route with interface specified — silently fails in EOS (accepted in config, never installed in RIB). Must use nexthop-only form.

  5. was-sw01 Loopback101 absorbing traffic — local delivery instead of forwarding to mia-sw01 via backbone.