feat: dedicated GRE tunnel (Tunnel100) bypassing DZ-managed Tunnel500

Root cause: the doublezero-agent on mia-sw01 manages Tunnel500's ACL
(SEC-USER-500-IN) and drops outbound gossip with src 137.239.194.65.
The agent overwrites any custom ACL entries.

Fix: create a separate GRE tunnel (Tunnel100) using mia-sw01's free
LAN IP (209.42.167.137) as tunnel source. This tunnel goes over the
ISP uplink, completely independent of the DZ overlay:
- mia-sw01: Tunnel100 src 209.42.167.137, dst 186.233.184.235
- biscayne: gre-ashburn src 186.233.184.235, dst 209.42.167.137
- Link addresses: 169.254.100.0/31

Playbook changes:
- ashburn-relay-mia-sw01: Tunnel100 + Loopback101 + SEC-VALIDATOR-100-IN
- ashburn-relay-biscayne: gre-ashburn tunnel + updated policy routing
- New template: ashburn-routing-ifup.sh.j2 for boot persistence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix/kind-mount-propagation
A. F. Dudley 2026-03-07 01:47:58 +00:00
parent 0b52fc99d7
commit 742e84e3b0
4 changed files with 261 additions and 158 deletions

View File

@@ -1,85 +1,61 @@
# Bug: Ashburn Relay — Outbound Gossip Dropped by DZ Agent ACL

## Summary

`--gossip-host 137.239.194.65` correctly advertises the Ashburn relay IP in
ContactInfo for all sockets (gossip, TVU, repair, TPU). The inbound path
works end-to-end (proven with kelce UDP tests through every hop). However,
outbound gossip from biscayne (src 137.239.194.65) is dropped by the
DoubleZero agent's ACL on mia-sw01's Tunnel500, preventing ContactInfo from
propagating to the cluster. Peers never learn our TVU address.

## Evidence

- Inbound path confirmed hop by hop (kelce → was-sw01 → mia-sw01 → Tunnel500
  → biscayne doublezero0 → DNAT → kind bridge → kind node eth0):

  ```
  01:04:12.136633 IP 69.112.108.72.58856 > 172.20.0.2.9000: UDP, length 13
  ```

- Outbound gossip leaves biscayne correctly (src 137.239.194.65:8001 on
  doublezero0), enters mia-sw01 via Tunnel500, and hits the SEC-USER-500-IN ACL:

  ```
  60 deny ip any any [match 26355968 packets, 0:00:02 ago]
  ```

  The ACL only permits src 186.233.184.235 and 169.254.7.7 — not 137.239.194.65.

- Validator not visible in public RPC getClusterNodes (gossip not propagating)
- Validator sees 775 nodes vs 5,045 on public RPC

## Root Cause

The `doublezero-agent` daemon on mia-sw01 manages Tunnel500 and its ACL
(SEC-USER-500-IN). The agent periodically reconciles the ACL to its expected
state, overwriting any custom entries we add. We cannot modify the ACL
without the agent reverting it.

137.239.194.65 is from the was-sw01 LAN block (137.239.194.64/29), routed
by the ISP to was-sw01 via the WAN link. It IS publicly routable (confirmed
by kelce ping/UDP tests). The earlier hypothesis that it was unroutable was
wrong — the IP reaches was-sw01, gets forwarded to mia-sw01 via the backbone,
and reaches biscayne through Tunnel500 (the inbound ACL direction is fine).
The problem is outbound only: the Tunnel500 ingress ACL (traffic FROM
biscayne TO mia-sw01) drops src 137.239.194.65.

## Fix

Create a dedicated GRE tunnel (Tunnel100) between biscayne and mia-sw01
that bypasses the DZ-managed Tunnel500 entirely:

- **mia-sw01 Tunnel100**: src 209.42.167.137 (free LAN IP), dst 186.233.184.235
  (biscayne), link 169.254.100.0/31, ACL SEC-VALIDATOR-100-IN (we control it)
- **biscayne gre-ashburn**: src 186.233.184.235, dst 209.42.167.137,
  link 169.254.100.1/31

Traffic flow is unchanged except for the tunnel:

- Inbound: was-sw01 → backbone → mia-sw01 → Tunnel100 → biscayne → DNAT → agave
- Outbound: agave → SNAT 137.239.194.65 → Tunnel100 → mia-sw01 → backbone → was-sw01

See:

- `playbooks/ashburn-relay-mia-sw01.yml` (Tunnel100 + ACL + routes)
- `playbooks/ashburn-relay-biscayne.yml` (gre-ashburn + DNAT + SNAT + policy routing)
- `playbooks/ashburn-relay-was-sw01.yml` (static route, unchanged)
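The /31 link addressing above follows RFC 3021 point-to-point numbering: the two tunnel endpoints are the even and odd addresses of a single /31, so each end's peer is found by flipping the low bit of the last octet. A quick illustration (the `peer_of` helper is hypothetical, not part of the playbooks):

```shell
#!/usr/bin/env bash
# Illustrative helper: compute the peer address on an RFC 3021 /31 link
# by flipping the low bit of the last octet.
peer_of() {
    local ip=$1
    local last=${ip##*.}            # last octet
    echo "${ip%.*}.$(( last ^ 1 ))" # same prefix, low bit flipped
}

peer_of 169.254.100.0   # prints 169.254.100.1 (biscayne end)
peer_of 169.254.100.1   # prints 169.254.100.0 (mia-sw01 end)
```

This is why the playbooks below can hardcode `tunnel_local_ip`/`tunnel_remote_ip` as an even/odd pair with no separate network or broadcast address.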

View File

@@ -2,7 +2,12 @@
# Configure biscayne for Ashburn validator relay
#
# Sets up inbound DNAT (137.239.194.65 → kind node) and outbound SNAT +
# policy routing (validator traffic → GRE tunnel → mia-sw01 → was-sw01).
#
# Uses a dedicated GRE tunnel to mia-sw01 (NOT the DoubleZero-managed
# doublezero0/Tunnel500). The tunnel source is biscayne's public IP
# (186.233.184.235) and the destination is mia-sw01's free LAN IP
# (209.42.167.137).
#
# Usage:
#   # Full setup (inbound + outbound)
@@ -28,8 +33,12 @@
    ashburn_ip: 137.239.194.65
    kind_node_ip: 172.20.0.2
    kind_network: 172.20.0.0/16
    # New dedicated GRE tunnel (not DZ-managed doublezero0)
    tunnel_device: gre-ashburn
    tunnel_local_ip: 169.254.100.1   # biscayne end of /31
    tunnel_remote_ip: 169.254.100.0  # mia-sw01 end of /31
    tunnel_src: 186.233.184.235      # biscayne public IP
    tunnel_dst: 209.42.167.137       # mia-sw01 free LAN IP
    fwmark: 100
    rt_table_name: ashburn
    rt_table_id: 100
@@ -49,6 +58,15 @@
      ansible.builtin.command:
        cmd: ip addr del {{ ashburn_ip }}/32 dev lo
      failed_when: false
      changed_when: false

    - name: Remove GRE tunnel
      ansible.builtin.shell:
        cmd: |
          ip link set {{ tunnel_device }} down 2>/dev/null || true
          ip tunnel del {{ tunnel_device }} 2>/dev/null || true
        executable: /bin/bash
      changed_when: false

    - name: Remove inbound DNAT rules
      ansible.builtin.shell:
@@ -58,6 +76,7 @@
          iptables -t nat -D PREROUTING -p tcp -d {{ ashburn_ip }} --dport {{ gossip_port }} -j DNAT --to-destination {{ kind_node_ip }}:{{ gossip_port }} 2>/dev/null || true
          iptables -t nat -D PREROUTING -p udp -d {{ ashburn_ip }} --dport {{ dynamic_port_range_start }}:{{ dynamic_port_range_end }} -j DNAT --to-destination {{ kind_node_ip }} 2>/dev/null || true
        executable: /bin/bash
      changed_when: false

    - name: Remove outbound mangle rules
      ansible.builtin.shell:
@@ -67,11 +86,13 @@
          iptables -t mangle -D PREROUTING -s {{ kind_network }} -p udp --sport {{ dynamic_port_range_start }}:{{ dynamic_port_range_end }} -j MARK --set-mark {{ fwmark }} 2>/dev/null || true
          iptables -t mangle -D PREROUTING -s {{ kind_network }} -p tcp --sport {{ gossip_port }} -j MARK --set-mark {{ fwmark }} 2>/dev/null || true
        executable: /bin/bash
      changed_when: false

    - name: Remove outbound SNAT rule
      ansible.builtin.shell:
        cmd: iptables -t nat -D POSTROUTING -m mark --mark {{ fwmark }} -j SNAT --to-source {{ ashburn_ip }} 2>/dev/null || true
        executable: /bin/bash
      changed_when: false

    - name: Remove policy routing
      ansible.builtin.shell:
@@ -79,10 +100,12 @@
          ip rule del fwmark {{ fwmark }} table {{ rt_table_name }} 2>/dev/null || true
          ip route del default table {{ rt_table_name }} 2>/dev/null || true
        executable: /bin/bash
      changed_when: false

    - name: Persist cleaned iptables
      ansible.builtin.command:
        cmd: netfilter-persistent save
      changed_when: true

    - name: Remove if-up.d script
      ansible.builtin.file:
@@ -91,7 +114,7 @@
    - name: Rollback complete
      ansible.builtin.debug:
        msg: "Ashburn relay rules removed."

    - name: End play after rollback
      ansible.builtin.meta: end_play
@@ -99,13 +122,13 @@
    # ------------------------------------------------------------------
    # Pre-flight checks
    # ------------------------------------------------------------------
    - name: Check tunnel destination is reachable
      ansible.builtin.command:
        cmd: ping -c 1 -W 2 {{ tunnel_dst }}
      register: tunnel_dst_ping
      changed_when: false
      failed_when: tunnel_dst_ping.rc != 0
      tags: [preflight, outbound]

    - name: Check kind node is reachable
      ansible.builtin.command:
@@ -115,23 +138,6 @@
      failed_when: kind_ping.rc != 0
      tags: [preflight, inbound]

    - name: Show existing iptables nat rules
      ansible.builtin.shell:
        cmd: iptables -t nat -L -v -n --line-numbers | head -60
@@ -145,6 +151,44 @@
        var: existing_nat.stdout_lines
      tags: [preflight]

    - name: Check for existing GRE tunnel
      ansible.builtin.shell:
        cmd: ip tunnel show {{ tunnel_device }} 2>&1 || echo "tunnel does not exist"
        executable: /bin/bash
      register: existing_tunnel
      changed_when: false
      tags: [preflight]

    - name: Display existing tunnel
      ansible.builtin.debug:
        var: existing_tunnel.stdout_lines
      tags: [preflight]

    # ------------------------------------------------------------------
    # GRE tunnel setup
    # ------------------------------------------------------------------
    - name: Create GRE tunnel
      ansible.builtin.shell:
        cmd: |
          set -o pipefail
          if ip tunnel show {{ tunnel_device }} 2>/dev/null; then
            echo "tunnel already exists"
          else
            ip tunnel add {{ tunnel_device }} mode gre local {{ tunnel_src }} remote {{ tunnel_dst }} ttl 64
            ip addr add {{ tunnel_local_ip }}/31 dev {{ tunnel_device }}
            ip link set {{ tunnel_device }} up mtu 8972
            echo "tunnel created"
          fi
        executable: /bin/bash
      register: tunnel_result
      changed_when: "'created' in tunnel_result.stdout"
      tags: [outbound]

    - name: Show tunnel result
      ansible.builtin.debug:
        var: tunnel_result.stdout_lines
      tags: [outbound]

    # ------------------------------------------------------------------
    # Inbound: DNAT for 137.239.194.65 → kind node
    # ------------------------------------------------------------------
@@ -186,7 +230,7 @@
      tags: [inbound]

    # ------------------------------------------------------------------
    # Outbound: fwmark + SNAT + policy routing via new tunnel
    # ------------------------------------------------------------------
    - name: Mark outbound validator traffic (mangle PREROUTING)
      ansible.builtin.shell:
@@ -218,7 +262,6 @@
      ansible.builtin.shell:
        cmd: |
          set -o pipefail
          if iptables -t nat -C POSTROUTING -m mark --mark {{ fwmark }} -j SNAT --to-source {{ ashburn_ip }} 2>/dev/null; then
            echo "SNAT rule already exists"
          else
@@ -256,9 +299,9 @@
      changed_when: "'added' in rule_result.stdout"
      tags: [outbound]

    - name: Add default route via GRE tunnel in ashburn table
      ansible.builtin.shell:
        cmd: ip route replace default via {{ tunnel_remote_ip }} dev {{ tunnel_device }} table {{ rt_table_name }}
        executable: /bin/bash
      changed_when: true
      tags: [outbound]
@@ -269,11 +312,12 @@
    - name: Save iptables rules
      ansible.builtin.command:
        cmd: netfilter-persistent save
      changed_when: true
      tags: [inbound, outbound]

    - name: Install if-up.d persistence script
      ansible.builtin.template:
        src: files/ashburn-routing-ifup.sh.j2
        dest: /etc/network/if-up.d/ashburn-routing
        mode: '0755'
        owner: root
@@ -283,6 +327,22 @@
    # ------------------------------------------------------------------
    # Verification
    # ------------------------------------------------------------------
    - name: Show tunnel status
      ansible.builtin.shell:
        cmd: |
          echo "=== tunnel ==="
          ip tunnel show {{ tunnel_device }}
          echo ""
          echo "=== tunnel addr ==="
          ip addr show {{ tunnel_device }}
          echo ""
          echo "=== ping tunnel peer ==="
          ping -c 1 -W 2 {{ tunnel_remote_ip }} 2>&1 || echo "tunnel peer unreachable"
        executable: /bin/bash
      register: tunnel_status
      changed_when: false
      tags: [outbound]

    - name: Show NAT rules
      ansible.builtin.shell:
        cmd: iptables -t nat -L -v -n --line-numbers 2>&1 | head -40
@@ -323,6 +383,7 @@
    - name: Display verification
      ansible.builtin.debug:
        msg:
          tunnel: "{{ tunnel_status.stdout_lines | default([]) }}"
          nat_rules: "{{ nat_rules.stdout_lines }}"
          mangle_rules: "{{ mangle_rules.stdout_lines | default([]) }}"
          routing: "{{ routing_info.stdout_lines | default([]) }}"
@@ -334,12 +395,14 @@
        msg: |
          === Ashburn Relay Setup Complete ===
          Ashburn IP:    {{ ashburn_ip }} (on lo)
          GRE tunnel:    {{ tunnel_device }} ({{ tunnel_src }} → {{ tunnel_dst }})
                         link: {{ tunnel_local_ip }}/31 ↔ {{ tunnel_remote_ip }}/31
          Inbound DNAT:  {{ ashburn_ip }}:8001,9000-9025 → {{ kind_node_ip }}
          Outbound SNAT: {{ kind_network }} sport 8001,9000-9025 → {{ ashburn_ip }}
          Policy route:  fwmark {{ fwmark }} → table {{ rt_table_name }} → via {{ tunnel_remote_ip }} dev {{ tunnel_device }}
          Persisted:     iptables-persistent + /etc/network/if-up.d/ashburn-routing

          Next steps:
          1. Apply mia-sw01 config (Tunnel100 must be up on both sides)
          2. Verify tunnel: ping {{ tunnel_remote_ip }}
          3. Test from kelce: echo test | nc -u -w 1 137.239.194.65 9000
          4. Check validator gossip ContactInfo shows {{ ashburn_ip }} for all addresses
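One detail worth remembering when verifying the fwmark-based policy routing this playbook installs: `ip rule` prints marks in hexadecimal, so a rule added with mark 100 shows up as `fwmark 0x64`. A quick sanity check of that mapping:

```shell
# iptables accepts the mark in decimal (--set-mark 100), but `ip rule show`
# renders it in hex. Confirm 100 decimal is 0x64:
fwmark=100
printf 'fwmark 0x%x lookup ashburn\n' "$fwmark"   # prints: fwmark 0x64 lookup ashburn
```

Grepping `ip rule` output for `fwmark 100` will therefore never match; match on `0x64` or on the table name instead.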

View File

@@ -1,22 +1,18 @@
---
# Configure laconic-mia-sw01 for validator traffic relay via dedicated GRE tunnel
#
# Creates a NEW GRE tunnel (Tunnel100) separate from the DoubleZero-managed
# Tunnel500. The DZ agent controls Tunnel500's ACL (SEC-USER-500-IN) and
# overwrites any custom entries, so we cannot use it for validator traffic
# with src 137.239.194.65.
#
# Tunnel100 uses mia-sw01's free LAN IP (209.42.167.137) as the tunnel
# source, and biscayne's public IP (186.233.184.235) as the destination.
# This tunnel carries traffic over the ISP uplink, completely independent
# of the DoubleZero overlay.
#
# Inbound:  was-sw01 → backbone Et4/1 → mia-sw01 → Tunnel100 → biscayne
# Outbound: biscayne → Tunnel100 → mia-sw01 → backbone Et4/1 → was-sw01
#
# Usage:
#   # Pre-flight checks only (safe, read-only)
@@ -32,22 +28,28 @@
#   # Rollback
#   # ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-mia-sw01.yml -e rollback=true

- name: Configure mia-sw01 validator relay tunnel
  hosts: mia-sw01
  gather_facts: false
  vars:
    ashburn_ip: 137.239.194.65
    biscayne_ip: 186.233.184.235
    apply: false
    commit: false
    rollback: false
    # New tunnel — not managed by DZ agent
    tunnel_interface: Tunnel100
    tunnel_source_ip: 209.42.167.137  # mia-sw01 free LAN IP
    tunnel_local: 169.254.100.0       # /31 link, mia-sw01 side
    tunnel_remote: 169.254.100.1      # /31 link, biscayne side
    tunnel_acl: SEC-VALIDATOR-100-IN
    # Loopback for tunnel source (so it's always up)
    tunnel_source_lo: Loopback101
    backbone_interface: Ethernet4/1
    backbone_peer: 172.16.1.188       # was-sw01 backbone IP
    session_name: validator-tunnel
    checkpoint_name: pre-validator-tunnel

  tasks:
    # ------------------------------------------------------------------
@@ -93,43 +95,52 @@
    # ------------------------------------------------------------------
    # Pre-flight checks (always run unless commit/rollback)
    # ------------------------------------------------------------------
    - name: Check existing tunnel interfaces
      arista.eos.eos_command:
        commands:
          - show ip interface brief | include Tunnel
      register: existing_tunnels
      tags: [preflight]

    - name: Display existing tunnels
      ansible.builtin.debug:
        var: existing_tunnels.stdout_lines
      tags: [preflight]

    - name: Check if Tunnel100 already exists
      arista.eos.eos_command:
        commands:
          - "show running-config interfaces {{ tunnel_interface }}"
      register: tunnel_config
      tags: [preflight]

    - name: Display Tunnel100 config
      ansible.builtin.debug:
        var: tunnel_config.stdout_lines
      tags: [preflight]

    - name: Check if Loopback101 already exists
      arista.eos.eos_command:
        commands:
          - "show running-config interfaces {{ tunnel_source_lo }}"
      register: lo_config
      tags: [preflight]

    - name: Display Loopback101 config
      ansible.builtin.debug:
        var: lo_config.stdout_lines
      tags: [preflight]

    - name: Check route for ashburn IP
      arista.eos.eos_command:
        commands:
          - "show ip route {{ backbone_peer }}"
          - "show ip route {{ ashburn_ip }}"
      register: route_check
      tags: [preflight]

    - name: Display route check
      ansible.builtin.debug:
        var: route_check.stdout_lines
      tags: [preflight]

    - name: Pre-flight summary
@@ -138,9 +149,17 @@
        msg: |
          === Pre-flight complete ===
          Review the output above:
          1. Does {{ tunnel_interface }} already exist?
          2. Does {{ tunnel_source_lo }} already exist?
          3. Current route for {{ ashburn_ip }}

          Planned config:
          - {{ tunnel_source_lo }}: {{ tunnel_source_ip }}/32
          - {{ tunnel_interface }}: GRE src {{ tunnel_source_ip }} dst {{ biscayne_ip }}
            link address {{ tunnel_local }}/31
            ACL {{ tunnel_acl }}: permit src {{ ashburn_ip }}, permit src {{ tunnel_remote }}
          - Route: {{ ashburn_ip }}/32 via {{ tunnel_remote }}
          - Outbound default for tunnel traffic: 0.0.0.0/0 via {{ backbone_interface }} {{ backbone_peer }}

          To apply config:
            ansible-playbook -i inventory/switches.yml playbooks/ashburn-relay-mia-sw01.yml \
@@ -163,18 +182,33 @@
      arista.eos.eos_command:
        commands:
          - command: "configure session {{ session_name }}"
          # Loopback for tunnel source (always-up interface)
          - command: "interface {{ tunnel_source_lo }}"
          - command: "ip address {{ tunnel_source_ip }}/32"
          - command: exit
          # ACL for the new tunnel — we control this, DZ agent won't touch it
          - command: "ip access-list {{ tunnel_acl }}"
          - command: "counters per-entry"
          - command: "10 permit icmp host {{ tunnel_remote }} any"
          - command: "20 permit ip host {{ ashburn_ip }} any"
          - command: "30 permit ip host {{ tunnel_remote }} any"
          - command: "100 deny ip any any"
          - command: exit
          # New GRE tunnel
          - command: "interface {{ tunnel_interface }}"
          - command: "mtu 9216"
          - command: "ip address {{ tunnel_local }}/31"
          - command: "ip access-group {{ tunnel_acl }} in"
          - command: "tunnel mode gre"
          - command: "tunnel source {{ tunnel_source_ip }}"
          - command: "tunnel destination {{ biscayne_ip }}"
          - command: exit
          # Inbound: route ashburn IP to biscayne via the new tunnel
          - command: "ip route {{ ashburn_ip }}/32 {{ tunnel_remote }}"
          # Outbound: biscayne's traffic exits via backbone to was-sw01.
          # Use a specific route for the backbone peer so tunnel traffic
          # can reach was-sw01 without a blanket default route.
          # (The switch's actual default route is via Et1/1 ISP uplink.)

    - name: Show session diff
      arista.eos.eos_command:
@@ -199,9 +233,11 @@
    - name: Verify config
      arista.eos.eos_command:
        commands:
          - "show running-config interfaces {{ tunnel_source_lo }}"
          - "show running-config interfaces {{ tunnel_interface }}"
          - "show ip access-lists {{ tunnel_acl }}"
          - "show ip route {{ ashburn_ip }}"
          - "show interfaces {{ tunnel_interface }} status"
      register: verify

    - name: Display verification
@@ -216,14 +252,14 @@
          Checkpoint: {{ checkpoint_name }}

          Changes applied:
          1. {{ tunnel_source_lo }}: {{ tunnel_source_ip }}/32
          2. {{ tunnel_interface }}: GRE tunnel to {{ biscayne_ip }}
             link {{ tunnel_local }}/31, ACL {{ tunnel_acl }}
          3. Route: {{ ashburn_ip }}/32 via {{ tunnel_remote }}

          The config will auto-revert in 5 minutes unless committed.
          Verify on the switch, then commit:
            ansible-playbook ... -e commit=true

          To revert immediately:
            ansible-playbook ... -e rollback=true
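The two tunnel ends deliberately set different MTUs: 9216 on the switch's Tunnel100, 8972 on biscayne's gre-ashburn. GRE over IPv4 adds 24 bytes of encapsulation (a 20-byte outer IPv4 header plus a 4-byte GRE header; 4 more if a GRE key were configured, which it is not here). Assuming a 9000-byte jumbo underlay between the endpoints (an assumption; the actual path MTU is not stated in these playbooks), the arithmetic gives the inner ceiling:

```shell
# GRE-over-IPv4 encapsulation overhead, no key/checksum options:
outer_ipv4=20
gre_header=4
underlay_mtu=9000   # assumed jumbo path; not confirmed by the playbooks

echo $(( underlay_mtu - outer_ipv4 - gre_header ))   # prints 8976
```

Biscayne's 8972 sits 4 bytes under that ceiling, which is safe; the requirement is only that the tunnel MTU stay at or below underlay MTU minus overhead so encapsulated packets are not fragmented in flight.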

View File

@@ -0,0 +1,28 @@
#!/bin/bash
# /etc/network/if-up.d/ashburn-routing
# Restore GRE tunnel and policy routing for Ashburn validator relay
# after reboot or interface up. Acts on eno1 (public interface) since
# the GRE tunnel depends on it.

[ "$IFACE" = "eno1" ] || exit 0

# Create GRE tunnel if it doesn't exist
if ! ip tunnel show {{ tunnel_device }} >/dev/null 2>&1; then
    ip tunnel add {{ tunnel_device }} mode gre local {{ tunnel_src }} remote {{ tunnel_dst }} ttl 64
    ip addr add {{ tunnel_local_ip }}/31 dev {{ tunnel_device }}
    ip link set {{ tunnel_device }} up mtu 8972
fi

# Ensure rt_tables entry exists
grep -q '^{{ rt_table_id }} {{ rt_table_name }}$' /etc/iproute2/rt_tables || \
    echo "{{ rt_table_id }} {{ rt_table_name }}" >> /etc/iproute2/rt_tables

# Add policy rule (note: ip rule prints the mark in hex, so 100 shows as 0x64)
ip rule show | grep -q 'fwmark 0x64 lookup {{ rt_table_name }}' || \
    ip rule add fwmark {{ fwmark }} table {{ rt_table_name }}

# Add default route via mia-sw01 through GRE tunnel
ip route replace default via {{ tunnel_remote_ip }} dev {{ tunnel_device }} table {{ rt_table_name }}

# Add Ashburn IP to loopback
ip addr show lo | grep -q '{{ ashburn_ip }}' || ip addr add {{ ashburn_ip }}/32 dev lo
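The `grep || add` guards in the script above are what make it safe to run on every if-up event: each command first checks whether its state already exists and only acts when it does not. The pattern in isolation, with a temp file standing in for `/etc/iproute2/rt_tables`:

```shell
#!/usr/bin/env bash
# Demonstrate the guard-then-append idempotence pattern used by the
# if-up script: three runs, but the line is appended only once.
tmp=$(mktemp)
for run in 1 2 3; do
    grep -q '^100 ashburn$' "$tmp" || echo '100 ashburn' >> "$tmp"
done
wc -l < "$tmp"   # prints 1
rm -f "$tmp"
```

The same reasoning covers `ip tunnel show` (skip creation if the device exists) and `ip route replace` (inherently idempotent, so it needs no guard at all).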