fix: switch ramdisk from /dev/ram0 to tmpfs, refactor snapshot-download.py

The /dev/ram0 + XFS + format-ramdisk.service approach was unnecessary
complexity left over from a confusion during migration — there was no
actual tmpfs bug with io_uring. tmpfs is simpler (no format-on-boot),
resizable on the fly, and what every other Solana operator uses.

Changes:
- prepare-agave: remove format-ramdisk.service and ramdisk-accounts.service,
  use tmpfs fstab entry with size=1024G (was 600G /dev/ram0, too small)
- recover: remove ramdisk_device var (no longer needed)
- redeploy: wipe accounts by rm -rf instead of umount+mkfs
- snapshot-download.py: extract download_best_snapshot() public API for
  use by the new container entrypoint.py (in agave-stack)
- CLAUDE.md: update ramdisk docs, fix /srv/solana → /srv/kind/solana paths
- health-check: fix ramdisk path references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix/kind-mount-propagation
A. F. Dudley 2026-03-08 18:43:41 +00:00
parent 591d158e1f
commit b2342bc539
8 changed files with 314 additions and 302 deletions

.gitignore

@@ -1,3 +1,4 @@
 .venv/
 sessions.duckdb
 sessions.duckdb.wal
+.worktrees

CLAUDE.md

@@ -14,9 +14,8 @@ below it are correct. Playbooks belong to exactly one layer.
 | 5. Deploy agave | Deploy agave-stack into kind, snapshot download, scale up | `biscayne-redeploy.yml` (snapshot/verify tags), `biscayne-recover.yml` |

 **Layer 4 invariants** (asserted by `biscayne-prepare-agave.yml`):
-- `/srv/solana` is XFS on a zvol — agave uses io_uring which deadlocks on ZFS
-- `/srv/solana/ramdisk` is XFS on `/dev/ram0` — accounts must be on ramdisk
-- `/srv/kind/solana` is an rbind of `/srv/solana` — makes the zvol visible to kind at `/mnt/solana`
+- `/srv/kind/solana` is XFS on a zvol — agave uses io_uring which deadlocks on ZFS. `/srv/solana` is NOT the zvol (it's a ZFS dataset directory); never use it for data paths
+- `/srv/kind/solana/ramdisk` is tmpfs (1TB) — accounts must be in RAM

 These invariants are checked at runtime and persisted to fstab/systemd so they
 survive reboot. They are agave's requirements reaching into the boot sequence,

@@ -58,18 +57,13 @@ Correct shutdown sequence:
 ### Ramdisk

-The accounts directory must be on a ramdisk for performance. `/dev/ram0` loses its
-filesystem on reboot and must be reformatted before mounting.
+The accounts directory must be in RAM for performance. tmpfs is used instead of
+`/dev/ram0` — simpler (no format-on-boot service needed), resizable on the fly
+with `mount -o remount,size=<new>`, and what most Solana operators use.

-**Boot ordering is handled by systemd units** (installed by `biscayne-prepare-agave.yml`):
-- `format-ramdisk.service`: runs `mkfs.xfs -f /dev/ram0` before `local-fs.target`
-- fstab entry: mounts `/dev/ram0` at `/srv/solana/ramdisk` with
-  `x-systemd.requires=format-ramdisk.service`
-- `ramdisk-accounts.service`: creates `/srv/solana/ramdisk/accounts` and sets
-  ownership after the mount
-
-These units run before docker, so the kind node's bind mounts always see the
-ramdisk. **No manual intervention is needed after reboot.**
+**Boot ordering**: fstab entry mounts tmpfs at `/srv/kind/solana/ramdisk` with
+`x-systemd.requires=srv-kind-solana.mount`. tmpfs mounts natively via fstab —
+no systemd format service needed. **No manual intervention after reboot.**

 **Mount propagation**: The kind node bind-mounts `/srv/kind` → `/mnt` at container
 start. laconic-so sets `propagation: HostToContainer` on all kind extraMounts

@@ -139,10 +133,11 @@ kind node via a single bind mount.
 - Deployment: `laconic-70ce4c4b47e23b85-deployment`
 - Kind node container: `laconic-70ce4c4b47e23b85-control-plane`
 - Deployment dir: `/srv/deployments/agave`
-- Snapshot dir: `/srv/solana/snapshots`
-- Ledger dir: `/srv/solana/ledger`
-- Accounts dir: `/srv/solana/ramdisk/accounts`
-- Log dir: `/srv/solana/log`
+- Snapshot dir: `/srv/kind/solana/snapshots` (on zvol, visible to kind at `/mnt/validator-snapshots`)
+- Ledger dir: `/srv/kind/solana/ledger` (on zvol, visible to kind at `/mnt/validator-ledger`)
+- Accounts dir: `/srv/kind/solana/ramdisk/accounts` (on tmpfs ramdisk, visible to kind at `/mnt/validator-accounts`)
+- Log dir: `/srv/kind/solana/log` (on zvol, visible to kind at `/mnt/validator-log`)
+- **WARNING**: `/srv/solana` is a ZFS dataset directory, NOT the zvol. Never use it for data paths.
 - Host bind mount root: `/srv/kind` -> kind node `/mnt`
 - laconic-so: `/home/rix/.local/bin/laconic-so` (editable install)

@@ -150,10 +145,10 @@ kind node via a single bind mount.
 | PV Name | hostPath |
 |----------------------|-------------------------------|
-| validator-snapshots | /mnt/solana/snapshots |
-| validator-ledger | /mnt/solana/ledger |
-| validator-accounts | /mnt/solana/ramdisk/accounts |
-| validator-log | /mnt/solana/log |
+| validator-snapshots | /mnt/validator-snapshots |
+| validator-ledger | /mnt/validator-ledger |
+| validator-accounts | /mnt/validator-accounts |
+| validator-log | /mnt/validator-log |

 ### Snapshot Freshness

@@ -164,7 +159,7 @@ try to catch up from an old snapshot — it will take too long and may never con
 Check with:
 ```
 # Snapshot slot (from filename)
-ls /srv/solana/snapshots/snapshot-*.tar.*
+ls /srv/kind/solana/snapshots/snapshot-*.tar.*
 # Current mainnet slot
 curl -s -X POST -H "Content-Type: application/json" \

biscayne-prepare-agave.yml

@@ -10,26 +10,18 @@
 #
 # Agave requires three things from the host that kind doesn't provide:
 #
-# Invariant 1: /srv/solana is XFS on a zvol (not ZFS)
+# Invariant 1: /srv/kind/solana is XFS on a zvol (not ZFS)
 #   Why: agave uses io_uring for async I/O. io_uring workers deadlock on
 #   ZFS datasets (D-state in dsl_dir_tempreserve_space). XFS on a zvol
-#   (block device) works fine. This is why the data lives on a zvol, not
-#   a ZFS dataset.
-#   Persisted as: fstab entry mounting /dev/zvol/.../solana at /srv/solana
+#   (block device) works fine. /srv/solana is NOT the zvol — it's a
+#   directory on the ZFS dataset biscayne/DATA/srv. All data paths must
+#   use /srv/kind/solana which is the actual zvol mount.
+#   Persisted as: fstab entry mounting /dev/zvol/.../solana at /srv/kind/solana
 #
-# Invariant 2: /srv/solana/ramdisk is XFS on /dev/ram0 (600G ramdisk)
-#   Why: agave accounts must be on ramdisk for performance. /dev/ram0
-#   loses its filesystem on reboot, so it must be reformatted before
-#   mounting each boot.
-#   Persisted as: format-ramdisk.service (mkfs before mount) + fstab entry
-#
-# Invariant 3: /srv/kind/solana is XFS (zvol) and /srv/kind/solana/ramdisk is XFS (ram0)
-#   Why: kind mounts /srv/kind → /mnt inside the kind node. PVs reference
-#   /mnt/solana/*. An rbind of /srv/solana does NOT work because ZFS's
-#   shared propagation (shared:75 on /srv) overlays ZFS on top of the bind.
-#   Direct device mounts bypass propagation entirely.
-#   Persisted as: two fstab entries — zvol at /srv/kind/solana, ram0 at
-#   /srv/kind/solana/ramdisk, both with x-systemd.requires ordering
+# Invariant 2: /srv/kind/solana/ramdisk is tmpfs (1TB)
+#   Why: agave accounts must be in RAM for performance. tmpfs survives
+#   process restarts but not host reboots (same as /dev/ram0 but simpler).
+#   Persisted as: fstab entry (no format service needed)
 #
 # This playbook checks each invariant and only acts if it's not met.
 # Idempotent — safe to run multiple times.

@@ -42,132 +34,76 @@
   gather_facts: false
   become: true
   vars:
-    ramdisk_device: /dev/ram0
     zvol_device: /dev/zvol/biscayne/DATA/volumes/solana
-    solana_dir: /srv/solana
-    ramdisk_mount: /srv/solana/ramdisk
     kind_solana_dir: /srv/kind/solana
-    accounts_dir: /srv/solana/ramdisk/accounts
+    ramdisk_mount: /srv/kind/solana/ramdisk
+    ramdisk_size: 1024G
+    accounts_dir: /srv/kind/solana/ramdisk/accounts
     deployment_dir: /srv/deployments/agave
-    kind_ramdisk_opts: "noatime,nodiratime,nofail,x-systemd.requires=format-ramdisk.service,x-systemd.requires=srv-kind-solana.mount"
   tasks:
-    # ---- systemd units ----------------------------------------------------------
-    - name: Install ramdisk format service
-      ansible.builtin.copy:
-        dest: /etc/systemd/system/format-ramdisk.service
-        mode: "0644"
-        content: |
-          [Unit]
-          Description=Format /dev/ram0 as XFS for Solana accounts
-          DefaultDependencies=no
-          Before=local-fs.target
-          After=systemd-modules-load.service
-          ConditionPathExists={{ ramdisk_device }}
-
-          [Service]
-          Type=oneshot
-          RemainAfterExit=yes
-          ExecStart=/sbin/mkfs.xfs -f {{ ramdisk_device }}
-
-          [Install]
-          WantedBy=local-fs.target
-      register: unit_file
-
-    - name: Install ramdisk post-mount service
-      ansible.builtin.copy:
-        dest: /etc/systemd/system/ramdisk-accounts.service
-        mode: "0644"
-        content: |
-          [Unit]
-          Description=Create Solana accounts directory on ramdisk
-          After=srv-solana-ramdisk.mount
-          Requires=srv-solana-ramdisk.mount
-
-          [Service]
-          Type=oneshot
-          RemainAfterExit=yes
-          ExecStart=/bin/bash -c 'mkdir -p {{ accounts_dir }} && chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}'
-
-          [Install]
-          WantedBy=multi-user.target
-      register: accounts_unit
+    # ---- cleanup legacy ramdisk services -----------------------------------------
+    - name: Stop and disable legacy ramdisk services
+      ansible.builtin.systemd:
+        name: "{{ item }}"
+        state: stopped
+        enabled: false
+      loop:
+        - format-ramdisk.service
+        - ramdisk-accounts.service
+      failed_when: false
+
+    - name: Remove legacy ramdisk service files
+      ansible.builtin.file:
+        path: "/etc/systemd/system/{{ item }}"
+        state: absent
+      loop:
+        - format-ramdisk.service
+        - ramdisk-accounts.service
+      register: legacy_units_removed

     # ---- fstab entries ----------------------------------------------------------
-    - name: Ensure zvol fstab entry
+    # /srv/solana is NOT the zvol — it's a directory on the ZFS dataset.
+    # All data paths use /srv/kind/solana (the actual zvol mount).
+    - name: Remove stale /srv/solana zvol fstab entry
       ansible.builtin.lineinfile:
         path: /etc/fstab
-        regexp: '^\S+\s+{{ solana_dir }}\s'
-        line: '{{ zvol_device }} {{ solana_dir }} xfs defaults 0 2'
-      register: fstab_zvol
+        regexp: '^\S+\s+/srv/solana\s+xfs'
+        state: absent

-    - name: Ensure ramdisk fstab entry
+    - name: Remove stale /srv/solana/ramdisk fstab entry
       ansible.builtin.lineinfile:
         path: /etc/fstab
-        regexp: '^{{ ramdisk_device }}\s+{{ ramdisk_mount }}\s'
-        line: '{{ ramdisk_device }} {{ ramdisk_mount }} xfs noatime,nodiratime,nofail,x-systemd.requires=format-ramdisk.service 0 0'
-      register: fstab_ramdisk
+        regexp: '^/dev/ram0\s+'
+        state: absent

-    # Direct device mounts at /srv/kind/solana — bypasses ZFS shared propagation.
-    # An rbind of /srv/solana fails because ZFS's shared:75 on /srv overlays
-    # ZFS on top of any bind mount under /srv. Direct device mounts avoid this.
-    - name: Ensure kind zvol fstab entry
-      ansible.builtin.lineinfile:
-        path: /etc/fstab
-        regexp: '^\S+\s+{{ kind_solana_dir }}\s'
-        line: '{{ zvol_device }} {{ kind_solana_dir }} xfs defaults,nofail,x-systemd.requires=zfs-mount.service 0 0'
-      register: fstab_kind
-
-    - name: Ensure kind ramdisk fstab entry
-      ansible.builtin.lineinfile:
-        path: /etc/fstab
-        regexp: '^\S+\s+{{ kind_solana_dir }}/ramdisk\s'
-        line: "{{ ramdisk_device }} {{ kind_solana_dir }}/ramdisk xfs {{ kind_ramdisk_opts }} 0 0"
-      register: fstab_kind_ramdisk
-
-    # Remove stale rbind fstab entry from previous approach
     - name: Remove stale kind rbind fstab entry
       ansible.builtin.lineinfile:
         path: /etc/fstab
         regexp: '^\S+\s+{{ kind_solana_dir }}\s+none\s+rbind'
         state: absent
-      register: fstab_stale_rbind

-    # ---- reload and enable ------------------------------------------------------
+    - name: Ensure zvol fstab entry
+      ansible.builtin.lineinfile:
+        path: /etc/fstab
+        regexp: '^\S+\s+{{ kind_solana_dir }}\s'
+        line: '{{ zvol_device }} {{ kind_solana_dir }} xfs defaults,nofail,x-systemd.requires=zfs-mount.service 0 0'
+      register: fstab_zvol
+
+    - name: Ensure tmpfs ramdisk fstab entry
+      ansible.builtin.lineinfile:
+        path: /etc/fstab
+        regexp: '^\S+\s+{{ ramdisk_mount }}\s'
+        line: "tmpfs {{ ramdisk_mount }} tmpfs nodev,nosuid,noexec,nodiratime,size={{ ramdisk_size }},nofail,x-systemd.requires=srv-kind-solana.mount 0 0"
+      register: fstab_ramdisk
+
+    # ---- reload systemd if anything changed --------------------------------------
     - name: Reload systemd
       ansible.builtin.systemd:
         daemon_reload: true
-      when: >-
-        unit_file.changed or accounts_unit.changed or
-        fstab_zvol.changed or fstab_ramdisk.changed or
-        fstab_kind.changed or fstab_kind_ramdisk.changed or
-        fstab_stale_rbind.changed
+      when: legacy_units_removed.changed or fstab_zvol.changed or fstab_ramdisk.changed

-    - name: Enable ramdisk services
-      ansible.builtin.systemd:
-        name: "{{ item }}"
-        enabled: true
-      loop:
-        - format-ramdisk.service
-        - ramdisk-accounts.service
-
-    # ---- apply now if ramdisk not mounted --------------------------------------
-    - name: Check if ramdisk is mounted
-      ansible.builtin.command: mountpoint -q {{ ramdisk_mount }}
-      register: ramdisk_mounted
-      failed_when: false
-      changed_when: false
-
-    - name: Format and mount ramdisk now
-      ansible.builtin.shell: |
-        mkfs.xfs -f {{ ramdisk_device }}
-        mount {{ ramdisk_mount }}
-        mkdir -p {{ accounts_dir }}
-        chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}
-      changed_when: ramdisk_mounted.rc != 0
-      when: ramdisk_mounted.rc != 0
-
-    # ---- apply kind device mounts now if not correct ----------------------------
+    # ---- apply device mounts now if not correct ----------------------------------
     - name: Check kind zvol mount is XFS
       ansible.builtin.shell:
         cmd: >

@@ -178,16 +114,16 @@
       failed_when: false
       changed_when: false

-    - name: Unmount stale kind mounts
+    - name: Unmount stale mounts
       ansible.builtin.shell:
         cmd: |
-          umount {{ kind_solana_dir }}/ramdisk 2>/dev/null || true
+          umount {{ ramdisk_mount }} 2>/dev/null || true
           umount {{ kind_solana_dir }} 2>/dev/null || true
         executable: /bin/bash
       changed_when: kind_zvol_check.rc != 0
       when: kind_zvol_check.rc != 0

-    - name: Mount zvol at kind solana dir
+    - name: Mount zvol
       ansible.posix.mount:
         path: "{{ kind_solana_dir }}"
         src: "{{ zvol_device }}"

@@ -195,24 +131,32 @@
         state: mounted
       when: kind_zvol_check.rc != 0

-    - name: Check kind ramdisk mount is XFS
+    - name: Check ramdisk mount is tmpfs
       ansible.builtin.shell:
         cmd: >
           set -o pipefail &&
-          findmnt -n -o FSTYPE {{ kind_solana_dir }}/ramdisk | grep -q xfs
+          findmnt -n -o FSTYPE {{ ramdisk_mount }} | grep -q tmpfs
         executable: /bin/bash
-      register: kind_ramdisk_check
+      register: ramdisk_check
       failed_when: false
       changed_when: false

-    - name: Mount ramdisk at kind solana ramdisk dir
+    - name: Mount tmpfs ramdisk
       ansible.posix.mount:
-        path: "{{ kind_solana_dir }}/ramdisk"
-        src: "{{ ramdisk_device }}"
-        fstype: xfs
-        opts: noatime,nodiratime
+        path: "{{ ramdisk_mount }}"
+        src: tmpfs
+        fstype: tmpfs
+        opts: "nodev,nosuid,noexec,nodiratime,size={{ ramdisk_size }}"
         state: mounted
-      when: kind_ramdisk_check.rc != 0
+      when: ramdisk_check.rc != 0
+
+    - name: Create accounts directory
+      ansible.builtin.file:
+        path: "{{ accounts_dir }}"
+        state: directory
+        owner: solana
+        group: solana
+        mode: "0755"

     # Docker requires shared propagation on mounts it bind-mounts into
     # containers. Without this, `docker start` fails with "not a shared

@@ -227,36 +171,24 @@
       changed_when: false

     # ---- verify -----------------------------------------------------------------
-    - name: Verify ramdisk is XFS
-      ansible.builtin.shell:
-        cmd: set -o pipefail && df -T {{ ramdisk_mount }} | grep -q xfs
-        executable: /bin/bash
-      changed_when: false
-
     - name: Verify zvol is XFS
       ansible.builtin.shell:
-        cmd: set -o pipefail && df -T {{ solana_dir }} | grep -q xfs
-        executable: /bin/bash
-      changed_when: false
-
-    - name: Verify kind zvol is XFS
-      ansible.builtin.shell:
         cmd: set -o pipefail && df -T {{ kind_solana_dir }} | grep -q xfs
         executable: /bin/bash
       changed_when: false

-    - name: Verify kind ramdisk is XFS
+    - name: Verify ramdisk is tmpfs
       ansible.builtin.shell:
-        cmd: set -o pipefail && df -T {{ kind_solana_dir }}/ramdisk | grep -q xfs
+        cmd: set -o pipefail && df -T {{ ramdisk_mount }} | grep -q tmpfs
         executable: /bin/bash
       changed_when: false

-    - name: Verify kind mount contents
+    - name: Verify mount contents
       ansible.builtin.shell:
         cmd: >
           set -o pipefail &&
           ls {{ kind_solana_dir }}/ledger {{ kind_solana_dir }}/snapshots
-          {{ kind_solana_dir }}/ramdisk/accounts 2>&1 | head -5
+          {{ ramdisk_mount }}/accounts 2>&1 | head -5
         executable: /bin/bash
       register: kind_mount_verify
       changed_when: false

@@ -273,13 +205,12 @@
       register: cluster_id_result
       changed_when: false

-    - name: Check kind node XFS visibility
+    - name: Check kind node filesystem visibility
       ansible.builtin.shell:
         cmd: >
           set -o pipefail &&
           docker exec {{ cluster_id_result.stdout }}-control-plane
           df -T /mnt/validator-ledger /mnt/validator-accounts
-          | grep -c xfs
         executable: /bin/bash
       register: kind_fstype
       changed_when: false

@@ -289,7 +220,7 @@
       ansible.builtin.debug:
         msg:
           kind_mount: "{{ kind_mount_verify.stdout_lines }}"
-          kind_fstype: "{{ 'xfs (correct)' if kind_fstype.stdout | default('0') | int >= 2 else 'NOT XFS — kind restart required' }}"
+          kind_fstype: "{{ kind_fstype.stdout_lines | default([]) }}"

 - name: Configure Ashburn validator relay
   ansible.builtin.import_playbook: ashburn-relay-biscayne.yml
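The removal patterns in the playbook above are deliberately anchored: `^\S+\s+/srv/solana\s+xfs` matches only the stale zvol entry (the trailing `\s+xfs` keeps it from touching `/srv/solana/ramdisk` or `/srv/kind/solana`), while `^/dev/ram0\s+` catches the old ramdisk line regardless of its mountpoint. A quick sketch of that matching — the fstab lines here are illustrative examples built from the playbook's vars, not copied from a real host:

```python
import re

# Illustrative /etc/fstab lines; devices and paths follow the playbook's vars.
fstab = [
    "/dev/zvol/biscayne/DATA/volumes/solana /srv/solana xfs defaults 0 2",
    "/dev/ram0 /srv/solana/ramdisk xfs noatime,nodiratime,nofail 0 0",
    "/dev/zvol/biscayne/DATA/volumes/solana /srv/kind/solana xfs defaults,nofail 0 0",
    "tmpfs /srv/kind/solana/ramdisk tmpfs nodev,nosuid,noexec,size=1024G 0 0",
]

# The playbook's lineinfile removal patterns (state: absent).
stale = [re.compile(r"^\S+\s+/srv/solana\s+xfs"), re.compile(r"^/dev/ram0\s+")]

# Only lines matching neither pattern survive, i.e. the /srv/kind/solana entries.
kept = [line for line in fstab if not any(p.search(line) for p in stale)]
```

Because `lineinfile` with `state: absent` deletes every line matching `regexp`, anchoring on both the mountpoint and the filesystem type is what makes the cleanup safe to re-run after the new entries exist.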

biscayne-recover.yml

@@ -33,10 +33,9 @@
     kind_cluster: laconic-70ce4c4b47e23b85
     k8s_namespace: "laconic-{{ kind_cluster }}"
     deployment_name: "{{ kind_cluster }}-deployment"
-    snapshot_dir: /srv/solana/snapshots
-    accounts_dir: /srv/solana/ramdisk/accounts
-    ramdisk_mount: /srv/solana/ramdisk
-    ramdisk_device: /dev/ram0
+    snapshot_dir: /srv/kind/solana/snapshots
+    accounts_dir: /srv/kind/solana/ramdisk/accounts
+    ramdisk_mount: /srv/kind/solana/ramdisk
     snapshot_script_local: "{{ playbook_dir }}/../scripts/snapshot-download.py"
     snapshot_script: /tmp/snapshot-download.py
     snapshot_args: ""

biscayne-redeploy.yml

@@ -57,11 +57,11 @@
     kind_cluster: laconic-70ce4c4b47e23b85
     k8s_namespace: "laconic-{{ kind_cluster }}"
     deployment_name: "{{ kind_cluster }}-deployment"
-    snapshot_dir: /srv/solana/snapshots
-    ledger_dir: /srv/solana/ledger
-    accounts_dir: /srv/solana/ramdisk/accounts
-    ramdisk_mount: /srv/solana/ramdisk
-    ramdisk_device: /dev/ram0
+    snapshot_dir: /srv/kind/solana/snapshots
+    ledger_dir: /srv/kind/solana/ledger
+    accounts_dir: /srv/kind/solana/ramdisk/accounts
+    ramdisk_mount: /srv/kind/solana/ramdisk
+    ramdisk_size: 1024G
     snapshot_script_local: "{{ playbook_dir }}/../scripts/snapshot-download.py"
     snapshot_script: /tmp/snapshot-download.py
     # Flags — non-destructive by default

@@ -139,12 +139,9 @@
       when: wipe_ledger | bool
       tags: [wipe]

-    - name: Wipe accounts ramdisk (umount + mkfs.xfs + mount)
+    - name: Wipe accounts ramdisk
       ansible.builtin.shell: |
-        set -o pipefail
-        mountpoint -q {{ ramdisk_mount }} && umount {{ ramdisk_mount }} || true
-        mkfs.xfs -f {{ ramdisk_device }}
-        mount {{ ramdisk_mount }}
+        rm -rf {{ accounts_dir }}/*
         mkdir -p {{ accounts_dir }}
         chown solana:solana {{ ramdisk_mount }} {{ accounts_dir }}
       become: true

biscayne-start.yml

@@ -6,7 +6,7 @@
 #
 # Prerequisites:
 # - biscayne-prepare-agave.yml has been run (fstab entries, systemd units)
-# - A snapshot exists in /srv/solana/snapshots (or use biscayne-recover.yml)
+# - A snapshot exists in /srv/kind/solana/snapshots (or use biscayne-recover.yml)
 #
 # Usage:
 # ansible-playbook playbooks/biscayne-start.yml

health-check playbook

@@ -211,7 +211,7 @@
     # ------------------------------------------------------------------
     - name: Check ramdisk usage
       ansible.builtin.command:
-        cmd: df -h /srv/solana/ramdisk
+        cmd: df -h /srv/kind/solana/ramdisk
       register: ramdisk_df
       changed_when: false
       failed_when: false

@@ -238,7 +238,7 @@
         cmd: >
           set -o pipefail &&
           findmnt -n -o TARGET,SOURCE,FSTYPE,PROPAGATION
-          /srv/solana /srv/solana/ramdisk /srv/kind/solana 2>&1
+          /srv/kind/solana /srv/kind/solana/ramdisk 2>&1
         executable: /bin/bash
       register: host_mounts
       changed_when: false

View File

@ -9,8 +9,8 @@ Based on the discovery approach from etcusr/solana-snapshot-finder but replaces
the single-connection wget download with aria2c parallel chunked downloads. the single-connection wget download with aria2c parallel chunked downloads.
Usage: Usage:
# Download to /srv/solana/snapshots (mainnet, 16 connections) # Download to /srv/kind/solana/snapshots (mainnet, 16 connections)
./snapshot-download.py -o /srv/solana/snapshots ./snapshot-download.py -o /srv/kind/solana/snapshots
# Dry run — find best source, print URL # Dry run — find best source, print URL
./snapshot-download.py --dry-run ./snapshot-download.py --dry-run
@ -43,7 +43,6 @@ import urllib.request
from dataclasses import dataclass, field from dataclasses import dataclass, field
from http.client import HTTPResponse from http.client import HTTPResponse
from pathlib import Path from pathlib import Path
from typing import NoReturn
from urllib.request import Request from urllib.request import Request
log: logging.Logger = logging.getLogger("snapshot-download") log: logging.Logger = logging.getLogger("snapshot-download")
@ -192,16 +191,12 @@ def _parse_snapshot_filename(location: str) -> tuple[str, str | None]:
def probe_rpc_snapshot( def probe_rpc_snapshot(
rpc_address: str, rpc_address: str,
current_slot: int, current_slot: int,
max_age_slots: int,
max_latency_ms: float,
) -> SnapshotSource | None: ) -> SnapshotSource | None:
"""Probe a single RPC node for available snapshots. """Probe a single RPC node for available snapshots.
Probes for full snapshot first (required), then incremental. Records all Discovery only no filtering. Returns a SnapshotSource with all available
available files. Which files to actually download is decided at download info so the caller can decide what to keep. Filtering happens after all
time based on what already exists locally not here. probes complete, so rejected sources are still visible for debugging.
Based on the discovery approach from etcusr/solana-snapshot-finder.
""" """
full_url: str = f"http://{rpc_address}/snapshot.tar.bz2" full_url: str = f"http://{rpc_address}/snapshot.tar.bz2"
@ -211,8 +206,6 @@ def probe_rpc_snapshot(
return None return None
latency_ms: float = full_latency * 1000 latency_ms: float = full_latency * 1000
if latency_ms > max_latency_ms:
return None
full_filename, full_path = _parse_snapshot_filename(full_location) full_filename, full_path = _parse_snapshot_filename(full_location)
fm: re.Match[str] | None = FULL_SNAP_RE.match(full_filename) fm: re.Match[str] | None = FULL_SNAP_RE.match(full_filename)
@ -222,9 +215,6 @@ def probe_rpc_snapshot(
full_snap_slot: int = int(fm.group(1)) full_snap_slot: int = int(fm.group(1))
slots_diff: int = current_slot - full_snap_slot slots_diff: int = current_slot - full_snap_slot
if slots_diff > max_age_slots or slots_diff < -100:
return None
file_paths: list[str] = [full_path] file_paths: list[str] = [full_path]
# Also check for incremental snapshot # Also check for incremental snapshot
@ -255,7 +245,11 @@ def discover_sources(
threads: int, threads: int,
version_filter: str | None, version_filter: str | None,
) -> list[SnapshotSource]: ) -> list[SnapshotSource]:
"""Discover all snapshot sources from the cluster.""" """Discover all snapshot sources, then filter.
Probing and filtering are separate: all reachable sources are collected
first so we can report what exists even if filters reject everything.
"""
rpc_nodes: list[str] = get_cluster_rpc_nodes(rpc_url, version_filter) rpc_nodes: list[str] = get_cluster_rpc_nodes(rpc_url, version_filter)
if not rpc_nodes: if not rpc_nodes:
log.error("No RPC nodes found via getClusterNodes") log.error("No RPC nodes found via getClusterNodes")
@ -263,31 +257,59 @@ def discover_sources(
log.info("Found %d RPC nodes, probing for snapshots...", len(rpc_nodes)) log.info("Found %d RPC nodes, probing for snapshots...", len(rpc_nodes))
sources: list[SnapshotSource] = [] all_sources: list[SnapshotSource] = []
with concurrent.futures.ThreadPoolExecutor(max_workers=threads) as pool: with concurrent.futures.ThreadPoolExecutor(max_workers=threads) as pool:
futures: dict[concurrent.futures.Future[SnapshotSource | None], str] = { futures: dict[concurrent.futures.Future[SnapshotSource | None], str] = {
pool.submit( pool.submit(probe_rpc_snapshot, addr, current_slot): addr
probe_rpc_snapshot, addr, current_slot,
max_age_slots, max_latency_ms,
): addr
for addr in rpc_nodes for addr in rpc_nodes
} }
done: int = 0 done: int = 0
for future in concurrent.futures.as_completed(futures): for future in concurrent.futures.as_completed(futures):
done += 1 done += 1
if done % 200 == 0: if done % 200 == 0:
log.info(" probed %d/%d nodes, %d sources found", log.info(" probed %d/%d nodes, %d reachable",
done, len(rpc_nodes), len(sources)) done, len(rpc_nodes), len(all_sources))
try: try:
result: SnapshotSource | None = future.result() result: SnapshotSource | None = future.result()
except (urllib.error.URLError, OSError, TimeoutError) as e: except (urllib.error.URLError, OSError, TimeoutError) as e:
log.debug("Probe failed for %s: %s", futures[future], e) log.debug("Probe failed for %s: %s", futures[future], e)
continue continue
if result: if result:
sources.append(result) all_sources.append(result)
log.info("Found %d RPC nodes with suitable snapshots", len(sources)) log.info("Discovered %d reachable sources", len(all_sources))
return sources
# Apply filters
filtered: list[SnapshotSource] = []
rejected_age: int = 0
rejected_latency: int = 0
for src in all_sources:
if src.slots_diff > max_age_slots or src.slots_diff < -100:
rejected_age += 1
continue
if src.latency_ms > max_latency_ms:
rejected_latency += 1
continue
filtered.append(src)
if rejected_age or rejected_latency:
log.info("Filtered: %d rejected by age (>%d slots), %d by latency (>%.0fms)",
rejected_age, max_age_slots, rejected_latency, max_latency_ms)
if not filtered and all_sources:
# Show what was available so the user can adjust filters
all_sources.sort(key=lambda s: s.slots_diff)
best = all_sources[0]
log.warning("All %d sources rejected by filters. Best available: "
"%s (age=%d slots, latency=%.0fms). "
"Try --max-snapshot-age %d --max-latency %.0f",
len(all_sources), best.rpc_address,
best.slots_diff, best.latency_ms,
best.slots_diff + 500,
max(best.latency_ms * 1.5, 500))
log.info("Found %d sources after filtering", len(filtered))
return filtered
# -- Speed benchmark ----------------------------------------------------------- # -- Speed benchmark -----------------------------------------------------------
@ -336,7 +358,7 @@ def download_aria2c(
cmd: list[str] = [ cmd: list[str] = [
"aria2c", "aria2c",
"--file-allocation=none", "--file-allocation=none",
"--continue=true", "--continue=false",
f"--max-connection-per-server={connections}", f"--max-connection-per-server={connections}",
f"--split={total_splits}", f"--split={total_splits}",
"--min-split-size=50M", "--min-split-size=50M",
@@ -380,97 +402,74 @@ def download_aria2c(
     return True

-# -- Main ----------------------------------------------------------------------
+# -- Public API ----------------------------------------------------------------

-def main() -> int:
-    p: argparse.ArgumentParser = argparse.ArgumentParser(
-        description="Download Solana snapshots with aria2c parallel downloads",
-    )
-    p.add_argument("-o", "--output", default="/srv/solana/snapshots",
-                   help="Snapshot output directory (default: /srv/solana/snapshots)")
-    p.add_argument("-c", "--cluster", default="mainnet-beta",
-                   choices=list(CLUSTER_RPC),
-                   help="Solana cluster (default: mainnet-beta)")
-    p.add_argument("-r", "--rpc", default=None,
-                   help="RPC URL for cluster discovery (default: public RPC)")
-    p.add_argument("-n", "--connections", type=int, default=16,
-                   help="aria2c connections per download (default: 16)")
-    p.add_argument("-t", "--threads", type=int, default=500,
-                   help="Threads for parallel RPC probing (default: 500)")
-    p.add_argument("--max-snapshot-age", type=int, default=1300,
-                   help="Max snapshot age in slots (default: 1300)")
-    p.add_argument("--max-latency", type=float, default=100,
-                   help="Max RPC probe latency in ms (default: 100)")
-    p.add_argument("--min-download-speed", type=int, default=20,
-                   help="Min download speed in MiB/s (default: 20)")
-    p.add_argument("--measurement-time", type=int, default=7,
-                   help="Speed measurement duration in seconds (default: 7)")
-    p.add_argument("--max-speed-checks", type=int, default=15,
-                   help="Max nodes to benchmark before giving up (default: 15)")
-    p.add_argument("--version", default=None,
-                   help="Filter nodes by version prefix (e.g. '2.2')")
-    p.add_argument("--full-only", action="store_true",
-                   help="Download only full snapshot, skip incremental")
-    p.add_argument("--dry-run", action="store_true",
-                   help="Find best source and print URL, don't download")
-    p.add_argument("-v", "--verbose", action="store_true")
-    args: argparse.Namespace = p.parse_args()
-
-    logging.basicConfig(
-        level=logging.DEBUG if args.verbose else logging.INFO,
-        format="%(asctime)s %(levelname)s %(message)s",
-        datefmt="%H:%M:%S",
-    )
-
-    rpc_url: str = args.rpc or CLUSTER_RPC[args.cluster]
-
-    # aria2c is required for actual downloads (not dry-run)
-    if not args.dry_run and not shutil.which("aria2c"):
+def download_best_snapshot(
+    output_dir: str,
+    *,
+    cluster: str = "mainnet-beta",
+    rpc_url: str | None = None,
+    connections: int = 16,
+    threads: int = 500,
+    max_snapshot_age: int = 10000,
+    max_latency: float = 500,
+    min_download_speed: int = 20,
+    measurement_time: int = 7,
+    max_speed_checks: int = 15,
+    version_filter: str | None = None,
+    full_only: bool = False,
+) -> bool:
+    """Download the best available snapshot to output_dir.
+
+    Programmatic API for use by entrypoint.py or other callers.
+    Returns True on success, False on failure.
+    """
+    resolved_rpc: str = rpc_url or CLUSTER_RPC[cluster]
+
+    if not shutil.which("aria2c"):
         log.error("aria2c not found. Install with: apt install aria2")
-        return 1
+        return False

-    # Get current slot
-    log.info("Cluster: %s | RPC: %s", args.cluster, rpc_url)
-    current_slot: int | None = get_current_slot(rpc_url)
+    log.info("Cluster: %s | RPC: %s", cluster, resolved_rpc)
+    current_slot: int | None = get_current_slot(resolved_rpc)
     if current_slot is None:
-        log.error("Cannot get current slot from %s", rpc_url)
-        return 1
+        log.error("Cannot get current slot from %s", resolved_rpc)
+        return False
     log.info("Current slot: %d", current_slot)

-    # Discover sources
     sources: list[SnapshotSource] = discover_sources(
-        rpc_url, current_slot,
-        max_age_slots=args.max_snapshot_age,
-        max_latency_ms=args.max_latency,
-        threads=args.threads,
-        version_filter=args.version,
+        resolved_rpc, current_slot,
+        max_age_slots=max_snapshot_age,
+        max_latency_ms=max_latency,
+        threads=threads,
+        version_filter=version_filter,
     )
     if not sources:
         log.error("No snapshot sources found")
-        return 1
+        return False

     # Sort by latency (lowest first) for speed benchmarking
     sources.sort(key=lambda s: s.latency_ms)

-    # Benchmark top candidates — all speeds in MiB/s (binary, 1 MiB = 1048576 bytes)
-    log.info("Benchmarking download speed on top %d sources...", args.max_speed_checks)
+    # Benchmark top candidates
+    log.info("Benchmarking download speed on top %d sources...", max_speed_checks)
     fast_sources: list[SnapshotSource] = []
     checked: int = 0
-    min_speed_bytes: int = args.min_download_speed * 1024 * 1024  # MiB to bytes
+    min_speed_bytes: int = min_download_speed * 1024 * 1024
     for source in sources:
-        if checked >= args.max_speed_checks:
+        if checked >= max_speed_checks:
             break
         checked += 1
-        speed: float = measure_speed(source.rpc_address, args.measurement_time)
+        speed: float = measure_speed(source.rpc_address, measurement_time)
         source.download_speed = speed
         speed_mib: float = speed / (1024 ** 2)
         if speed < min_speed_bytes:
             log.info("  %s: %.1f MiB/s (too slow, need >=%d MiB/s)",
-                     source.rpc_address, speed_mib, args.min_download_speed)
+                     source.rpc_address, speed_mib, min_download_speed)
             continue
         log.info("  %s: %.1f MiB/s (latency: %.0fms, age: %d slots)",
@@ -480,19 +479,17 @@ def main() -> int:
     if not fast_sources:
         log.error("No source met minimum speed requirement (%d MiB/s)",
-                  args.min_download_speed)
-        log.info("Try: --min-download-speed 10")
-        return 1
+                  min_download_speed)
+        return False

     # Use the fastest source as primary, collect mirrors for each file
     best: SnapshotSource = fast_sources[0]
     file_paths: list[str] = best.file_paths
-    if args.full_only:
+    if full_only:
         file_paths = [fp for fp in file_paths
                       if fp.rsplit("/", 1)[-1].startswith("snapshot-")]

-    # Build mirror URL lists: for each file, collect URLs from all fast sources
-    # that serve the same filename
+    # Build mirror URL lists
     download_plan: list[tuple[str, list[str]]] = []
     for fp in file_paths:
         filename: str = fp.rsplit("/", 1)[-1]
@@ -509,38 +506,130 @@ def main() -> int:
              best.rpc_address, speed_mib, len(fast_sources))
     for filename, mirror_urls in download_plan:
         log.info("  %s (%d mirrors)", filename, len(mirror_urls))
-        for url in mirror_urls:
-            log.info("    %s", url)

-    if args.dry_run:
-        for _, mirror_urls in download_plan:
-            for url in mirror_urls:
-                print(url)
-        return 0
-
-    # Download — skip files that already exist locally
-    os.makedirs(args.output, exist_ok=True)
+    # Download
+    os.makedirs(output_dir, exist_ok=True)
     total_start: float = time.monotonic()
     for filename, mirror_urls in download_plan:
-        filepath: Path = Path(args.output) / filename
+        filepath: Path = Path(output_dir) / filename
         if filepath.exists() and filepath.stat().st_size > 0:
             log.info("Skipping %s (already exists: %.1f GB)",
                      filename, filepath.stat().st_size / (1024 ** 3))
             continue
-        if not download_aria2c(mirror_urls, args.output, filename, args.connections):
+        if not download_aria2c(mirror_urls, output_dir, filename, connections):
             log.error("Failed to download %s", filename)
-            return 1
+            return False

     total_elapsed: float = time.monotonic() - total_start
     log.info("All downloads complete in %.0fs", total_elapsed)
     for filename, _ in download_plan:
-        fp: Path = Path(args.output) / filename
-        if fp.exists():
-            log.info("  %s (%.1f GB)", fp.name, fp.stat().st_size / (1024 ** 3))
-    return 0
+        fp_path: Path = Path(output_dir) / filename
+        if fp_path.exists():
+            log.info("  %s (%.1f GB)", fp_path.name, fp_path.stat().st_size / (1024 ** 3))
+    return True
+
+
+# -- Main (CLI) ----------------------------------------------------------------
+
+def main() -> int:
+    p: argparse.ArgumentParser = argparse.ArgumentParser(
+        description="Download Solana snapshots with aria2c parallel downloads",
+    )
+    p.add_argument("-o", "--output", default="/srv/kind/solana/snapshots",
+                   help="Snapshot output directory (default: /srv/kind/solana/snapshots)")
+    p.add_argument("-c", "--cluster", default="mainnet-beta",
+                   choices=list(CLUSTER_RPC),
+                   help="Solana cluster (default: mainnet-beta)")
+    p.add_argument("-r", "--rpc", default=None,
+                   help="RPC URL for cluster discovery (default: public RPC)")
+    p.add_argument("-n", "--connections", type=int, default=16,
+                   help="aria2c connections per download (default: 16)")
+    p.add_argument("-t", "--threads", type=int, default=500,
+                   help="Threads for parallel RPC probing (default: 500)")
+    p.add_argument("--max-snapshot-age", type=int, default=10000,
+                   help="Max snapshot age in slots (default: 10000)")
+    p.add_argument("--max-latency", type=float, default=500,
+                   help="Max RPC probe latency in ms (default: 500)")
+    p.add_argument("--min-download-speed", type=int, default=20,
+                   help="Min download speed in MiB/s (default: 20)")
+    p.add_argument("--measurement-time", type=int, default=7,
+                   help="Speed measurement duration in seconds (default: 7)")
+    p.add_argument("--max-speed-checks", type=int, default=15,
+                   help="Max nodes to benchmark before giving up (default: 15)")
+    p.add_argument("--version", default=None,
+                   help="Filter nodes by version prefix (e.g. '2.2')")
+    p.add_argument("--full-only", action="store_true",
+                   help="Download only full snapshot, skip incremental")
+    p.add_argument("--dry-run", action="store_true",
+                   help="Find best source and print URL, don't download")
+    p.add_argument("--post-cmd",
+                   help="Shell command to run after successful download "
+                        "(e.g. 'kubectl scale deployment ... --replicas=1')")
+    p.add_argument("-v", "--verbose", action="store_true")
+    args: argparse.Namespace = p.parse_args()
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s %(levelname)s %(message)s",
+        datefmt="%H:%M:%S",
+    )
+
+    # Dry-run uses inline flow (needs access to sources for URL printing)
+    if args.dry_run:
+        rpc_url: str = args.rpc or CLUSTER_RPC[args.cluster]
+        current_slot: int | None = get_current_slot(rpc_url)
+        if current_slot is None:
+            log.error("Cannot get current slot from %s", rpc_url)
+            return 1
+        sources: list[SnapshotSource] = discover_sources(
+            rpc_url, current_slot,
+            max_age_slots=args.max_snapshot_age,
+            max_latency_ms=args.max_latency,
+            threads=args.threads,
+            version_filter=args.version,
+        )
+        if not sources:
+            log.error("No snapshot sources found")
+            return 1
+        sources.sort(key=lambda s: s.latency_ms)
+        best = sources[0]
+        for fp in best.file_paths:
+            print(f"http://{best.rpc_address}{fp}")
+        return 0
+
+    ok: bool = download_best_snapshot(
+        args.output,
+        cluster=args.cluster,
+        rpc_url=args.rpc,
+        connections=args.connections,
+        threads=args.threads,
+        max_snapshot_age=args.max_snapshot_age,
+        max_latency=args.max_latency,
+        min_download_speed=args.min_download_speed,
+        measurement_time=args.measurement_time,
+        max_speed_checks=args.max_speed_checks,
+        version_filter=args.version,
+        full_only=args.full_only,
+    )
+    if ok and args.post_cmd:
+        log.info("Running post-download command: %s", args.post_cmd)
+        result: subprocess.CompletedProcess[bytes] = subprocess.run(
+            args.post_cmd, shell=True,
+        )
+        if result.returncode != 0:
+            log.error("Post-download command failed with exit code %d",
+                      result.returncode)
+            return 1
+        log.info("Post-download command completed successfully")
+    return 0 if ok else 1

 if __name__ == "__main__":
     sys.exit(main())