HyprVault — backups
Restic-backed backups managed from the dashboard. Each policy describes what to back up on which node, when, and how long to keep snapshots. Each run is one execution — started either by the cron job the preset installs, or by a manual trigger from the dashboard / API.
Data model
BackupPolicy 1 ── n BackupRun
↑
└── nodeId → Node, createdBy → User
- BackupPolicy: name, nodeId, paths (JSON array), schedule (cron), repoUrl (Restic), passwordRef (where to source RESTIC_PASSWORD), keepDaily / keepWeekly / keepMonthly, status (ENABLED/PAUSED).
- BackupRun: policyId, status (RUNNING/SUCCEEDED/FAILED), snapshotId, sizeBytes, filesCount, durationMs, errorMsg, startedAt, finishedAt.
passwordRef is intentionally a pointer, not the secret. We never store
the password in the DB. Two supported forms:
- env:RESTIC_PASSWORD (default) reads the agent's environment.
- file:/etc/hyprbox/restic.pass reads a file on the host.
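As an illustration of the two passwordRef forms, here is a minimal sketch of how a wrapper script might resolve the pointer into the actual password. The function name resolve_password_ref is hypothetical and the real wrapper may do this differently; the sketch only assumes the env: and file: prefixes described above.

```shell
# Hypothetical helper: turn a passwordRef into the password on stdout.
# Falls back to the documented default when no argument is given.
resolve_password_ref() {
  ref="${1:-env:RESTIC_PASSWORD}"
  case "$ref" in
    # env:NAME reads the named variable from the process environment.
    env:*)  printenv "${ref#env:}" ;;
    # file:PATH reads the password from a file on the host.
    file:*) cat "${ref#file:}" ;;
    # Anything else is rejected instead of being guessed at.
    *)      echo "unsupported passwordRef: $ref" >&2; return 2 ;;
  esac
}
```

For the file: form, an alternative is to skip reading the file and export Restic's own RESTIC_PASSWORD_FILE variable with the path, so the secret never passes through the script.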
Lifecycle
Operator → POST /api/backups → BackupPolicy created (status ENABLED)
→ POST /api/backups/:id/trigger → Job queued with the hypervault-restic preset
↓
Agent picks up the job (next heartbeat)
↓
hyprbox-backup wrapper runs:
1. POST /api/backups/:id/runs → BackupRun created (RUNNING)
2. restic backup
3. POST /api/backups/runs/:runId/complete
with status + sizeBytes + snapshotId
For recurring runs, the same preset drops /etc/cron.d/hyprbox-backup-<id>,
which calls the wrapper script directly with no API trigger. The script itself
starts the run via POST /api/backups/:id/runs using its agent token.
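The three wrapper steps can be sketched roughly as follows. This is not the shipped hyprbox-backup script: the helper names api_post and extract_id, the Bearer authorization header, the JSON "id" field in the start-run response, and tagging the backup with policy=<id> are all assumptions made for illustration. The completion body here carries only the status; the real wrapper also reports sizeBytes and snapshotId.

```shell
# Hypothetical helper: POST a JSON body to the API using the agent token.
api_post() {
  curl -fsS -X POST "${HYPRBOX_API_URL}$1" \
    -H "Authorization: Bearer ${HYPRBOX_NODE_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$2"
}

# Hypothetical helper: pull an "id":"..." value out of a JSON string (no jq needed).
extract_id() {
  printf '%s' "$1" | sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage: run_backup POLICY_ID PATH...
run_backup() {
  rb_policy="$1"; shift
  # 1. Create the BackupRun (RUNNING) and remember its id.
  rb_resp="$(api_post "/api/backups/${rb_policy}/runs" '{}')" || return 1
  rb_run="$(extract_id "$rb_resp")"
  [ -n "$rb_run" ] || return 1
  # 2. Run the backup itself; the exit code decides the final status.
  if restic backup --tag "policy=${rb_policy}" "$@"; then
    rb_status=SUCCEEDED
  else
    rb_status=FAILED
  fi
  # 3. Report completion for that run.
  api_post "/api/backups/runs/${rb_run}/complete" "{\"status\":\"${rb_status}\"}"
}
```

A cron-driven invocation would then look like `run_backup <id> /etc /var/lib/app`, with HYPRBOX_API_URL, HYPRBOX_NODE_TOKEN and the Restic variables already present in the environment.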
Setting up a node for backups
1. Provision an agent token bound to the node (Settings → Agent tokens in the dashboard). Set HYPRBOX_NODE_TOKEN on the agent.
2. Provide RESTIC_PASSWORD to the agent. With systemd:

       # /etc/systemd/system/hyprnode.service.d/restic.conf
       [Service]
       Environment="RESTIC_PASSWORD=<your-strong-password>"

   Or write it to a file (mode 0600) and reference file:/etc/hyprbox/restic.pass in the policy.
3. Decide where snapshots live. Restic supports local paths, SFTP, S3, Azure Blob, Backblaze B2, GCS, REST server, and rclone. Examples:
   - s3:s3.amazonaws.com/my-bucket/restic (with AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY in the agent's env)
   - b2:bucket-name:/restic
   - sftp:backup@backup-host:/srv/restic
   - /srv/restic (local; only useful for very small fleets with a shared NFS)
4. Create the policy from the dashboard: /dashboard/backups → "+ New policy". Pick the node, list the paths to include, paste the repo URL, set retention.
5. Click "Backup now →" to verify everything works end-to-end. The first run takes longer because Restic initialises the repo and ingests the full data set.
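Before clicking "Backup now →", it can help to confirm on the node that the prerequisites from the setup are in place. The following is a small sketch of such a check; the function name preflight is hypothetical and not part of the preset, and it only looks at the token, the password source, and the restic binary.

```shell
# Hypothetical check: list missing prerequisites on stderr and fail if any.
# Usage: preflight [PASSWORD_FILE]
preflight() {
  pass_file="${1:-/etc/hyprbox/restic.pass}"
  missing=""
  # The agent token bound to this node.
  [ -n "${HYPRBOX_NODE_TOKEN:-}" ] || missing="$missing HYPRBOX_NODE_TOKEN"
  # The password: either the environment variable or a readable password file.
  if [ -z "${RESTIC_PASSWORD:-}" ] && [ ! -r "$pass_file" ]; then
    missing="$missing RESTIC_PASSWORD"
  fi
  # The restic binary (or anything callable under that name).
  command -v restic >/dev/null 2>&1 || missing="$missing restic"
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
}
```

Note that the check must run in the same environment as the agent (for example via the systemd unit), otherwise variables set in a drop-in will not be visible.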
API surface
| Action | Endpoint | Auth |
|---|---|---|
| Create policy | POST /api/backups | user |
| List my policies | GET /api/backups[?nodeId] | user |
| Detail + last 20 runs | GET /api/backups/:id | user |
| Update / pause / resume | PATCH /api/backups/:id | user |
| Delete | DELETE /api/backups/:id | user |
| Trigger one-shot | POST /api/backups/:id/trigger | user |
| Start a run | POST /api/backups/:id/runs | node token |
| Complete a run | POST /api/backups/runs/:runId/complete | node token |
Both agent endpoints honour the token's optional nodeId pinning — a token
bound to node A cannot start runs for node B's policies.
The trigger endpoint queues a hypervault-restic job; you can watch it
stream in the Jobs page like any other deployment.
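As a rough sketch of what a create-policy request could look like, the helper below assembles a JSON body from the BackupPolicy fields in the data model. The exact request schema is an assumption (field names are taken from the model, not from an API spec), policy_payload is a hypothetical name, and the inputs are assumed to contain no quotes or backslashes since no JSON escaping is done.

```shell
# Hypothetical helper: build a JSON body for POST /api/backups.
# Usage: policy_payload NAME NODE_ID SCHEDULE REPO_URL PATH...
policy_payload() {
  pp_name="$1"; pp_node="$2"; pp_schedule="$3"; pp_repo="$4"; shift 4
  pp_paths=""
  for p in "$@"; do
    # Comma-separate the quoted paths to form a JSON array body.
    pp_paths="${pp_paths}${pp_paths:+,}\"${p}\""
  done
  printf '{"name":"%s","nodeId":"%s","paths":[%s],"schedule":"%s","repoUrl":"%s","passwordRef":"env:RESTIC_PASSWORD","keepDaily":7,"keepWeekly":4,"keepMonthly":6}\n' \
    "$pp_name" "$pp_node" "$pp_paths" "$pp_schedule" "$pp_repo"
}

# Example (user authentication is deployment-specific and omitted here):
#   policy_payload etc-nightly <node-id> '0 3 * * *' /srv/restic /etc \
#     | curl -fsS -X POST "$HYPRBOX_API_URL/api/backups" \
#         -H 'Content-Type: application/json' -d @-
```

The retention values 7/4/6 are only example defaults for the sketch; any values accepted by the dashboard form should work the same way.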
Variables fed into the preset
When /trigger renders the preset, these variables come from the policy row:
| Variable | Source | Used for |
|---|---|---|
| policy_id | policy.id | run correlation |
| backup_paths | policy.paths joined with : | restic backup args |
| repo_url | policy.repoUrl | RESTIC_REPOSITORY env |
| password_ref | policy.passwordRef | resolved by the wrapper script |
| keep_daily / keep_weekly / keep_monthly | policy.keep* | restic forget --keep-* |
| api_url | HYPRBOX_API_URL env on the API | wrapper's POST target |
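Because backup_paths arrives as a single colon-joined string, the wrapper has to split it back into separate arguments before calling restic backup. One way this could be done in plain shell is sketched below; with_backup_paths is a hypothetical name, and the approach assumes that no path contains a colon.

```shell
# Hypothetical helper: run a command with the colon-joined paths appended
# as individual arguments.
# Usage: with_backup_paths "/a:/b" COMMAND [ARGS...]
with_backup_paths() {
  joined="$1"; shift
  old_ifs="$IFS"
  IFS=':'
  set -f                 # no glob expansion while splitting
  set -- "$@" $joined    # unquoted on purpose: split on ':' only
  set +f
  IFS="$old_ifs"
  "$@"
}

# Example:
#   with_backup_paths "/etc:/var/lib/app" restic backup --tag policy=<id>
# runs: restic backup --tag policy=<id> /etc /var/lib/app
```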
Retention semantics
We pass --keep-daily N, --keep-weekly N, --keep-monthly N to Restic's
forget --prune. For each option, Restic keeps the most recent snapshot from
each of the last N days, weeks, or months that have at least one snapshot.
A setting of daily=7, weekly=4, monthly=6 therefore keeps at most
7 + 4 + 6 = 17 snapshots in steady state, and usually fewer, because one
snapshot can satisfy several buckets at once.
The wrapper runs forget --tag policy=<id> so each policy's retention is
independent. Two policies sharing a repository don't accidentally prune each
other's snapshots.
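Put together, the retention call for one policy looks roughly like the sketch below. The function name apply_retention is hypothetical; the restic flags themselves (--prune, --tag, --keep-daily, --keep-weekly, --keep-monthly) are standard options of restic forget.

```shell
# Hypothetical helper: apply one policy's retention to its own snapshots only.
# Usage: apply_retention POLICY_ID KEEP_DAILY KEEP_WEEKLY KEEP_MONTHLY
apply_retention() {
  # --tag restricts forget to snapshots tagged for this policy, so other
  # policies in the same repository are left alone.
  restic forget --prune \
    --tag "policy=$1" \
    --keep-daily "$2" \
    --keep-weekly "$3" \
    --keep-monthly "$4"
}
```

When experimenting with new values, adding restic's --dry-run flag to the same command shows which snapshots would be removed without deleting anything.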
Troubleshooting
Repo init fails on first run
The wrapper script runs restic init if restic snapshots errors. For
S3-compatible repos, that needs the bucket to exist AND the credentials to
have write access. Check the wrapper's stdout in the Job detail page.
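The init-on-failure behaviour described here amounts to something like the following sketch. The function name ensure_repo is hypothetical and the real wrapper may differ in detail; note that this pattern treats any failure of restic snapshots (including bad credentials) as "repo missing", which is why a permissions problem surfaces as a failed init.

```shell
# Hypothetical helper: initialise the repository only if it cannot be listed.
# Relies on RESTIC_REPOSITORY and the password already being set.
ensure_repo() {
  if restic snapshots >/dev/null 2>&1; then
    return 0    # repository exists and is readable
  fi
  restic init   # first run, or the listing failed for another reason
}
```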
bash: restic: command not found
The preset's package step installs restic, but apt may have updated to a
newer version since the repo was set up. Re-trigger the policy — the install
step is idempotent and will pull the current version.
Snapshots succeed but sizeBytes is 0
Restic's summary parsing in the wrapper relies on --json output. Very old
Restic versions (<0.12) don't emit that structure. Upgrade Restic or live
without the size metric.
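To check what a given Restic version emits, the summary line can be extracted by hand. The sketch below assumes the compact JSON that restic backup --json prints, where the final message has message_type "summary" and numeric fields such as total_bytes_processed and total_files_processed. Which of these fields the wrapper maps to sizeBytes is not stated here, so treat the field choice as an assumption; summary_field is a hypothetical name.

```shell
# Hypothetical helper: read restic --json output on stdin and print one
# numeric field from the summary message (empty output if there is none).
# Usage: restic backup --json ... | summary_field total_bytes_processed
summary_field() {
  field="$1"
  grep '"message_type":"summary"' \
    | sed -n "s/.*\"${field}\":\([0-9]*\).*/\1/p" \
    | tail -n 1
}
```

If this prints nothing for a run that succeeded, the installed Restic is likely too old to emit the summary, which matches the sizeBytes = 0 symptom.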
Run is RUNNING forever
The agent crashed or lost network mid-run. No automatic sweeper yet (Phase 4.5). Manually:
UPDATE "BackupRun" SET status='FAILED', error_msg='agent disappeared',
finished_at=now() WHERE id='<id>' AND status='RUNNING';
How to restore from a snapshot
Out of scope for the API right now — Phase 4.5 will add a guided restore flow. For now, SSH to the node and use Restic directly:
export RESTIC_REPOSITORY=s3:...
export RESTIC_PASSWORD=...
restic snapshots --tag policy=<id>
restic restore <snapshot-id> --target /tmp/restore
Limits
- 4 MB cap on wrapper stdout per job (same as any other Job's output).
- One run at a time per policy is recommended; the schema doesn't enforce it but Restic's lock file does at the repo level.
- No alerting on backup failures yet. The activity event published on completion (snap for SUCCEEDED, warn for FAILED) shows up in the Logs page and on the node detail timeline.