HyprNode (agent)

Statically-linked Go binary (~9 MB) that runs on each managed server. Two responsibilities, one tick loop: send heartbeats to the API, and run any queued jobs for this node.

Install

From source

cd agent/hyprnode
go build -trimpath -ldflags="-s -w" -o /usr/local/bin/hyprnode .

Docker

docker build -t hyprbox/hyprnode:0.1.0 agent/hyprnode
docker run --rm \
  -e HYPRBOX_API_URL=https://hyprbox.example.com \
  -e HYPRBOX_NODE_ID=prod-eu-1 \
  -e HYPRBOX_NODE_TOKEN=hbnt_... \
  hyprbox/hyprnode:0.1.0

The image is scratch + the binary + the CA bundle, ~10 MB. Runs as uid 65532.

systemd

# /etc/systemd/system/hyprnode.service
[Unit]
Description=HyprBox node agent
After=network.target

[Service]
Type=simple
User=root
EnvironmentFile=/etc/hyprbox/hyprnode.env
ExecStart=/usr/local/bin/hyprnode
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target

Why root? The agent applies system fixes (apt, ufw, ssh hardening, …). Presets escalate per command via sudo, so on a non-root user without passwordless sudo every fix fails at the password prompt. Running as root is the realistic posture for a server-management agent — the trust boundary is the node token, not the OS user. (If you must run unprivileged, give the agent user a NOPASSWD sudoers entry instead.)

# /etc/hyprbox/hyprnode.env
HYPRBOX_API_URL=https://hyprbox.example.com
HYPRBOX_NODE_ID=prod-eu-1
HYPRBOX_NODE_TOKEN=hbnt_...
HYPRBOX_INTERVAL=5m

chmod 600 /etc/hyprbox/hyprnode.env
systemctl enable --now hyprnode
journalctl -u hyprnode -f

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| HYPRBOX_API_URL | http://localhost:4000 | API base URL. No trailing slash. |
| HYPRBOX_NODE_ID | hostname | Unique identifier. Must match the nodeId of the pinned token (if any). |
| HYPRBOX_NODE_TOKEN | (empty) | Agent token (hbnt_…). Required in prod (HYPRBOX_REQUIRE_NODE_TOKEN=true on the API). |
| HYPRBOX_INTERVAL | 5m | Heartbeat cadence (Go duration string: 30s, 1m, 5m). |
| HYPRBOX_SCAN_INTERVAL | 15m | Scanner cadence, independent of heartbeats. |
| HYPRBOX_TLS_PATHS | /etc/letsencrypt/live/*/fullchain.pem:/etc/ssl/certs/hyprbox/*.crt | Colon-separated cert paths or globs. |
| HYPRBOX_TLS_WARN_DAYS | 30 | Below this many days until expiry → emit WARN. |
| HYPRBOX_TLS_CRITICAL_DAYS | 7 | Below this → CRITICAL. |
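
As an illustration of how these variables and defaults fit together, here is a minimal Go sketch of a config loader. The `Config` struct, `loadConfig`, and the injected `getenv` function are hypothetical names chosen for this example, not the agent's actual code; only the variable names and default values come from the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// Config mirrors the documented variables. The struct and its field names
// are illustrative, not the agent's real types.
type Config struct {
	APIURL          string
	NodeID          string
	NodeToken       string
	Interval        time.Duration
	ScanInterval    time.Duration
	TLSPaths        []string
	TLSWarnDays     int
	TLSCriticalDays int
}

const defaultTLSPaths = "/etc/letsencrypt/live/*/fullchain.pem:/etc/ssl/certs/hyprbox/*.crt"

// loadConfig reads settings through getenv (os.Getenv in production, a map
// lookup in tests) and falls back to the documented defaults.
func loadConfig(getenv func(string) string) (Config, error) {
	get := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	host, _ := os.Hostname() // default node ID; empty if the lookup fails

	cfg := Config{
		APIURL:    get("HYPRBOX_API_URL", "http://localhost:4000"),
		NodeID:    get("HYPRBOX_NODE_ID", host),
		NodeToken: getenv("HYPRBOX_NODE_TOKEN"),
		TLSPaths:  strings.Split(get("HYPRBOX_TLS_PATHS", defaultTLSPaths), ":"),
	}

	var err error
	if cfg.Interval, err = time.ParseDuration(get("HYPRBOX_INTERVAL", "5m")); err != nil {
		return cfg, fmt.Errorf("HYPRBOX_INTERVAL: %w", err)
	}
	if cfg.ScanInterval, err = time.ParseDuration(get("HYPRBOX_SCAN_INTERVAL", "15m")); err != nil {
		return cfg, fmt.Errorf("HYPRBOX_SCAN_INTERVAL: %w", err)
	}
	if cfg.TLSWarnDays, err = strconv.Atoi(get("HYPRBOX_TLS_WARN_DAYS", "30")); err != nil {
		return cfg, fmt.Errorf("HYPRBOX_TLS_WARN_DAYS: %w", err)
	}
	if cfg.TLSCriticalDays, err = strconv.Atoi(get("HYPRBOX_TLS_CRITICAL_DAYS", "7")); err != nil {
		return cfg, fmt.Errorf("HYPRBOX_TLS_CRITICAL_DAYS: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	// The token is deliberately left out of the log line.
	fmt.Printf("node=%s api=%s heartbeat=%s scan=%s\n",
		cfg.NodeID, cfg.APIURL, cfg.Interval, cfg.ScanInterval)
}
```

Passing `getenv` as a parameter keeps the loader testable without touching the process environment.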

The jobrunner and the scanner run on Linux only: when runtime.GOOS != "linux", both are disabled, while heartbeats still work on every platform. Use a VM or container to test those loops.

What the agent reports

Each heartbeat includes:

  • node_id, hostname, os — basic identity.
  • cpu_percent, ram_percent, disk_percent — current load (gopsutil).
  • status — always online from the agent; the API flips it to OFFLINE if no heartbeat arrives within 10 min.
  • agent_version — pinned at build time.
  • Inventory fields (ip, region, kernel, architecture) — sent optionally; the API only overwrites previous values when present.
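
The payload described above could be modeled roughly as follows. This is a sketch under assumptions: the JSON keys are taken from the field names in the list, but the Go struct, the numeric types, and the use of `omitempty` to leave out absent inventory fields are illustrative guesses rather than the agent's actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Heartbeat is a hypothetical shape for the heartbeat body.
type Heartbeat struct {
	NodeID       string  `json:"node_id"`
	Hostname     string  `json:"hostname"`
	OS           string  `json:"os"`
	CPUPercent   float64 `json:"cpu_percent"`
	RAMPercent   float64 `json:"ram_percent"`
	DiskPercent  float64 `json:"disk_percent"`
	Status       string  `json:"status"`
	AgentVersion string  `json:"agent_version"`

	// Inventory fields are optional: omitempty drops them from the JSON when
	// unset, so the API would keep whatever it stored previously.
	IP           string `json:"ip,omitempty"`
	Region       string `json:"region,omitempty"`
	Kernel       string `json:"kernel,omitempty"`
	Architecture string `json:"architecture,omitempty"`
}

// encode serializes a heartbeat to its JSON body.
func encode(hb Heartbeat) (string, error) {
	b, err := json.Marshal(hb)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encode(Heartbeat{
		NodeID: "prod-eu-1", Hostname: "web-1", OS: "linux",
		CPUPercent: 12.5, RAMPercent: 40, DiskPercent: 71,
		Status: "online", AgentVersion: "0.1.0",
		Region: "eu",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```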

Scanners

The agent runs a small set of local probes on a separate, slower cadence than heartbeats (default 15 min). Each scanner is a stateless function that returns a list of findings; the runner aggregates them across scanners and POSTs /api/findings once per tick.

| Scanner | What it looks at | Findings it emits |
| --- | --- | --- |
| scanner/tls.go | TLS certificates at HYPRBOX_TLS_PATHS (defaults to LE live tree + /etc/ssl/certs/hyprbox/*.crt) | tls.expiring (severity scales with days left), tls.unreadable (could not parse the PEM) |
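
To make "severity scales with days left" concrete, here is a small Go sketch of how the two thresholds might be applied. The function names `daysLeft` and `severityFor` are hypothetical, and the sketch assumes that a certificate at or above the warn threshold produces no finding at all; the real scanner may differ.

```go
package main

import (
	"fmt"
	"time"
)

// daysLeft returns whole days until expiry, truncated toward zero.
// An already-expired certificate yields zero or a negative number.
func daysLeft(notAfter, now time.Time) int {
	return int(notAfter.Sub(now).Hours() / 24)
}

// severityFor maps days left to a severity using strict "below" comparisons,
// matching the wording of HYPRBOX_TLS_WARN_DAYS / HYPRBOX_TLS_CRITICAL_DAYS.
// The boolean is false when no finding should be emitted (assumption).
func severityFor(days, warnDays, criticalDays int) (string, bool) {
	switch {
	case days < criticalDays:
		return "CRITICAL", true
	case days < warnDays:
		return "WARN", true
	default:
		return "", false
	}
}

func main() {
	now := time.Now()
	notAfter := now.Add(10 * 24 * time.Hour) // pretend the cert expires in 10 days
	days := daysLeft(notAfter, now)
	if sev, ok := severityFor(days, 30, 7); ok {
		fmt.Printf("tls.expiring severity=%s days_left=%d\n", sev, days)
	}
}
```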

Scanner-level invariants

  • A scanner returns no findings → silence on the wire. The API has its own staleness sweeper that flips long-unseen findings to RESOLVED (Part E).
  • Each finding has a stable key so re-detection upserts the same row. Dynamic information (days left, current usage) belongs in message / metadata, never in key.
  • A scanner that can't read the thing it's probing emits an INFO <scanner>.unreadable finding ONCE per affected path — useful for the "we tried but couldn't" case, not noisy enough to drown the OPEN list.

Adding a new scanner

  1. New file under internal/scanner/<name>.go exporting func Scan<Name>() []Finding.
  2. Wire it into Tick() in internal/scanner/runner.go (a plain function call).
  3. Document it in the table above + add the finding type(s) to docs/FINDINGS.md.
  4. (Optional) add a Recommendation row in apps/api/prisma/seed-recommendations.ts mapping the finding type to a vetted preset.

The dedup contract on the API side (@@unique([nodeId, type, key])) means your scanner can be naïve about state — just emit what you see; the API turns repeats into updates.
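
The effect of that contract can be modeled in a few lines of Go. This is only an in-memory illustration of the upsert semantics: the real store sits behind the API, and the `Finding` fields, `Store` type, and `tlsExpiring` helper below are hypothetical. It also shows why the key stays stable (the certificate path) while the day count goes into the message.

```go
package main

import "fmt"

// Finding is a hypothetical shape for what a scanner returns.
type Finding struct {
	Type     string
	Key      string // stable identity, e.g. the certificate path
	Severity string
	Message  string // dynamic details such as days left
}

// rowKey models the unique constraint on (nodeId, type, key).
type rowKey struct{ nodeID, typ, key string }

// Store is an in-memory stand-in for the findings table.
type Store struct{ rows map[rowKey]Finding }

func NewStore() *Store { return &Store{rows: make(map[rowKey]Finding)} }

// Upsert inserts or overwrites the row and reports whether it was created.
func (s *Store) Upsert(nodeID string, f Finding) (created bool) {
	k := rowKey{nodeID, f.Type, f.Key}
	_, exists := s.rows[k]
	s.rows[k] = f
	return !exists
}

// Get looks up a row by its unique key.
func (s *Store) Get(nodeID, typ, key string) (Finding, bool) {
	f, ok := s.rows[rowKey{nodeID, typ, key}]
	return f, ok
}

func (s *Store) Len() int { return len(s.rows) }

// tlsExpiring builds a finding whose key does not change between ticks.
func tlsExpiring(path string, days int) Finding {
	return Finding{
		Type:     "tls.expiring",
		Key:      path,
		Severity: "WARN",
		Message:  fmt.Sprintf("certificate expires in %d days", days),
	}
}

func main() {
	s := NewStore()
	path := "/etc/ssl/certs/hyprbox/api.crt"
	fmt.Println("created:", s.Upsert("prod-eu-1", tlsExpiring(path, 12)))
	fmt.Println("created:", s.Upsert("prod-eu-1", tlsExpiring(path, 11)))
	fmt.Println("rows:", s.Len())
}
```

Had the day count been part of the key, the second tick would have created a second row instead of updating the first.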

Job execution

After each successful heartbeat the agent drains pending jobs:

  1. GET /api/jobs/pending?nodeId=<HYPRBOX_NODE_ID>.
  2. If a job is returned, spawn /bin/bash -s with the script on stdin (no file ever touches disk).
  3. Buffer stdout + stderr in memory; flush to POST /api/jobs/:id/output every 1.5s.
  4. On exit, POST /api/jobs/:id/complete with exitCode, any final buffer contents, and errorMsg if the process couldn't be started.
  5. Loop back to step 1 until pending returns null, then sleep until the next interval.
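
Steps 2 to 4 can be sketched in Go as follows. This is a minimal illustration, not the agent's implementation: `runScript`, `lockedBuffer`, and the `flush` callback (standing in for the POST to /api/jobs/:id/output) are hypothetical names, and HTTP calls are left out so the example stays self-contained. It assumes /bin/bash exists on the host.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
	"sync"
	"time"
)

// flushInterval matches the documented 1.5s output cadence.
const flushInterval = 1500 * time.Millisecond

// lockedBuffer is safe to write from the child's output pipe while a
// flusher goroutine drains it.
type lockedBuffer struct {
	mu  sync.Mutex
	buf bytes.Buffer
}

func (b *lockedBuffer) Write(p []byte) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.buf.Write(p)
}

// take returns everything buffered so far and resets the buffer.
func (b *lockedBuffer) take() string {
	b.mu.Lock()
	defer b.mu.Unlock()
	s := b.buf.String()
	b.buf.Reset()
	return s
}

// runScript feeds script to `/bin/bash -s` on stdin, so nothing is written
// to disk. It calls flush with new output every flushInterval and once more
// after exit. errMsg is non-empty only if bash could not be started.
func runScript(script string, flush func(chunk string)) (exitCode int, errMsg string) {
	cmd := exec.Command("/bin/bash", "-s")
	cmd.Stdin = strings.NewReader(script)
	out := &lockedBuffer{}
	cmd.Stdout = out // stdout and stderr share one in-memory buffer
	cmd.Stderr = out

	if err := cmd.Start(); err != nil {
		return -1, err.Error()
	}

	done := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		t := time.NewTicker(flushInterval)
		defer t.Stop()
		for {
			select {
			case <-t.C:
				if chunk := out.take(); chunk != "" {
					flush(chunk)
				}
			case <-done:
				return
			}
		}
	}()

	_ = cmd.Wait() // the exit status is read from ProcessState below
	close(done)
	wg.Wait()
	if chunk := out.take(); chunk != "" {
		flush(chunk) // final buffer contents, sent with the completion
	}
	return cmd.ProcessState.ExitCode(), ""
}

func main() {
	code, errMsg := runScript("echo hello; echo oops 1>&2; exit 3", func(chunk string) {
		fmt.Printf("[output] %q\n", chunk)
	})
	fmt.Printf("exitCode=%d errorMsg=%q\n", code, errMsg)
}
```

Because the flusher goroutine has stopped before the final `take`, `flush` is never called concurrently, which keeps the callback simple.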

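Per-job timeout: 30 minutes (bashTimeoutPerJob). If a job exceeds it, the agent kills bash, reports a non-zero exit code (around 137, the conventional 128 + 9 for a process killed by SIGKILL), and moves on to the next job.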
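
For readers wondering where 137 comes from: shells report 128 plus the signal number for a signaled process, and SIGKILL is 9. The Go sketch below shows one way a timeout could produce that code. It is an assumption about the mechanism, not the agent's actual code; `runWithTimeout` is a hypothetical helper, and only the `bashTimeoutPerJob` name and its 30-minute value come from the text above. It is Unix-only because it inspects `syscall.WaitStatus`.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"syscall"
	"time"
)

// bashTimeoutPerJob is the documented per-job limit.
const bashTimeoutPerJob = 30 * time.Minute

// runWithTimeout runs script under `/bin/bash -s` and returns a shell-style
// exit code: the process's own code, or 128+signal if it was killed.
func runWithTimeout(script string, timeout time.Duration) int {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// When ctx expires, exec.CommandContext kills the process (SIGKILL on Unix).
	cmd := exec.CommandContext(ctx, "/bin/bash", "-s")
	cmd.Stdin = strings.NewReader(script)
	_ = cmd.Run()

	if cmd.ProcessState == nil {
		return -1 // bash never started
	}
	// Go reports -1 from ExitCode() for signaled processes, so translate the
	// wait status into the 128+N convention by hand.
	if ws, ok := cmd.ProcessState.Sys().(syscall.WaitStatus); ok && ws.Signaled() {
		return 128 + int(ws.Signal())
	}
	return cmd.ProcessState.ExitCode()
}

func main() {
	fmt.Println(runWithTimeout("exit 0", bashTimeoutPerJob))
	fmt.Println(runWithTimeout("sleep 5", 200*time.Millisecond))
}
```

Note that killing bash does not necessarily kill the children it spawned; a production runner would likely need process-group handling on top of this.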

Troubleshooting

heartbeat error: API returned 401

Token is missing, invalid, or revoked. Mint a new one from the dashboard (/dashboard/settings) or via POST /api/tokens.

heartbeat error: API returned 403

The token is pinned to a different nodeId than what the agent is sending. Either change HYPRBOX_NODE_ID, or mint a new token without nodeId (fleet-wide) or with the right nodeId.

collect error: ...

gopsutil couldn't read host metrics. On Linux this almost always means the agent is running in a container without /proc mounted, or on a kernel without the expected /sys entries. Check the container's volume mounts (/proc:ro,rslave for the Docker compose case).

Agent is silent / logs show nothing

Two likely causes:

  • HYPRBOX_API_URL unreachable. Test with curl $HYPRBOX_API_URL/health from the same host.
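  • Process exits immediately. If you launch the agent through a wrapper script, add set -x to it to trace the invocation. Under systemd, Restart=on-failure keeps retrying, so check journalctl -u hyprnode for a restart loop.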

Jobs queue but agent never picks them up

  • Verify the node is ONLINE (heartbeat side works).
  • Check the agent log for [jobrunner] poll error: .... Most likely unauthorized — check HYPRBOX_NODE_TOKEN.
  • If running on Windows for dev: the jobrunner is disabled by runtime.GOOS != "linux". That's intentional. Use a Linux container or VM.

Updating the agent

The agent has no auto-update mechanism — that's intentional. Push new builds via your config-management tool of choice (Ansible, Salt, plain scp + systemctl restart hyprnode).

The protocol is forward-compatible: an older API tolerates new optional heartbeat fields (Zod schema marks them .optional()), and an older agent that doesn't know about jobs simply skips the polling step.