HyprNode (agent)
Statically-linked Go binary (~9 MB) that runs on each managed server. Two responsibilities, one tick loop: send heartbeats to the API, and run any queued jobs for this node.
Install
From source
cd agent/hyprnode
go build -trimpath -ldflags="-s -w" -o /usr/local/bin/hyprnode .
Docker
docker build -t hyprbox/hyprnode:0.1.0 agent/hyprnode
docker run --rm \
-e HYPRBOX_API_URL=https://hyprbox.example.com \
-e HYPRBOX_NODE_ID=prod-eu-1 \
-e HYPRBOX_NODE_TOKEN=hbnt_... \
hyprbox/hyprnode:0.1.0
The image is scratch + the binary + the CA bundle, ~10 MB. Runs as uid 65532.
systemd unit (recommended for prod)
# /etc/systemd/system/hyprnode.service
[Unit]
Description=HyprBox node agent
After=network.target
[Service]
Type=simple
User=root
EnvironmentFile=/etc/hyprbox/hyprnode.env
ExecStart=/usr/local/bin/hyprnode
Restart=on-failure
RestartSec=10s
[Install]
WantedBy=multi-user.target
Why root? The agent applies system fixes (apt, ufw, ssh hardening, …). Presets escalate per command via sudo, so on a non-root user without passwordless sudo every fix fails at the password prompt. Running as root is the realistic posture for a server-management agent: the trust boundary is the node token, not the OS user. (If you must run unprivileged, give the agent user a NOPASSWD sudoers entry instead.)
# /etc/hyprbox/hyprnode.env
HYPRBOX_API_URL=https://hyprbox.example.com
HYPRBOX_NODE_ID=prod-eu-1
HYPRBOX_NODE_TOKEN=hbnt_...
HYPRBOX_INTERVAL=5m
chmod 600 /etc/hyprbox/hyprnode.env
systemctl enable --now hyprnode
journalctl -u hyprnode -f
Environment variables
| Variable | Default | Description |
|---|---|---|
| `HYPRBOX_API_URL` | `http://localhost:4000` | API base URL. No trailing slash. |
| `HYPRBOX_NODE_ID` | hostname | Unique identifier. Must match the `nodeId` of the pinned token (if any). |
| `HYPRBOX_NODE_TOKEN` | (empty) | Agent token (`hbnt_…`). Required in prod (`HYPRBOX_REQUIRE_NODE_TOKEN=true` on the API). |
| `HYPRBOX_INTERVAL` | `5m` | Heartbeat cadence (Go duration string: `30s`, `1m`, `5m`). |
| `HYPRBOX_SCAN_INTERVAL` | `15m` | Scanner cadence, independent of heartbeats. |
| `HYPRBOX_TLS_PATHS` | `/etc/letsencrypt/live/*/fullchain.pem:/etc/ssl/certs/hyprbox/*.crt` | Colon-separated cert paths or globs. |
| `HYPRBOX_TLS_WARN_DAYS` | `30` | Below this many days until expiry → emit WARN. |
| `HYPRBOX_TLS_CRITICAL_DAYS` | `7` | Below this → CRITICAL. |
The jobrunner and the scanner run on Linux only: runtime.GOOS != "linux" disables both of them, while heartbeats still work everywhere. Use a VM or container to test those loops.
What the agent reports
Each heartbeat includes:
- `node_id`, `hostname`, `os`: basic identity.
- `cpu_percent`, `ram_percent`, `disk_percent`: current load (gopsutil).
- `status`: always `online` from the agent; the API flips it to `OFFLINE` if no heartbeat arrives within 10 min.
- `agent_version`: pinned at build time.
- Inventory fields (`ip`, `region`, `kernel`, `architecture`): sent optionally; the API only overwrites previous values when present.
Scanners
The agent runs a small set of local probes on a separate, slower cadence
than heartbeats (default 15 min). Each scanner is a stateless function that
returns a list of findings; the runner aggregates them across scanners and
POSTs /api/findings once per tick.
| Scanner | What it looks at | Findings it emits |
|---|---|---|
| `scanner/tls.go` | TLS certificates at `HYPRBOX_TLS_PATHS` (defaults to LE live tree + `/etc/ssl/certs/hyprbox/*.crt`) | `tls.expiring` (severity scales with days left), `tls.unreadable` (could not parse the PEM) |
Scanner-level invariants
- A scanner returns no findings → silence on the wire. The API has its own staleness sweeper that flips long-unseen findings to RESOLVED (Part E).
- Each finding has a stable `key` so re-detection upserts the same row. Dynamic information (days left, current usage) belongs in `message`/`metadata`, never in `key`.
- A scanner that can't read the thing it's probing emits an INFO `<scanner>.unreadable` finding ONCE per affected path: useful for the "we tried but couldn't" case, not noisy enough to drown the OPEN list.
Adding a new scanner
- New file under `internal/scanner/<name>.go` exporting `func Scan<Name>() []Finding`.
- Wire it into `Tick()` in `internal/scanner/runner.go` (a plain function call).
- Document it in the table above + add the finding type(s) to `docs/FINDINGS.md`.
- (Optional) add a `Recommendation` row in `apps/api/prisma/seed-recommendations.ts` mapping the finding type to a vetted preset.
The dedup contract on the API side (@@unique([nodeId, type, key])) means
your scanner can be naïve about state — just emit what you see; the API
turns repeats into updates.
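The effect of that dedup contract can be imitated in a few lines of Go. This is an in-memory stand-in for the API's unique `(nodeId, type, key)` constraint, not the API's real (Prisma-based) implementation. The `Finding` fields, the `store` type, and the choice of the certificate path as `Key` are all assumptions for illustration.

```go
package main

import "fmt"

// Finding is an assumed shape; the real type lives in internal/scanner.
type Finding struct {
	Type     string
	Key      string // stable identity, e.g. the certificate path
	Severity string
	Message  string // dynamic detail such as days left
}

// store imitates the unique (nodeId, type, key) constraint in memory.
type store map[string]Finding

// upsert inserts a new row or overwrites the row with the same identity.
// It reports whether a new row was created.
func (s store) upsert(nodeID string, f Finding) bool {
	// NUL separators keep distinct (nodeID, type, key) triples from colliding.
	id := fmt.Sprintf("%s\x00%s\x00%s", nodeID, f.Type, f.Key)
	_, existed := s[id]
	s[id] = f
	return !existed
}
```

Emitting the same `tls.expiring` finding on two consecutive ticks, with only `Message` changed from "expires in 20 days" to "expires in 19 days", leaves a single row in the store. Putting the day count into `Key` instead would create a new row on every tick, which is why dynamic data stays out of the key.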
Job execution
After each successful heartbeat the agent drains pending jobs:
1. `GET /api/jobs/pending?nodeId=<HYPRBOX_NODE_ID>`.
2. If a job is returned, spawn `/bin/bash -s` with the script on stdin (no file ever touches disk).
3. Buffer stdout + stderr in memory; flush to `POST /api/jobs/:id/output` every 1.5s.
4. On exit, `POST /api/jobs/:id/complete` with `exitCode`, any final buffer contents, and `errorMsg` if the process couldn't be started.
5. Loop back to step 1 until `pending` returns `null`, then sleep until the next interval.
Per-job timeout: 30 minutes (bashTimeoutPerJob). When a job hits it, the agent kills bash, reports an exit code of roughly 137 (128 + 9, the shell convention for a process ended by SIGKILL), and moves on.
Troubleshooting
heartbeat error: API returned 401
Token is missing, invalid, or revoked. Mint a new one from the dashboard
(/dashboard/settings) or via POST /api/tokens.
heartbeat error: API returned 403
The token is pinned to a different nodeId than the one the agent is sending. Either change HYPRBOX_NODE_ID, or mint a new token without a nodeId (fleet-wide) or with the right nodeId.
collect error: ...
gopsutil couldn't read host metrics. On Linux this almost always means the
agent is running in a container without /proc mounted, or on a kernel
without the expected /sys entries. Check the container's volume mounts
(/proc:ro,rslave for the Docker compose case).
Agent is silent / logs show nothing
Two likely causes:
- `HYPRBOX_API_URL` unreachable. Test with `curl $HYPRBOX_API_URL/health` from the same host.
- Process exits immediately. Add `set -x` to your invocation; with `Restart=on-failure` systemd will keep retrying.
Jobs queue but agent never picks them up
- Verify the node is `ONLINE` (heartbeat side works).
- Check the agent log for `[jobrunner] poll error: ...`. Most likely `unauthorized`: check `HYPRBOX_NODE_TOKEN`.
- If running on Windows for dev: the jobrunner is disabled by `runtime.GOOS != "linux"`. That's intentional. Use a Linux container or VM.
Updating the agent
The agent has no auto-update mechanism — that's intentional. Push new builds
via your config-management tool of choice (Ansible, Salt, plain scp +
systemctl restart hyprnode).
The protocol is forward-compatible: an older API tolerates new optional
heartbeat fields (Zod schema marks them .optional()), and an older agent
that doesn't know about jobs simply skips the polling step.