HyprNode (agent)

Statically-linked Go binary (~9 MB) that runs on each managed server. Two responsibilities, one tick loop: send heartbeats to the API, and run any queued jobs for this node.

Install

From source

cd agent/hyprnode
go build -trimpath -ldflags="-s -w" -o /usr/local/bin/hyprnode .

Docker

docker build -t hyprbox/hyprnode:0.1.0 agent/hyprnode
docker run --rm \
  -e HYPRBOX_API_URL=https://hyprbox.example.com \
  -e HYPRBOX_NODE_ID=prod-eu-1 \
  -e HYPRBOX_NODE_TOKEN=hbnt_... \
  hyprbox/hyprnode:0.1.0

The image is scratch + the binary + the CA bundle, ~10 MB. Runs as uid 65532.

systemd unit (recommended for prod)

# /etc/systemd/system/hyprnode.service
[Unit]
Description=HyprBox node agent
After=network.target

[Service]
Type=simple
User=root
EnvironmentFile=/etc/hyprbox/hyprnode.env
ExecStart=/usr/local/bin/hyprnode
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target

Why root? The agent applies system fixes (apt, ufw, ssh hardening, …). Presets escalate per command via sudo, so on a non-root user without passwordless sudo every fix fails at the password prompt. Running as root is the realistic posture for a server-management agent — the trust boundary is the node token, not the OS user. (If you must run unprivileged, give the agent user a NOPASSWD sudoers entry instead.)

# /etc/hyprbox/hyprnode.env
HYPRBOX_API_URL=https://hyprbox.example.com
HYPRBOX_NODE_ID=prod-eu-1
HYPRBOX_NODE_TOKEN=hbnt_...
HYPRBOX_INTERVAL=5m

chmod 600 /etc/hyprbox/hyprnode.env
systemctl enable --now hyprnode
journalctl -u hyprnode -f

Environment variables

Variable	Default	Description
`HYPRBOX_API_URL`	`http://localhost:4000`	API base URL. No trailing slash.
`HYPRBOX_NODE_ID`	hostname	Unique identifier. Must match the `nodeId` of the pinned token (if any).
`HYPRBOX_NODE_TOKEN`	(empty)	Agent token (`hbnt_…`). Required in prod (`HYPRBOX_REQUIRE_NODE_TOKEN=true` on the API).
`HYPRBOX_INTERVAL`	`5m`	Heartbeat cadence (Go duration string: `30s`, `1m`, `5m`).
`HYPRBOX_SCAN_INTERVAL`	`15m`	Scanner cadence — independent of heartbeats.
`HYPRBOX_TLS_PATHS`	`/etc/letsencrypt/live/*/fullchain.pem:/etc/ssl/certs/hyprbox/*.crt`	Colon-separated cert paths or globs.
`HYPRBOX_TLS_WARN_DAYS`	`30`	Below this many days until expiry → emit WARN.
`HYPRBOX_TLS_CRITICAL_DAYS`	`7`	Below this → CRITICAL.

The jobrunner and the scanner run on Linux only: runtime.GOOS != "linux" disables both, while heartbeats still work everywhere. Use a Linux VM or container to test those loops.

What the agent reports

Each heartbeat includes:

node_id, hostname, os — basic identity.
cpu_percent, ram_percent, disk_percent — current load (gopsutil).
status — always online from the agent; the API flips it to OFFLINE if no heartbeat arrives within 10 min.
agent_version — pinned at build time.
Inventory fields (ip, region, kernel, architecture) — sent optionally; the API only overwrites previous values when present.

Scanners

The agent runs a small set of local probes on a separate, slower cadence than heartbeats (default 15 min). Each scanner is a stateless function that returns a list of findings; the runner aggregates them across scanners and POSTs /api/findings once per tick.

Scanner	What it looks at	Findings it emits
`scanner/tls.go`	TLS certificates at `HYPRBOX_TLS_PATHS` (defaults to LE live tree + `/etc/ssl/certs/hyprbox/*.crt`)	`tls.expiring` (severity scales with days left), `tls.unreadable` (could not parse the PEM)

Scanner-level invariants

A scanner returns no findings → silence on the wire. The API has its own staleness sweeper that flips long-unseen findings to RESOLVED (Part E).
Each finding has a stable key so re-detection upserts the same row. Dynamic information (days left, current usage) belongs in message / metadata, never in key.
A scanner that can't read the thing it's probing emits an INFO <scanner>.unreadable finding ONCE per affected path — useful for the "we tried but couldn't" case, not noisy enough to drown the OPEN list.

Adding a new scanner

New file under internal/scanner/<name>.go exporting func Scan<Name>() []Finding.
Wire it into Tick() in internal/scanner/runner.go (a plain function call).
Document it in the table above + add the finding type(s) to docs/FINDINGS.md.
(Optional) add a Recommendation row in apps/api/prisma/seed-recommendations.ts mapping the finding type to a vetted preset.

The dedup contract on the API side (@@unique([nodeId, type, key])) means your scanner can be naïve about state — just emit what you see; the API turns repeats into updates.
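
The stable-key rule is easiest to see in code. The sketch below assumes a `Finding` shape with `Type`, `Key`, `Severity`, `Message`, and `Metadata` fields, and invents a `disk.full` finding type with hard-coded readings purely for illustration; none of these names are confirmed by the agent's source. The point is that the key is the mount path, while the changing percentage lives only in the message and metadata.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is an assumed shape; the agent's real type may differ.
type Finding struct {
	Type     string
	Key      string
	Severity string
	Message  string
	Metadata map[string]interface{}
}

// diskFindings is a hypothetical helper: one finding per mount at or above
// threshold. The key is the mount path only, so re-detection with a
// different percentage upserts the same row on the API side.
func diskFindings(usage map[string]float64, threshold float64) []Finding {
	mounts := make([]string, 0, len(usage))
	for m := range usage {
		mounts = append(mounts, m)
	}
	sort.Strings(mounts) // deterministic output order

	var out []Finding
	for _, m := range mounts {
		pct := usage[m]
		if pct < threshold {
			continue
		}
		out = append(out, Finding{
			Type:     "disk.full", // hypothetical finding type
			Key:      m,           // stable: never includes the percentage
			Severity: "WARN",
			Message:  fmt.Sprintf("%s is %.0f%% full", m, pct),
			Metadata: map[string]interface{}{"percent": pct},
		})
	}
	return out
}

// ScanDisk follows the Scan<Name>() []Finding shape. The readings are
// hard-coded here; a real scanner would probe the host.
func ScanDisk() []Finding {
	return diskFindings(map[string]float64{"/": 91.4, "/var": 42.0}, 90)
}

func main() {
	for _, f := range ScanDisk() {
		fmt.Printf("%s key=%s %s\n", f.Type, f.Key, f.Message)
	}
}
```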

Job execution

After each successful heartbeat the agent drains pending jobs:

1. GET /api/jobs/pending?nodeId=<HYPRBOX_NODE_ID>.
2. If a job is returned, spawn /bin/bash -s with the script on stdin (no file ever touches disk).
3. Buffer stdout + stderr in memory; flush to POST /api/jobs/:id/output every 1.5s.
4. On exit, POST /api/jobs/:id/complete with exitCode, any final buffer contents, and errorMsg if the process couldn't be started.
5. Loop back to step 1 until pending returns null, then sleep until the next interval.

Per-job timeout: 30 minutes (bashTimeoutPerJob). When a job hits it, the agent kills bash, reports an exit code of roughly 137 (the shell convention for a process killed by SIGKILL, 128 + 9), and moves on.

Troubleshooting

`heartbeat error: API returned 401`

Token is missing, invalid, or revoked. Mint a new one from the dashboard (/dashboard/settings) or via POST /api/tokens.

`heartbeat error: API returned 403`

The token is pinned to a different nodeId than the one the agent is sending. Either change HYPRBOX_NODE_ID, or mint a new token without a nodeId (fleet-wide) or with the right nodeId.

`collect error: ...`

gopsutil couldn't read host metrics. On Linux this almost always means the agent is running in a container without /proc mounted, or on a kernel without the expected /sys entries. Check the container's volume mounts (/proc:ro,rslave for the Docker compose case).

Agent is silent / logs show nothing

Two likely causes:

HYPRBOX_API_URL unreachable. Test with curl $HYPRBOX_API_URL/health from the same host.
Process exits immediately. Add set -x to your invocation; with Restart=on-failure systemd will keep retrying.

Jobs queue but agent never picks them up

Verify the node is ONLINE (heartbeat side works).
Check the agent log for [jobrunner] poll error: .... Most likely unauthorized — check HYPRBOX_NODE_TOKEN.
If running on Windows for dev: the jobrunner is disabled by runtime.GOOS != "linux". That's intentional. Use a Linux container or VM.

Updating the agent

The agent has no auto-update mechanism — that's intentional. Push new builds via your config-management tool of choice (Ansible, Salt, plain scp + systemctl restart hyprnode).

The protocol is forward-compatible: an older API tolerates new optional heartbeat fields (Zod schema marks them .optional()), and an older agent that doesn't know about jobs simply skips the polling step.
