ctrl-exec - Manual Verification Checks
Checks that cannot be automated and should be run after install or significant changes
ctrl-exec - Manual Verification Checks
Some behaviours cannot be verified by the automated test suite. These checks should be performed after initial installation, after upgrades, and after any change to the systemd unit, agent configuration, or certificate infrastructure.
Each check identifies what to verify, how to verify it, and what a passing result looks like.
1. Agent Syslog Output
Confirms the agent can write to syslog and that entries are reaching the
system log. Silent syslog failures are a known failure mode — Sys::Syslog's
openlog() does not raise an error when the socket is unavailable.
Known cause on Debian/systemd: PrivateDevices=yes in the unit file prevents
access to /dev/log. Fix: PrivateDevices=no and add AF_UNIX to
RestrictAddressFamilies. OpenWrt is not affected (procd does not implement
PrivateDevices).
On each agent host, trigger a ping and check the log:
# From the ctrl-exec host
sudo ctrl-exec ping <agent>
# On the agent host (Debian/systemd)
sudo journalctl -u ctrl-exec-agent --since "1 minute ago"
# On the agent host (OpenWrt)
logread | grep ctrl-exec-agent | tail -10
- Pass
-
ACTION=ping PEER=<ctrl-exec-ip> REQID=<hex>appears within a few seconds of the ping. - Fail
-
No entry appears. Check
PrivateDevicesandRestrictAddressFamiliesin the unit file. Restart after any change:systemctl restart ctrl-exec-agent.
2. Agent Self-Ping — Live Network Verification
Confirms the agent is listening on port 7443, the TLS stack is functional, and the agent is actively enforcing serial policy. Requires the agent to be running and paired. Run this check on the agent host directly — no ctrl-exec access is needed.
sudo ctrl-exec-agent self-ping
- Pass
-
Output shows port listening, mTLS handshake OK, and
403 serial mismatch (expected). The 403 is the correct result — the agent's own cert is not a ctrl-exec cert, and the agent correctly rejects it. - Fail — port not listening
-
The agent service is not running or is bound to a different port. Check
systemctl status ctrl-exec-agentand theportsetting inagent.conf. - Fail — mTLS handshake error
-
TLS configuration is broken. Check cert and CA paths in
agent.confand confirm the cert files are readable by the agent process. - Fail — unexpected response or no response
-
The agent accepted the connection but did not respond as expected. Check
the agent syslog for errors. Also run
self-checkto confirm the configuration is valid.
Run this check after initial installation and after any change to the agent service, port configuration, or cert files.
3. Systemd Unit Hardening — AF_UNIX Present
Confirms RestrictAddressFamilies includes AF_UNIX. Omitting it silently
blocks all syslog output because Sys::Syslog uses a Unix domain socket to
reach journald.
systemctl cat ctrl-exec-agent | grep RestrictAddressFamilies
- Pass
-
Output contains
AF_UNIX AF_INET AF_INET6. - Fail
-
AF_UNIXis absent. Add it to the unit file, reload systemd, restart the agent. Verify syslog output (check 1) after correcting.
4. Auth Hook Invocation
Confirms the auth hook is called for every run and ping request, and that
its exit code is respected.
For deployments with SSH access to the agent, test 08 (08-auth-hook.sh) in
the integration suite covers hook invocation and denial end-to-end.
For deployments without SSH access, test 15 (15-agent-auth-context.sh)
provides equivalent coverage using a pre-installed hook and a dedicated
allowlisted script to retrieve results via dispatch. Install it with:
sudo bash t/integration/setup-agent-scripts.sh --install-auth-test
sudo bash t/integration/15-agent-auth-context.sh
For a quick manual check on any deployment, configure a minimal hook that logs to syslog and permits all requests:
cat > /etc/ctrl-exec/auth-hook << 'EOF'
#!/bin/sh
logger -t ctrl-exec-auth "hook called: SCRIPT=$ENVEXEC_SCRIPT USER=$ENVEXEC_USERNAME"
exit 0
EOF
chmod 755 /etc/ctrl-exec/auth-hook
Run a script, then check the log on the agent host:
sudo ctrl-exec run <agent> env-dump
sudo journalctl -t ctrl-exec-auth --since "1 minute ago"
- Pass
-
Log entry appears for the run request. Changing
exit 0toexit 1causes all subsequent requests to return a permission error. - Fail
-
No log entry. Check
auth_hookpath inagent.confand that the hook is executable.
5. Allowlist SIGHUP Reload
Confirms the agent reloads scripts.conf on SIGHUP without restarting, and
that newly added entries take effect immediately.
# Add a new entry to scripts.conf on the agent
echo "reload-test = /opt/ctrl-exec-scripts/env-dump.sh" \
>> /etc/ctrl-exec-agent/scripts.conf
# Reload without restart
systemctl reload ctrl-exec-agent # or: /etc/init.d/ctrl-exec-agent reload
# Attempt to run the new entry from the ctrl-exec
sudo ctrl-exec run <agent> reload-test
- Pass
- The script runs successfully without restarting the agent.
- Fail
-
Request is rejected as not permitted. Check that SIGHUP is delivered
(
ExecReload=/bin/kill -HUP $MAINPIDin the unit file) and that the new entry is syntactically correct inscripts.conf.
6. Rate Limit Block and Recovery
Confirms the volume rate limiter blocks a source IP after exceeding the
threshold and that the block expires correctly. The unit test (t/rate-limit.t)
covers the logic; this check verifies end-to-end behaviour on a live agent.
Not suitable for automated suite runs — it requires deliberately triggering and waiting out a 5-minute block.
# Fire 11 rapid pings from the ctrl-exec to one agent
for i in $(seq 1 11); do sudo ctrl-exec ping <agent>; done
# The 11th should fail or return an error
# Check the agent log for the rate-block entry
sudo journalctl -u ctrl-exec-agent --since "1 minute ago" | grep rate-block
- Pass
-
ACTION=rate-block PEER=<ctrl-exec-ip> REASON=volumeappears in the log. Subsequent pings fail withno response from childfor approximately 5 minutes, then recover automatically. - Fail
-
No block occurs. Check
rate_limit_disableis not set inagent.confon a production agent. - Note
-
Set
rate_limit_disable = 1inagent.confbefore running the integration test suite, and remove it when done.
7. Pairing Flow — Fresh Agent
Confirms the full pairing sequence works end-to-end: agent submits CSR, ctrl-exec displays the pairing code, operator approves, agent stores certs.
Run on a host that has not previously been paired, or after clearing
/etc/ctrl-exec-agent/agent.{key,crt}:
# On the agent host
sudo ctrl-exec-agent request-pairing --dispatcher <ctrl-exec-hostname>
# On the ctrl-exec host (in a separate terminal)
sudo ctrl-exec list-requests
# Verify the hostname, source IP, and 6-digit pairing code match
# what the agent displayed, then approve:
sudo ctrl-exec approve <agent-hostname>
# Confirm the agent accepted the cert
sudo ctrl-exec ping <agent-hostname>
- Pass
-
ACTION=pair-completeappears in the agent log.ctrl-exec pingreturns ok. - Fail
-
Pairing code mismatch - reject and investigate. Nonce mismatch - check for
clock skew or concurrent pairing requests. Writability failure - check
/etc/ctrl-exec-agentpermissions.
8. Cert Rotation Broadcast
Confirms that ctrl-exec rotate-cert reaches all registered agents and that
each agent updates its stored ctrl-exec serial.
sudo ctrl-exec rotate-cert
sudo ctrl-exec serial-status
- Pass
-
All agents show
currentinserial-statusoutput.ACTION=serial-updateappears in the log on each agent. - Fail
-
One or more agents remain
pending. The agent was unreachable during the broadcast. Re-runrotate-certafter restoring connectivity. If the overlap window expires, the agent requires re-pairing.
9. Revocation Takes Effect
Confirms that adding a serial to revoked-serials on an agent causes
subsequent connections from that cert to be rejected, without restarting the
agent.
# Obtain the ctrl-exec cert serial
openssl x509 -noout -serial -in /etc/ctrl-exec/ctrl-exec.crt \
| sed 's/serial=//' | tr 'A-F' 'a-f'
# Add to revoked-serials on the agent and reload
echo "<serial>" >> /etc/ctrl-exec-agent/revoked-serials
systemctl reload ctrl-exec-agent
# Attempt a ping - should fail
sudo ctrl-exec ping <agent>
# Check the agent log
sudo journalctl -u ctrl-exec-agent --since "1 minute ago" | grep revoked
- Pass
-
ACTION=cert-revokedor similar appears in the log. Ping fails. - Restore
-
Remove the serial from
revoked-serialsand reload before returning to normal operation.
10. Agent Restart Recovery
Confirms the agent restarts cleanly after a crash and that Restart=on-failure
in the unit file is functioning.
# On the agent host, kill the agent process abruptly
sudo kill -9 $(systemctl show -p MainPID ctrl-exec-agent | cut -d= -f2)
# Wait 5 seconds (RestartSec=5) then check
sleep 6
systemctl is-active ctrl-exec-agent
sudo ctrl-exec ping <agent>
- Pass
-
Agent returns to
activewithin a few seconds. Ping succeeds. - Fail
-
Agent remains in
failedstate. Checkjournalctl -u ctrl-exec-agentfor the failure reason. Common causes: cert file permissions changed, config parse error introduced since last start.
11. OpenWrt — procd Restart and logread
OpenWrt-specific. Confirms the agent runs under procd, survives a restart, and
logs to the ring buffer readable via logread.
# On the OpenWrt agent
/etc/init.d/ctrl-exec-agent restart
sleep 3
/etc/init.d/ctrl-exec-agent status
# From the ctrl-exec host
sudo ctrl-exec ping <openwrt-agent>
# On the OpenWrt agent
logread | grep ctrl-exec-agent | tail -10
- Pass
-
Status shows running. Ping succeeds.
ACTION=pingentry appears inlogreadoutput. - Fail
-
Agent does not start. Check
/etc/init.d/ctrl-exec-agentscript for correct interpreter path. OpenWrt may have Perl in a non-standard location — verify withwhich perland ensure the shebang line inctrl-exec-agentmatches.
When to Run These Checks
- After initial installation
- Checks 1, 2, 3, 7, 10 (and 11 if OpenWrt agents are present).
- After a unit file or agent.conf change
- Checks 1, 2, 3, 5, 6 as applicable to what changed.
- After a cert rotation or renewal
- Checks 8, 9.
- Before a production release
- All checks on at least one Debian agent and one OpenWrt agent.
- After any security incident or suspected compromise
- Checks 7, 8, 9 as a minimum. Consider full re-pairing of affected agents.