Governed autonomy: how we let an agent act on production

The question that decides whether you can deploy an IT-operations agent is not how smart it is. It is this: how does an agent earn the right to act on production?

Most automation vendors dodge it. The agent only suggests, and a human does the dangerous part. That is safe, and it is also where the value leaks out. The whole reason to put an agent in the loop is so it can do the work, not narrate it.

So we took a harder line. An agent can act on shared production state across cloud, storage, networking, security, and databases, but only if the act is engineered to be bounded, gated, and recorded. We call that governed autonomy. This post walks through it from top to bottom. The mutation gate. The approval flow. The signed audit trail. The read-only subagents. The curated-only skill supply chain. And the integration layer that puts all of it in front of the systems you already run. Every mechanism here exists in the product today except curated-skill distribution, which is a launch capability we are finalizing now. I flag it where it comes up.

First, the default posture

Before any of the machinery, there is a rule written into the agent's own instructions. Operate the way a good senior engineer does. Take reversible local actions freely: reading files, running checks, dry-runs, status queries. For anything that changes shared state (deploying, modifying a config, force-pushing, sending a notification, opening a ticket, calling a write on an external system), say what you are about to do in one line and wait for confirmation. Investigate before fixing. Verify before reporting done. Never take a destructive shortcut to make an obstacle go away.

That rule is the soul of the thing. Everything below is the enforcement, because a rule the agent merely believes is not a control. A control is something the agent cannot route around.

The mutation gate

At the center is a policy layer we call the mutation gate. Before any tool call has a side effect, the gate looks at it and returns one of three verdicts. Allow, for reversible or read-only work, which proceeds. Approval-required, for anything that changes shared state, which pauses for a human. Block, for anything categorically unsafe, which is refused outright with a reason.
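To make the three verdicts concrete, here is a minimal sketch in Python. It illustrates the shape of the decision, not Sia's implementation. The names Verdict, ToolCall, and decide, and the two boolean fields on the call, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    APPROVAL_REQUIRED = "approval_required"
    BLOCK = "block"


@dataclass(frozen=True)
class ToolCall:
    # Hypothetical shape: a real gate would classify the call itself
    # rather than trust flags supplied with it.
    name: str
    mutates_shared_state: bool
    categorically_unsafe: bool = False


def decide(call: ToolCall) -> tuple[Verdict, str]:
    """Return a verdict and a human-readable reason for one tool call."""
    if call.categorically_unsafe:
        return Verdict.BLOCK, f"{call.name}: categorically unsafe, refused"
    if call.mutates_shared_state:
        return Verdict.APPROVAL_REQUIRED, f"{call.name}: changes shared state"
    return Verdict.ALLOW, f"{call.name}: reversible or read-only"
```

The order of the checks is the point. Block is tested first, so nothing that is categorically unsafe can be downgraded to a mere approval prompt.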

Two examples of how concrete this is.

For shell commands, the gate reads the command string. A configurable set of patterns triggers an approval requirement. Another set is blocked entirely. And there is one hard structural rule that exists to kill the most infamous foot-gun in operations. A recursive forced delete (rm -rf) aimed at a protected path (/, ~, $HOME, the working directory, or .) is blocked. Not warned about. The gate parses the command, checks the flags, resolves the target, and refuses. That is the difference between a guardrail and a suggestion. The agent does not get to wipe your root filesystem even if a prompt, a bug, or a confused chain of reasoning leads it there.
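The parse, check flags, resolve target, refuse sequence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gate's actual code. The function name is_blocked_rm is hypothetical, it handles only a single plain rm invocation (no pipes, sudo, or chained commands), and failing closed on an unparsable command is my assumption.

```python
import os
import shlex


def is_blocked_rm(command: str, cwd: str, home: str) -> bool:
    """True if `command` is a recursive forced rm aimed at a protected path."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # assumption: an unparsable command fails closed
    if not tokens or os.path.basename(tokens[0]) != "rm":
        return False

    recursive = force = options_done = False
    targets = []
    for tok in tokens[1:]:
        if options_done or not tok.startswith("-") or tok == "-":
            targets.append(tok)
        elif tok == "--":
            options_done = True
        elif tok.startswith("--"):
            recursive = recursive or tok == "--recursive"
            force = force or tok == "--force"
        else:  # bundled short flags such as -rf or -Rf
            recursive = recursive or "r" in tok[1:] or "R" in tok[1:]
            force = force or "f" in tok[1:]
    if not (recursive and force):
        return False

    protected = {os.path.normpath(p) for p in ("/", home, cwd)}
    for target in targets:
        # Expand the unexpanded shell forms the gate sees in the raw string.
        for prefix in ("${HOME}", "$HOME", "~"):
            if target == prefix or target.startswith(prefix + "/"):
                target = home + target[len(prefix):]
                break
        resolved = os.path.normpath(os.path.join(cwd, target))
        if resolved in protected:
            return True
    return False
```

With cwd set to /srv/app and home set to /home/op, the commands rm -rf /, rm -fr ., and rm -r --force ~ all resolve to a protected path and are refused, while rm -rf ./build is not.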

For the systems you reach over the integration layer, the gate works at the level of the specific operation. You can require approval for, or outright block, a given operation on a given system. A read against your observability platform flows freely. A write against your firewall controller, a delete against a storage volume, or an IAM change on a cloud account pauses. All from policy, without touching the agent's wiring.
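As a rough picture of what per-operation policy could look like, here is a small Python sketch. The system names, operation names, wildcard convention, and the verdict_for helper are all hypothetical, chosen only to mirror the examples above. They are not Sia's policy format.

```python
ALLOW = "allow"
APPROVAL_REQUIRED = "approval_required"
BLOCK = "block"

# Hypothetical policy table keyed by (system, operation).
# "*" stands for every operation on that system.
EXAMPLE_POLICY = {
    ("firewall", "update_rule"): APPROVAL_REQUIRED,
    ("storage", "delete_volume"): APPROVAL_REQUIRED,
    ("storage", "format_pool"): BLOCK,
    ("cloud-iam", "*"): APPROVAL_REQUIRED,
}


def verdict_for(system, operation, policy=EXAMPLE_POLICY, default=ALLOW):
    """Look up the most specific rule, then the system-wide rule, then the default."""
    if (system, operation) in policy:
        return policy[(system, operation)]
    return policy.get((system, "*"), default)
```

The default of allow is an assumption made so that unlisted reads flow freely in the example. A stricter deployment could just as well default to approval-required. Either way the behavior changes by editing the table, not the agent.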

Approval is session-scoped, and you can grant it in bulk when you want to. If you are deliberately running a change and you have decided you trust a class of action for this session, you can pre-authorize it, and the gate records that you did. The default is still to ask.
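A sketch of what session-scoped pre-authorization implies, in Python. The Session class, its field names, and the action-class strings are assumptions for illustration. What the sketch preserves from the description is that a grant lives and dies with the session and that granting it leaves a record.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Both containers start empty, so every new session defaults to asking.
    preauthorized: set = field(default_factory=set)
    audit: list = field(default_factory=list)

    def preauthorize(self, action_class: str) -> None:
        """Trust one class of action for this session and record the grant."""
        self.preauthorized.add(action_class)
        self.audit.append({"event": "preauthorize", "action_class": action_class})

    def needs_prompt(self, action_class: str) -> bool:
        """True unless the operator pre-authorized this class in this session."""
        return action_class not in self.preauthorized
```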

A human in the loop, by construction

When the gate returns approval-required, the action does not run. It enters an approval queue, and the operator is asked to approve or deny it, in the same terminal, in the flow of the work.

The queue is single-flight on purpose. One approval is in flight at a time. If the agent tries several mutating actions at once, they are serialized so the operator answers one clear question at a time instead of rubber-stamping a batch. Aborts are handled cleanly. If the operator cancels mid-stream, the pending action resolves as cancelled and the session stays coherent. There is no path where a gated action slips through while attention is elsewhere.
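The single-flight behavior can be sketched with a lock around the operator prompt. This is a minimal Python illustration, not the product's queue. ApprovalQueue, OperatorAbort, and the ask callback are hypothetical names, and modeling an abort as an exception raised by the prompt is my assumption.

```python
import asyncio


class OperatorAbort(Exception):
    """Raised by the prompt callback when the operator cancels mid-stream."""


class ApprovalQueue:
    def __init__(self, ask):
        # `ask` is an async callable: description -> bool (approve or deny).
        self._ask = ask
        self._lock = asyncio.Lock()

    async def request(self, description: str) -> str:
        """Serialize prompts so only one approval is in flight at a time."""
        async with self._lock:
            try:
                approved = await self._ask(description)
            except OperatorAbort:
                # The pending action resolves cleanly instead of hanging.
                return "cancelled"
            return "approved" if approved else "denied"
```

Every caller gets back exactly one of approved, denied, or cancelled, and the lock guarantees the operator is never shown two questions at once, even if the agent issues several mutating actions concurrently.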

This is what "the operator stays in command" means in practice. Autonomy runs right up to the edge of consequence, and then it asks.

The signed audit trail

Governance you cannot prove after the fact is not governance an enterprise can defend in an audit. So every interesting event in a Sia session is written to an append-only audit log. Session start and end. Every prompt the operator submits. Every policy decision the gate makes. Every approval request and the operator's answer. Every tool call. Every skill invocation. Every command.

Two things make this more than a log file.

It is signed. Each entry is signed with HMAC-SHA256 over a canonicalized, key-sorted JSON form, using a per-install key generated on first use and stored at 0600. Canonicalization matters, because it makes the signature stable regardless of key order, so verification is deterministic. The practical result is that the log is tamper-evident. You cannot quietly edit a past entry without breaking its signature.
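Here is how that signing scheme looks in a few lines of Python, using only the standard library. Treat it as a sketch of the technique (HMAC-SHA256 over key-sorted JSON, key file created at 0600), not as Sia's code. The function names, the 32-byte key length, the hex encoding, and the exact JSON separators are assumptions.

```python
import hashlib
import hmac
import json
import os
import secrets


def canonical(entry: dict) -> bytes:
    """Key-sorted, whitespace-free JSON so equal entries yield equal bytes."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(entry: dict, key: bytes) -> str:
    return hmac.new(key, canonical(entry), hashlib.sha256).hexdigest()


def verify(entry: dict, signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking how many characters matched.
    return hmac.compare_digest(sign(entry, key), signature)


def load_or_create_key(path: str) -> bytes:
    """Generate the per-install key on first use, readable by the owner only."""
    if not os.path.exists(path):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as f:
            f.write(secrets.token_bytes(32))
    with open(path, "rb") as f:
        return f.read()
```

Two dictionaries with the same fields in a different order sign identically, and changing any value afterwards makes verify return False. That is the tamper-evidence property in miniature.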

It exports as evidence, and re-verifies on the way out. Running sia audit --since 7d --session <id> --to evidence.jsonl walks the log, filters by time window and session, and re-checks every signature as it exports. It reports how many entries were scanned, how many were written, and how many failed verification, and it exits non-zero if any did. That last part is the one your compliance team cares about. The export is not a trust-me dump. It is a verified extract. For SOC 2, ISO 27001, or any internal review, the session is its own evidence.

Read-only-by-contract subagents

A lot of operations work is investigation, the part that touches the most systems and reads the most state, and so the part where an over-eager agent could do the most accidental harm. Our answer is to hand that work to subagents that physically cannot change anything.

These are not subagents that have been told to be careful. They are built without mutating tools at all. A couple that ship today: a pre-check subagent that runs before a change to gather current state, dependencies, recent activity, and any open issues on the target, then returns a structured report with a clear proceed, pause, or block recommendation. And a compliance subagent that walks a target against a baseline (CIS, PCI-DSS, a SOC 2 control list, or an org policy file), classifies each control, and returns a ranked findings report without remediating anything. The pattern holds whether the target is a network device, a Linux host, a storage system, or a cloud account. The investigation that touches everything is done by an agent that can change nothing. The decision to act stays on the main thread, behind the gate.
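The difference between "told to be careful" and "built without mutating tools" shows up clearly in code. In this Python sketch the subagent's toolbox is filtered at construction time, so a mutating tool is simply absent. Tool, ReadOnlySubagent, and the mutating flag are hypothetical names for illustration, not Sia's internals.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[..., object]
    mutating: bool


class ReadOnlySubagent:
    def __init__(self, tools):
        # Mutating tools are dropped here, so there is nothing to misuse later.
        self._tools = {t.name: t for t in tools if not t.mutating}

    def tool_names(self):
        return sorted(self._tools)

    def call(self, name: str, *args):
        if name not in self._tools:
            raise LookupError(f"tool {name!r} is not available to this subagent")
        return self._tools[name].run(*args)
```

The enforcement is structural. No instruction, however persuasive, can make the subagent call a tool that was never handed to it.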

The curated-only skill supply chain

An agent's knowledge is part of its risk surface, as much as its tools are. A skill is a packaged, multi-step procedure the agent can run. It is useful because it lets you encode your own runbooks, like a config change with a mandatory diff preview and a post-change verify, or a disciplined incident triage that caps itself at a handful of tool calls, and have the agent run them the same way every time. Skills ship with Sia today in a standard SKILL.md format, and operators can author their own.

But the moment an agent can dynamically pull in a skill, you have a supply-chain question. Where did that skill come from, and who reviewed it? An agent loading an arbitrary skill off the open internet is a prompt-injection and malicious-payload vector aimed straight at your production estate.

Our answer, a launch capability we are finalizing now, is a closed, curated supply chain. The agent can dynamically find and load skills from exactly two places. Skills bundled into the binary, reviewed at build time, and a Scogo-curated source that is pinned and integrity-verified. The trust decision happens once, at sync time, against a known, vetted source. Not per-use, and never against the open web. After sync, the agent can search and load curated skills with no friction, because the bytes were already vetted, and every load is recorded in the audit trail. There is deliberately no code path by which the agent reaches an un-curated skill.

The principle is worth stating plainly. Curation is the process control. Pinning and hash verification are the technical control behind it. Together they make "only Scogo-approved skill bytes ever run on your estate" an enforced, auditable fact instead of a promise. I am describing the model and direction here, not committing to specific internals or dates while it is being finalized.
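For readers who want a picture of what pinning and hash verification mean in general, here is a generic Python sketch. It is not a description of the curated-skill internals, which are still being finalized. The manifest shape, the use of SHA-256, and the verify_skill name are all assumptions.

```python
import hashlib
import hmac


def verify_skill(name: str, data: bytes, pinned: dict) -> bytes:
    """Return the skill bytes only if they match the digest pinned for `name`.

    `pinned` is a hypothetical manifest mapping skill name -> hex SHA-256.
    """
    expected = pinned.get(name)
    if expected is None:
        # No entry in the manifest means no way to load: there is no fallback.
        raise PermissionError(f"skill {name!r} is not in the curated manifest")
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, expected):
        raise ValueError(f"skill {name!r} does not match its pinned digest")
    return data
```

The check runs against bytes, not names or URLs, which is what turns "approved skill" from a label into something verifiable.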

The integration layer that puts this in front of your estate

None of the above matters if the agent can only act on the local box. The reason Sia is useful across a real IT estate is that it connects to the systems you already run. It speaks the Model Context Protocol natively, over stdio, HTTP, or SSE. And it is the terminal surface for SIA, which reaches the wider estate through Scogo's Agent Fabric, the layer that connects to cloud, storage, virtualization, networking, security, service desk, databases, and observability.

Two things make this enterprise-grade rather than a demo. First, you can validate connectivity before you ever open a session. The sia mcp check command probes every configured server, reports status and tool count, and exits non-zero on failure, so it drops into CI or a pre-deploy step. Second, every tool a connected system exposes runs through the same mutation gate. An integration does not get a governance exemption because it is external. A write against a connected cloud account or storage system is gated exactly like a local one.
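Because the check signals failure through its exit code, wiring it into a pre-deploy step takes only a few lines. The Python sketch below relies on nothing beyond that documented behavior. The connectivity_ok helper is hypothetical, and it assumes the sia binary is on PATH.

```python
import subprocess
import sys


def connectivity_ok(argv=("sia", "mcp", "check")) -> bool:
    """Run the connectivity check and report whether it exited with status 0."""
    result = subprocess.run(list(argv), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Fail the pipeline step when any configured server is unreachable.
    sys.exit(0 if connectivity_ok() else 1)
```

The command is passed as a parameter so the same wrapper can be exercised in tests with a stand-in process.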

That is the join between reach and governance. You can connect Sia to anything across the estate, and everything you connect inherits the gate, the approval flow, and the audit trail automatically.

The whole shape

Step back and it is one coherent stance. The agent leans toward reversible action and careful investigation. A mutation gate gives every action a verdict. A human approves anything that changes shared state. Read-only subagents do the risky exploration. The skills that encode operational knowledge come only from a vetted source. Everything the agent can reach, local or across the estate, passes through the same controls. And all of it lands, signed and tamper-evident, in a log you can export and verify.

That is governed autonomy. Not an agent you hope behaves, and not a suggestion engine that leaves the value on the table. An agent that can genuinely act, across your estate, inside a boundary you set and can prove.

Where this is going

We built Sia CLI for the people who keep production running and the leaders accountable for it. The mission is the same one I opened with. An IT-operations agent that reaches the whole estate, in the terminal your team already uses, autonomous where it is safe, governed where it matters, on the record everywhere.

Sia CLI is part of the Scogo platform we deliver to our enterprise customers. If you are running a large IT estate and the gap between "the systems are programmable" and "the work has not changed" is one you feel every week, that is the conversation I want to have.

Sia CLI is the command-line interface for the Scogo AI platform, the terminal surface for SIA. Available to Scogo enterprise customers.