The approval design post built the model. The refund walkthrough showed it on real screens. Both rest on three hard-won properties: the approval survives a restart, what was reviewed is what runs, and the requester cannot approve. This post explains how those are configured — and where they break down.
An approval that survives the wait
The most common mistake is holding pending approvals in memory. High-risk approvals can sit for hours, even days. If the pending approval dies with the worker process, whether through a deploy, a crash, or a reviewer who takes longer than the process lives, the control fails.
The dangerous gap is between approval and execution. A durable workflow holds the pending action across that gap.
The decisive check runs on resume, not on click. Token status, scope, policy, and the target object version can all shift between approval and execution. A stale approval looks controlled but no longer matches reality. That is why approvals should be short-lived. A 24-hour expiry is not bureaucracy. It limits how far reality can drift. The same configuration supports per-action overrides. A credit and a refund under one plugin can carry different expiries and reviewer rules. No separate integrations required.
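A minimal sketch of the resume-time check, under stated assumptions: the `Approval` record, the `revalidate` function, and the override table are hypothetical names, not the API of any particular workflow engine. The 24-hour default mirrors the expiry mentioned above, and the 4-hour refund override is purely illustrative. Only expiry, target version, and token status are shown; scope and policy checks would slot in the same way.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical configuration: one default expiry plus per-action overrides.
DEFAULT_EXPIRY = timedelta(hours=24)
EXPIRY_OVERRIDES = {"refund": timedelta(hours=4)}  # illustrative value only


@dataclass(frozen=True)
class Approval:
    action: str            # e.g. "refund" or "credit"
    approved_at: datetime  # when the reviewer approved
    target_version: int    # version of the target object the reviewer saw


def revalidate(approval: Approval, current_version: int,
               token_active: bool, now: datetime) -> list[str]:
    """Return the reasons the approval is stale; an empty list means proceed."""
    failures = []
    expiry = EXPIRY_OVERRIDES.get(approval.action, DEFAULT_EXPIRY)
    if now - approval.approved_at > expiry:
        failures.append("expired")
    if current_version != approval.target_version:
        failures.append("target_changed")
    if not token_active:
        failures.append("token_revoked")
    return failures
```

The function runs when the workflow resumes, immediately before the external effect. Returning every failure reason, rather than stopping at the first, keeps revalidation failures easy to count later.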
The requester cannot approve their own request
The rule should be enforced by the system, not remembered by the reviewer. In Cedar, it takes a few lines:
// Deny approve_refund whenever the approver is the person who made the request.
forbid(
    principal,
    action == Action::"approve_refund",
    resource
)
when {
    principal == resource.requestedBy
};
A system that cannot express "the person who proposed this cannot approve it" does not have two-person review. It has a confirmation dialog with extra steps. The same policy layer can also require a specific role when an amount crosses a threshold, and it can deny entire classes of requests before any human sees them. Policy is the cheap, deterministic filter. Run it before the expensive one, human review.
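The ordering can be sketched in a few lines of Python. This is an illustration only, not a replacement for a policy engine. The threshold, the role names, the blocked action, and the `Request` shape are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values, chosen only for illustration.
SENIOR_ROLE_THRESHOLD_CENTS = 50_000    # 500.00 in the account currency
BLOCKED_ACTIONS = {"delete_account"}    # denied before any human sees them


@dataclass(frozen=True)
class Request:
    action: str
    amount_cents: int
    requested_by: str


def pre_screen(req: Request) -> Optional[str]:
    """Return the role required to review, or None if policy denies outright."""
    if req.action in BLOCKED_ACTIONS:
        return None
    if req.amount_cents >= SENIOR_ROLE_THRESHOLD_CENTS:
        return "senior_reviewer"
    return "reviewer"


def may_approve(req: Request, approver: str, approver_roles: set[str]) -> bool:
    """Deterministic checks that run before a human decision is accepted."""
    required = pre_screen(req)
    if required is None:
        return False
    if approver == req.requested_by:  # same rule as the Cedar policy above
        return False
    return required in approver_roles
```

Because `pre_screen` runs first, a denied request never reaches a review queue, and no reviewer attention is spent on it.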
An action you have not named is an action you cannot gate
Everything above assumes the action exists as a named, controlled integration. Most do not ship that way. Plugins are how the action surface grows. Each one adds an external effect — a refund, a customer message, an access change — behind the same role, policy, and approval checks. Because a plugin is a contract rather than scattered code, a new high-risk action can be added by you, your technical team, or a coding agent working to that contract. It inherits the two-person review model the moment it is registered. The detail of writing one — the auth modes, the health check, the request and response shapes — is its own subject. The point here is that extensibility and control are not in tension when the extension point is the action itself. The control does not care which model proposed the action, or whether it runs in your cloud or on-prem. The agent, the integration, and the deployment can each change without touching the control.
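One way to picture "the extension point is the action itself" is a registry where the gate lives in the execution path, not in each plugin. The sketch below is an assumption-laden illustration. `ActionRegistry`, its methods, and the handler signature are hypothetical and say nothing about how a real plugin contract handles auth modes or health checks.

```python
from typing import Callable, Dict, Optional

# Hypothetical handler shape: takes a payload, returns a result string.
Handler = Callable[[dict], str]


class ActionRegistry:
    """Every registered action passes through the same gate in execute()."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._handlers[name] = handler

    def execute(self, name: str, payload: dict,
                requested_by: str, approved_by: Optional[str]) -> str:
        # An action that was never named cannot run, so it cannot bypass the gate.
        if name not in self._handlers:
            raise KeyError(f"unregistered action: {name}")
        # The two-person rule applies to every action, old or newly added.
        if approved_by is None or approved_by == requested_by:
            raise PermissionError("a second person must approve this action")
        return self._handlers[name](payload)
```

A newly registered handler contains no approval logic of its own, which is what lets the agent, the integration, and the deployment change without touching the control.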
Measure the gate, not the latency
Most teams track approval latency and approval rate and stop there. Neither tells you whether the control works. Track these instead:
- high-impact actions that skipped the right checkpoint
- reviewer acceptance rate (a fatigue signal)
- edit and reject rate (proof the review is real)
- revalidation failures on resume
- approved actions that still caused harm
The two that matter most: the escape rate (high-impact actions that reach execution without the right checkpoint) and the human cost per thousand reviewed actions. Everything else explains why they moved.
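A rough sketch of how the two headline numbers could be computed from an action log. The definitions here are assumptions, not the only reasonable ones. Escape rate is taken as the share of high-impact actions that did not pass the checkpoint, and human cost as reviewer minutes per thousand reviewed actions. The `ActionRecord` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRecord:
    high_impact: bool
    passed_checkpoint: bool   # True if the action went through review
    reviewer_minutes: float   # time a human spent on it (0 if unreviewed)


def escape_rate(records: list[ActionRecord]) -> float:
    """Share of high-impact actions that skipped the checkpoint."""
    high = [r for r in records if r.high_impact]
    if not high:
        return 0.0
    return sum(1 for r in high if not r.passed_checkpoint) / len(high)


def reviewer_minutes_per_thousand(records: list[ActionRecord]) -> float:
    """Reviewer minutes spent per 1,000 reviewed actions."""
    reviewed = [r for r in records if r.passed_checkpoint]
    if not reviewed:
        return 0.0
    return sum(r.reviewer_minutes for r in reviewed) / len(reviewed) * 1000
```

Read together, the two numbers show the trade-off: pushing the escape rate down by reviewing more actions shows up directly as a higher human cost.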
What does not yet have a clean answer
This series kept to the tractable case: one action, one reviewer. That covers a lot of real risk, but three problems are still open.
First: multi-agent delegation. When one agent hands work to another, the original intent, authority boundary, and approval evidence must travel together. The handoff cannot create a confused deputy or a hidden escalation path. Most controls today are local to a single action.
Second: portable approval evidence. Tools, policy engines, workflow systems, and plugin transports each carry a fragment of approval meaning — who can approve, what can be edited, how long an approval stays valid, what must be revalidated. There is no shared standard to carry that meaning across tools.
Third: proving the reviewed action equals the executed action at scale. The simple version is a payload dialog. The hard version involves canonicalisation, hidden defaults, target-version drift, and side effects from steps that run before the checkpoint.
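The simple end of this problem can be sketched as a digest comparison: hash a canonical form of the payload at review time, then refuse to execute anything whose digest differs. The sketch below uses sorted-key JSON from Python's standard library as a stand-in for canonicalisation. It is not a full scheme such as RFC 8785, and the function names are hypothetical.

```python
import hashlib
import json


def payload_digest(payload: dict) -> str:
    """SHA-256 over a sorted-key, whitespace-free JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_reviewed(reviewed_digest: str, payload_to_execute: dict) -> bool:
    """True only if the payload about to run hashes to what was reviewed."""
    return payload_digest(payload_to_execute) == reviewed_digest
```

A default that gets filled in after review, for example a currency field added by a client library, changes the digest and is caught. A default applied inside the target system after the call is not, which is part of why the hard version stays open.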
The operating principle holds across all three posts. Shrink the surface first, gate what survives, and make every gated decision reconstructible. Give the reviewer the exact payload, keep the approval durable and short-lived, separate the requester from the approver, and keep the request, the decision, and the execution on one record. That is the difference between a control that holds up in an audit and a button that means nothing.