Governance Contracts¶
Status: Optional Extension
Prerequisites: Understanding of Reflection Blueprints, ACL Tiers, and Tripwires
Related: ACGP-1010 Specification, ACGP-1004 Blueprints
TL;DR (30 Seconds)¶
Governance Contracts let agents and stewards negotiate performance vs. quality trade-offs on a per-action basis. Agents classify actions by risk level (low_risk, elevated_risk, critical_risk), and stewards apply proportionate evaluation depth (Eval Tier 0-3) with explicit latency budgets and fallback behaviors.
Why it matters: Without contracts, all actions get the same governance depth. With contracts, routine actions get fast approval, while critical decisions get deep evaluation.
Blueprints Come First (Quick Reminder)¶
Governance contracts don't define policy; they define runtime constraints (budget, tier, fallback) on how a policy is applied.
- Start here for the policy mental model: Reflection Blueprints
- Authoritative schema/semantics: ACGP-1004 Blueprint Specification
The Problem: One-Size-Fits-All Governance¶
Traditional Approach¶
Without governance contracts, every action from an ACL-3 agent gets the same governance treatment, regardless of risk:
"What's the weather?" → 300ms governance evaluation
"Transfer $50,000" → 300ms governance evaluation
Issues:
- Slow routine actions: Simple queries suffer unnecessary latency
- Rushed critical decisions: High-stakes actions need deeper evaluation than time allows
- Fixed budgets: Can't adapt to action-specific risk profiles
The Solution: Risk-Based Contracts¶
Governance Contracts introduce per-action risk classification and negotiated evaluation depth.
Three Risk Levels¶
| Risk Level | Description | Default Budget | Eval Tier | Example Actions |
|---|---|---|---|---|
| low_risk | Routine, low-impact | 100ms | 0-1 | Read queries, status checks, safe searches |
| elevated_risk | Sensitive operations | 300ms | 1-2 | Write operations, API calls, data updates |
| critical_risk | High-stakes decisions | 5000ms (5s) | 2-3 | Financial transactions, access grants, irreversible changes |
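As a rough illustration, the table can be turned into a lookup that builds a starting contract for each risk level. The names `RISK_DEFAULTS` and `default_contract` are hypothetical, and choosing the lowest listed tier and a `deny` fallback as the starting point is an assumption made for this sketch, not something the specification requires.

```python
# Defaults copied from the risk-level table; the dict layout and helper
# name are illustrative only and are not defined by the specification.
RISK_DEFAULTS = {
    "low_risk":      {"latency_budget_ms": 100,  "eval_tiers": (0, 1)},
    "elevated_risk": {"latency_budget_ms": 300,  "eval_tiers": (1, 2)},
    "critical_risk": {"latency_budget_ms": 5000, "eval_tiers": (2, 3)},
}


def default_contract(risk_level, fallback_behavior="deny"):
    """Build a contract from the table's default budget and lowest listed tier."""
    if risk_level not in RISK_DEFAULTS:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    defaults = RISK_DEFAULTS[risk_level]
    return {
        "risk_level": risk_level,
        # Assumption: start at the shallowest tier the table lists for this level.
        "eval_tier": defaults["eval_tiers"][0],
        "performance_budget": {
            "latency_budget_ms": defaults["latency_budget_ms"],
            "fallback_behavior": fallback_behavior,
        },
    }
```

An agent could call `default_contract("elevated_risk")` and then adjust the tier or fallback for a specific action.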
Four Evaluation Tiers¶
| Tier | Strategy | Latency | Examples |
|---|---|---|---|
| 0 | In-memory rules | <100ms | Rate limits, regex patterns, hardcoded blocklists |
| 1 | DB/cache lookups | <300ms | Policy table queries, indexed tripwires, cached decisions |
| 2 | Model inference | <5s | LLM-based reasoning, semantic analysis, anomaly detection |
| 3 | Human review | On-demand | Manual approval, expert consultation, ethical review |
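One way to read the tier table is as a set of latency ceilings that a budget either can or cannot accommodate. The sketch below picks the deepest automated tier that fits a given budget. The function name is hypothetical, and treating Tier 3 as never auto-selected (because its latency is on-demand) is an assumption of this example.

```python
# Upper latency bounds from the evaluation-tier table, in milliseconds.
# Tier 3 (human review) is on-demand, so it has no fixed bound here.
TIER_LATENCY_CEILING_MS = {0: 100, 1: 300, 2: 5000, 3: None}


def deepest_tier_within_budget(budget_ms, max_tier=3):
    """Return the deepest tier whose latency ceiling fits in budget_ms, or None."""
    best = None
    for tier in range(max_tier + 1):
        ceiling = TIER_LATENCY_CEILING_MS[tier]
        # Tiers without a fixed ceiling are skipped rather than guessed at.
        if ceiling is not None and ceiling <= budget_ms:
            best = tier
    return best
```

With the default budgets from the risk-level table, this returns Tier 0 for 100ms, Tier 1 for 300ms, and Tier 2 for 5000ms.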
How It Works¶
1. Agent Declares Risk¶
# Agent classifies action before submission
governance_contract = {
    "risk_level": "low_risk",            # Routine read operation
    "eval_tier": 0,                      # In-memory rules only
    "performance_budget": {
        "latency_budget_ms": 100,        # Must respond in 100ms
        "fallback_behavior": "deny"      # Conservative fallback
    }
}
response = steward.evaluate(action, governance_contract)
2. Steward Applies Proportionate Evaluation¶
flowchart TD
A["Action: Get customer balance<br/>Risk: low_risk, Tier: 0"]
B["Steward: Eval-0 Only (in-memory)<br/>• Rate limit check (15ms)<br/>• Regex blocklist (5ms)<br/>Total: 20ms (budget: 100ms)"]
C["✓ ALLOW<br/>(75ms total E2E)"]
A --> B --> C
style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px
style C fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
3. Fallback on Timeout¶
If evaluation exceeds the budget, the steward applies the negotiated fallback behavior:
| Fallback | Behavior | Use Case |
|---|---|---|
| deny | Conservative block | Safety-critical systems where risk of bad action > risk of blocking good action |
| allow_and_log | Permissive + async audit | High-volume workflows where availability > perfect governance |
| cached_decision | Reuse similar past decision | Repetitive actions with stable risk profiles |
| escalate | Require human override | Ambiguous high-stakes decisions needing expert judgment |
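The timeout-then-fallback flow might look roughly like the sketch below, which runs an evaluator in a worker thread and applies the contract's fallback if the budget expires. Everything here is illustrative: `evaluate_with_budget`, the decision strings (`"ALLOW"`, `"DENY"`, `"ESCALATE"`), the shape of `cache`, and the choice to deny when no cached decision exists are assumptions, not the steward's actual interface.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def evaluate_with_budget(evaluator, action, contract, cache=None):
    """Run evaluator(action) under the contract's latency budget."""
    budget = contract["performance_budget"]
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(evaluator, action)
    try:
        decision = future.result(timeout=budget["latency_budget_ms"] / 1000)
        return {"decision": decision, "fallback_used": False}
    except FutureTimeout:
        fallback = budget["fallback_behavior"]
        if fallback == "allow_and_log":
            decision = "ALLOW"  # caller is expected to audit asynchronously
        elif fallback == "cached_decision":
            # Assumption: with no cached entry, fall back to a conservative deny.
            decision = (cache or {}).get(action, "DENY")
        elif fallback == "escalate":
            decision = "ESCALATE"
        else:  # "deny" and anything unrecognized
            decision = "DENY"
        return {"decision": decision, "fallback_used": True}
    finally:
        # Do not block on the overdue evaluation; it finishes in the background.
        pool.shutdown(wait=False)


def slow_evaluator(action):
    """Hypothetical evaluator that overruns a small budget."""
    time.sleep(0.3)
    return "ALLOW"
```

Calling `evaluate_with_budget(slow_evaluator, "refund", contract)` with a 20ms budget and a `deny` fallback yields a deny with `fallback_used` set, while a budget of a few seconds lets the evaluator's own decision through.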
Architecture Patterns¶
Stewards can adopt different architectures based on cost/latency/quality needs:
Pattern 1: Rule-Only (Tier 0-1)¶
Target: High-volume, low-cost, predictable latency
Cost: ~$500/month
Latency: 100-300ms
Trade-offs:
- Fast, cheap, deterministic
- No LLM reasoning, can't handle novel actions
Pattern 2: Hybrid (Tier 0-1 sync, Tier 2 async)¶
Target: Balanced performance and quality
Cost: ~$2,000/month
Latency: 100-300ms sync, 5s async audit
Trade-offs:
- Fast sync path, deep async reasoning
- Warning: risk window during async evaluation
Pattern 3: Max Quality (All Tiers Sync)¶
Target: Safety-critical, low-volume
Cost: ~$20,000/month
Latency: Up to 5s + human review
Trade-offs:
- Maximum safety and oversight
- High cost and latency
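To make the Hybrid pattern's risk window concrete, here is a minimal sketch in which Tier 0-1 checks run inline and the action is queued for a later Tier 2 audit. The `deque` stands in for whatever background job system a deployment actually uses, and `hybrid_evaluate`, `not_blocklisted`, and the decision strings are hypothetical names for illustration.

```python
from collections import deque

# Stand-in for a real background job queue; a deployment would use its own.
audit_queue = deque()


def hybrid_evaluate(action, sync_checks):
    """Run Tier 0-1 checks inline, then queue the action for a Tier 2 audit."""
    for check in sync_checks:
        if not check(action):
            return "DENY"
    # The action is allowed before the deep audit runs: this is the risk window.
    audit_queue.append(action)
    return "ALLOW"


def not_blocklisted(action):
    """Hypothetical Tier 0 rule: reject a hardcoded set of action names."""
    return action not in {"drop_database"}
```

Anything left in `audit_queue` has already been allowed, so a Hybrid deployment needs some way to flag or roll back actions that the async audit later rejects.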
Capability Negotiation¶
Agents and stewards negotiate capabilities during the SYNC handshake:
// Agent announces capabilities
{
  "type": "SYNC",
  "agent_id": "customer-bot",
  "capabilities": {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": true,
    "max_eval_tier": 2
  }
}
// Steward responds
{
  "type": "SYNC_ACK",
  "capabilities": {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": true,
    "available_eval_tiers": [0, 1, 2, 3]
  }
}
// Agreed: Use Tier 0-2 (agent's max)
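The agreed tier set in the exchange above can be derived by intersecting the steward's available tiers with the agent's maximum. The sketch below assumes the two messages are already parsed into dicts with the fields shown; `negotiate_eval_tiers` is a hypothetical helper, and returning `None` to mean "use default uniform evaluation" is an assumption of this example.

```python
def negotiate_eval_tiers(sync, sync_ack):
    """Return the sorted tiers both sides can use, or None if contracts are off."""
    agent = sync["capabilities"]
    steward = sync_ack["capabilities"]
    both_support = (
        agent.get("supports_governance_contracts", False)
        and steward.get("supports_governance_contracts", False)
    )
    if not both_support:
        # Without contract support on both sides, default behavior applies.
        return None
    max_tier = agent["max_eval_tier"]
    return sorted(t for t in set(steward["available_eval_tiers"]) if t <= max_tier)


sync = {
    "type": "SYNC",
    "agent_id": "customer-bot",
    "capabilities": {
        "protocol_version": "1.1.0",
        "supports_governance_contracts": True,
        "max_eval_tier": 2,
    },
}
sync_ack = {
    "type": "SYNC_ACK",
    "capabilities": {
        "protocol_version": "1.1.0",
        "supports_governance_contracts": True,
        "available_eval_tiers": [0, 1, 2, 3],
    },
}
agreed = negotiate_eval_tiers(sync, sync_ack)
```

For the messages shown, `agreed` is `[0, 1, 2]`, matching the "Tier 0-2 (agent's max)" outcome.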
Benefits¶
For Agents¶
- Faster routine actions: 100ms for simple reads vs. 300ms universal
- Deeper critical evaluation: 5s for financial transactions vs. 300ms universal
- Explicit performance contracts: Know exactly what latency to expect
- Graceful degradation: Fallback behaviors prevent silent failures
For Stewards¶
- Cost optimization: Expensive Tier 2/3 evaluation only for high-risk actions
- Flexible architecture: Choose pattern (Rule-Only, Hybrid, Max Quality) per deployment
- Better resource allocation: Model inference budget saved for critical decisions
- SLA compliance: Explicit budgets enable enforceable latency SLAs
For Organizations¶
- Risk-proportionate governance: Safety where it matters, speed where it doesn't
- Transparent trade-offs: Performance vs. quality decisions are explicit
- Audit trail: Every contract and fallback decision is logged
- Backward compatibility: Agents without contracts continue to work (no contracts = default behavior)
When to Use Governance Contracts¶
Use Governance Contracts When:¶
- Deploying agents with mixed action types (reads, writes, transactions)
- Operating in high-volume environments where latency matters (>100 actions/sec)
- Meeting strict SLAs (e.g., 95th percentile <200ms)
- Optimizing costs by reserving expensive evaluation for critical actions
- Requiring explicit fallback behaviors for regulatory compliance
[WARNING] Consider Alternatives When:¶
- All actions are similar risk: One eval tier may suffice
- Low volume (<10 actions/min): Contract negotiation overhead not worth it
- Uniform latency acceptable: Don't need per-action tuning
- Early development: Complexity of contracts may slow iteration
Getting Started with Governance Contracts¶
Phase 1: Add Capability Flag
# Announce support during SYNC
capabilities = {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": True
}
Phase 2: Classify High-Volume Actions
# Start with low-risk reads
if action.type == "read":
    contract = {
        "risk_level": "low_risk",
        "eval_tier": 0,
        "performance_budget": {"latency_budget_ms": 100}
    }
Phase 3: Add Critical Action Contracts
# Financial transactions get deep eval
if action.type == "transfer" and action.amount > 10000:
    contract = {
        "risk_level": "critical_risk",
        "eval_tier": 3,
        "performance_budget": {
            "latency_budget_ms": 10000,
            "fallback_behavior": "escalate"
        }
    }
Phase 4: Optimize Fallbacks
# Tune fallbacks based on production data
contract["performance_budget"]["fallback_behavior"] = (
    "cached_decision" if action.is_repetitive else "deny"
)
Security Considerations¶
Agent Risk Misclassification¶
Threat: Agent classifies high-risk action as low_risk to bypass governance
Mitigation:
- Stewards MUST validate risk classification against action semantics
- Tripwires (Tier 0) ALWAYS run regardless of contract
- Stewards MAY override agent's declared eval_tier if risk assessment disagrees
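A steward-side override could be sketched as taking the stricter of the agent's declared risk level and the steward's own assessment, then raising the tier to the minimum for that level. The minimum tiers below are the lower bounds from the risk-level table; `reconcile_contract` and the `overridden` flag are hypothetical, and how the steward arrives at `assessed_risk` is outside this sketch.

```python
RISK_ORDER = ["low_risk", "elevated_risk", "critical_risk"]

# Lowest eval tier listed for each risk level in the risk-level table.
MIN_TIER_FOR_RISK = {"low_risk": 0, "elevated_risk": 1, "critical_risk": 2}


def reconcile_contract(contract, assessed_risk):
    """Return a copy of the contract with risk and tier raised where needed."""
    # Take whichever risk level ranks higher; the agent can never lower it.
    risk = max(contract["risk_level"], assessed_risk, key=RISK_ORDER.index)
    tier = max(contract["eval_tier"], MIN_TIER_FOR_RISK[risk])
    overridden = risk != contract["risk_level"] or tier != contract["eval_tier"]
    return {**contract, "risk_level": risk, "eval_tier": tier, "overridden": overridden}
```

A contract declared as `low_risk` at Tier 0 for an action the steward assesses as `critical_risk` comes back as `critical_risk` at Tier 2 with `overridden` set, which a steward could also log for audit.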
Fallback Abuse¶
Threat: Agent sets ultra-low budgets to force permissive fallback
Mitigation:
- Stewards MUST enforce minimum budgets per tier (e.g., Tier 0 ≥50ms)
- Suspicious patterns (repeated timeouts) trigger FLAG interventions
- Trust debt accumulates for fallback overuse
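The first two mitigations might be approximated as a budget floor plus a per-agent timeout counter. Only the Tier 0 floor of 50ms comes from the text above; the other floors, the threshold of three timeouts, the choice to clamp rather than reject, and all function names are placeholders chosen for this sketch.

```python
from collections import Counter

# Tier 0 floor is from the text; the Tier 1 and Tier 2 values are assumed.
MIN_BUDGET_MS = {0: 50, 1: 100, 2: 1000}
TIMEOUT_FLAG_THRESHOLD = 3  # assumed number of timeouts before flagging

timeouts_by_agent = Counter()


def enforce_min_budget(contract):
    """Return the budget to use, clamped up to the tier's minimum."""
    floor = MIN_BUDGET_MS.get(contract["eval_tier"], 0)
    requested = contract["performance_budget"]["latency_budget_ms"]
    return max(requested, floor)


def record_timeout(agent_id):
    """Count a timeout and report whether the agent should be flagged."""
    timeouts_by_agent[agent_id] += 1
    return timeouts_by_agent[agent_id] >= TIMEOUT_FLAG_THRESHOLD
```

A steward could equally reject an under-budget contract outright instead of clamping it; the text only requires that the minimum be enforced.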
Capability Spoofing¶
Threat: Malicious agent claims max_eval_tier: 0 to avoid deep evaluation
Mitigation:
- ACL tier determines minimum eval tier requirements
- ACL-4+ MUST use Tier 2+ for critical actions regardless of contract
- Stewards MAY reject SYNC with insufficient capabilities
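The ACL-4+ rule can be expressed as a floor that the contract cannot lower. The sketch below encodes only that one rule from the text; returning 0 for every other combination is a placeholder rather than a statement about the real ACL policy, and both function names are hypothetical.

```python
def minimum_eval_tier(acl_tier, risk_level):
    """Return the lowest eval tier allowed for this ACL tier and risk level."""
    if acl_tier >= 4 and risk_level == "critical_risk":
        return 2
    # Placeholder: other ACL/risk combinations are not specified in this section.
    return 0


def accept_sync(acl_tier, agent_max_eval_tier):
    """One possible policy: reject agents that cannot reach their required tier."""
    return agent_max_eval_tier >= minimum_eval_tier(acl_tier, "critical_risk")
```

Under this policy an ACL-5 agent announcing `max_eval_tier: 0` would be rejected during SYNC, while an ACL-3 agent with the same capability would be accepted.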
Real-World Example¶
Customer Service Agent (Hybrid Pattern)¶
class CustomerServiceAgent:
    def handle_request(self, request):
        # Classify risk by request type
        if request.type == "get_balance":
            contract = self.low_risk_contract()        # Tier 0, 100ms
        elif request.type == "update_address":
            contract = self.elevated_risk_contract()   # Tier 1, 300ms
        elif request.type == "refund":
            contract = self.critical_risk_contract()   # Tier 2, 5s
        else:
            # Assumption: unrecognized request types get the strictest contract
            contract = self.critical_risk_contract()

        # Submit with contract
        response = self.steward.evaluate(request, contract)

        # Handle timeouts according to the negotiated fallback.
        # Contracts are dicts shaped like the earlier examples.
        if response.governance_status.timeout:
            fallback = contract["performance_budget"]["fallback_behavior"]
            if fallback == "escalate":
                return self.queue_for_human_review(request)
            elif fallback == "cached_decision":
                return self.use_cached_decision(request)
        return response
Results:
- 70% of actions (balance checks): <100ms (Tier 0)
- 25% of actions (updates): <300ms (Tier 1)
- 5% of actions (refunds): <2s (Tier 2, 99th percentile)
- Overall cost: 60% reduction vs. universal Tier 2 evaluation
- Overall latency (95th percentile): 200ms vs. 1000ms without contracts
Next Steps¶
- Read the spec: ACGP-1010: Governance Contracts
- Interactive tool: Latency Calculator
- Architecture reference: ACGP-1002 8.6: Architecture Patterns
FAQ¶
Q: Are governance contracts required?
A: No. They're an optional extension. Default behavior (uniform evaluation) still works.
Q: Can I use contracts with MCP/A2A integrations?
A: Yes, if both sides support governance contracts. Capability negotiation happens during SYNC.
Q: What if the steward doesn't support my requested tier?
A: The steward returns its available tiers in SYNC_ACK. The agent must adapt or fail gracefully.
Q: Do contracts override ACL tier rules?
A: No. Contracts work within ACL constraints. An ACL-5 agent can't request Tier 0 for critical actions.
Q: How do I measure if contracts are helping?
A: Monitor governance_status.actual_latency_ms and fallback_used fields in ReflectionDB audit logs.
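As a starting point for that monitoring, the sketch below computes a 95th-percentile latency and a fallback rate from a batch of audit records. It assumes the records have already been exported as flat dicts carrying `actual_latency_ms` and `fallback_used`; the real ReflectionDB record layout and query interface are not shown here, and `summarize_audit_records` is a hypothetical helper.

```python
def summarize_audit_records(records):
    """Return nearest-rank p95 latency and fallback rate for a batch of records."""
    if not records:
        raise ValueError("no audit records to summarize")
    latencies = sorted(r["actual_latency_ms"] for r in records)
    # Nearest-rank percentile: rank = ceil(0.95 * n), using integer arithmetic.
    rank = (95 * len(latencies) + 99) // 100
    fallback_count = sum(1 for r in records if r["fallback_used"])
    return {
        "p95_latency_ms": latencies[rank - 1],
        "fallback_rate": fallback_count / len(records),
    }
```

Comparing these two numbers before and after enabling contracts, ideally split by risk level, gives a first indication of whether budgets and fallbacks are tuned sensibly.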