Governance Contracts

Status: Optional Extension
Prerequisites: Understanding of Reflection Blueprints, ACL Tiers, and Tripwires
Related: ACGP-1010 Specification, ACGP-1004 Blueprints


TL;DR (30 Seconds)

Governance Contracts let agents and stewards negotiate performance vs. quality trade-offs on a per-action basis. Agents classify actions by risk level (low_risk, elevated_risk, critical_risk), and stewards apply proportionate evaluation depth (Eval Tier 0-3) with explicit latency budgets and fallback behaviors.

Why it matters: Without contracts, all actions get the same governance depth. With contracts, routine actions get fast approval, while critical decisions get deep evaluation.


Blueprints Come First (Quick Reminder)

Governance contracts do not define policy. They define runtime constraints (latency budget, evaluation tier, fallback behavior) on how a policy is applied.


The Problem: One-Size-Fits-All Governance

Traditional Approach

Without governance contracts, every action from an ACL-3 agent gets the same governance treatment, regardless of risk:

  "What's the weather?" → 300ms governance evaluation
 "Transfer $50,000" → 300ms governance evaluation

Issues:

  • Slow routine actions: Simple queries suffer unnecessary latency
  • Rushed critical decisions: High-stakes actions need deeper evaluation than time allows
  • Fixed budgets: Can't adapt to action-specific risk profiles


The Solution: Risk-Based Contracts

Governance Contracts introduce per-action risk classification and negotiated evaluation depth.

Three Risk Levels

| Risk Level | Description | Default Budget | Eval Tier | Example Actions |
|---|---|---|---|---|
| low_risk | Routine, low-impact | 100ms | 0-1 | Read queries, status checks, safe searches |
| elevated_risk | Sensitive operations | 300ms | 1-2 | Write operations, API calls, data updates |
| critical_risk | High-stakes decisions | 5000ms (5s) | 2-3 | Financial transactions, access grants, irreversible changes |
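
The risk-level table can be read as a lookup from risk level to a default budget and a permitted tier range. The following is a minimal sketch under that reading; `RISK_DEFAULTS` and `validate_contract` are hypothetical names for illustration, not part of the ACGP specification, and the contract shape follows the examples in this guide.

```python
# Defaults transcribed from the risk-level table. The names RISK_DEFAULTS and
# validate_contract are illustrative assumptions, not ACGP-defined identifiers.
RISK_DEFAULTS = {
    "low_risk": {"budget_ms": 100, "tiers": (0, 1)},
    "elevated_risk": {"budget_ms": 300, "tiers": (1, 2)},
    "critical_risk": {"budget_ms": 5000, "tiers": (2, 3)},
}


def validate_contract(contract):
    """Return a list of problems; an empty list means the contract fits the table."""
    defaults = RISK_DEFAULTS.get(contract.get("risk_level"))
    if defaults is None:
        return [f"unknown risk_level: {contract.get('risk_level')!r}"]
    low, high = defaults["tiers"]
    tier = contract.get("eval_tier")
    if tier is None or not low <= tier <= high:
        return [f"eval_tier {tier} outside {low}-{high}"]
    return []
```

A steward could run a check like this before evaluation and reject or re-tier contracts that fall outside the table, though how strictly the ranges are enforced is a deployment decision.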

Four Evaluation Tiers

| Tier | Strategy | Latency | Examples |
|---|---|---|---|
| 0 | In-memory rules | <100ms | Rate limits, regex patterns, hardcoded blocklists |
| 1 | DB/cache lookups | <300ms | Policy table queries, indexed tripwires, cached decisions |
| 2 | Model inference | <5s | LLM-based reasoning, semantic analysis, anomaly detection |
| 3 | Human review | On-demand | Manual approval, expert consultation, ethical review |

How It Works

1. Agent Declares Risk

# Agent classifies action before submission
governance_contract = {
    "risk_level": "low_risk",     # Routine read operation
    "eval_tier": 0,                # In-memory rules only
    "performance_budget": {
        "latency_budget_ms": 100,  # Must respond in 100ms
        "fallback_behavior": "deny" # Conservative fallback
    }
}

response = steward.evaluate(action, governance_contract)

2. Steward Applies Proportionate Evaluation

flowchart TD
    A["Action: Get customer balance<br/>Risk: low_risk, Tier: 0"]
    B["Steward: Eval-0 Only (in-memory)<br/>• Rate limit check (15ms)<br/>• Regex blocklist (5ms)<br/>Total: 20ms (budget: 100ms)"]
    C["✓ ALLOW<br/>(75ms total E2E)"]

    A --> B --> C

    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style C fill:#e8f5e9,stroke:#388e3c,stroke-width:2px

3. Fallback on Timeout

If evaluation exceeds the budget, the steward applies the negotiated fallback behavior:

| Fallback | Behavior | Use Case |
|---|---|---|
| deny | Conservative block | Safety-critical systems where risk of bad action > risk of blocking good action |
| allow_and_log | Permissive + async audit | High-volume workflows where availability > perfect governance |
| cached_decision | Reuse similar past decision | Repetitive actions with stable risk profiles |
| escalate | Require human override | Ambiguous high-stakes decisions needing expert judgment |
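
To make the timeout path concrete, here is a minimal sketch of a steward running a check under a latency budget and applying the negotiated fallback when the budget is exceeded. The function name `evaluate_with_budget`, the decision strings (`"ALLOW"`, `"DENY"`, `"ESCALATE"`), and the result fields are assumptions for illustration; a real steward may structure this quite differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def evaluate_with_budget(check, budget_ms, fallback_behavior, cached=None):
    """Run check() under a latency budget; apply the fallback on timeout."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        decision = future.result(timeout=budget_ms / 1000)
        return {"decision": decision, "timeout": False, "fallback_used": None}
    except FutureTimeout:
        if fallback_behavior == "allow_and_log":
            decision = "ALLOW"      # caller is expected to audit asynchronously
        elif fallback_behavior == "cached_decision" and cached is not None:
            decision = cached       # reuse a similar past decision
        elif fallback_behavior == "escalate":
            decision = "ESCALATE"   # hand off to a human reviewer
        else:
            decision = "DENY"       # "deny", or no cached decision available
        return {"decision": decision, "timeout": True,
                "fallback_used": fallback_behavior}
    finally:
        # Do not block on the slow check; its thread finishes in the background.
        pool.shutdown(wait=False)


def slow_check():
    time.sleep(0.2)  # simulates an evaluation that overruns its budget
    return "ALLOW"


result = evaluate_with_budget(slow_check, budget_ms=20, fallback_behavior="deny")
print(result["decision"], result["fallback_used"])  # DENY deny
```

Treating a missing cached decision as a deny is a conservative choice made for this sketch; the source does not say what should happen in that case.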

Architecture Patterns

Stewards can adopt different architectures based on cost/latency/quality needs:

Pattern 1: Rule-Only (Tier 0-1)

Target: High-volume, low-cost, predictable latency
Cost: ~$500/month
Latency: 100-300ms

Agent → [Tier 0: Rules] → [Tier 1: Cache] → DECISION
         <50ms            <250ms

Trade-offs:

  • Fast, cheap, deterministic
  • No LLM reasoning, can't handle novel actions

Pattern 2: Hybrid (Tier 0-1 sync, Tier 2 async)

Target: Balanced performance and quality
Cost: ~$2,000/month
Latency: 100-300ms sync, 5s async audit

Agent → [Tier 0/1] → DECISION (fast)
    [Tier 2 async] → Post-action review

Trade-offs:

  • Fast sync path, deep async reasoning
  • [WARNING] Risk window during async evaluation
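
The hybrid pattern's risk window is easiest to see in code. The sketch below assumes hypothetical helpers `hybrid_evaluate` and `drain_reviews` and uses the standard-library `queue.Queue` as a stand-in for whatever async audit pipeline a deployment actually uses.

```python
import queue


def hybrid_evaluate(action, sync_check, review_queue):
    """Decide on the fast Tier 0/1 path; queue allowed actions for Tier 2 review."""
    decision = sync_check(action)
    if decision == "ALLOW":
        # The action proceeds before Tier 2 has looked at it: this is the
        # risk window noted in the trade-offs.
        review_queue.put(action)
    return decision


def drain_reviews(review_queue, deep_check):
    """Run the slow Tier 2 check on queued actions; return those it rejects."""
    rejected = []
    while not review_queue.empty():
        action = review_queue.get()
        if deep_check(action) != "ALLOW":
            rejected.append(action)
    return rejected
```

Anything returned by `drain_reviews` has already executed, so a deployment using this pattern would need some compensating step (rollback, alert, trust adjustment) that this sketch does not model.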

Pattern 3: Max Quality (All Tiers Sync)

Target: Safety-critical, low-volume
Cost: ~$20,000/month
Latency: Up to 5s + human review

Agent → [Tier 0] → [Tier 1] → [Tier 2] → [Tier 3] → DECISION
         <50ms      <250ms      <4.5s      on-demand

Trade-offs:

  • Maximum safety and oversight
  • High cost and latency
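
The synchronous chains in Patterns 1 and 3 can be sketched as a pipeline of tier checks run in order. This sketch assumes that any tier may deny and stop the chain, and that an action is allowed only if every configured tier allows it; the source does not spell out these semantics, and `run_tiers` and the example checks are hypothetical.

```python
def run_tiers(action, tiers):
    """Run (tier_number, check) pairs in order; the first DENY stops the chain."""
    for tier, check in tiers:
        if check(action) == "DENY":
            return {"decision": "DENY", "stopped_at_tier": tier}
    return {"decision": "ALLOW", "stopped_at_tier": None}


# Illustrative checks; real Tier 0/1 logic would come from the blueprint.
BLOCKLIST = {"drop_database"}


def tier0_rules(action):
    return "DENY" if action["name"] in BLOCKLIST else "ALLOW"


def tier1_policy(action):
    return "DENY" if action.get("amount", 0) > 10000 else "ALLOW"


rule_only = [(0, tier0_rules), (1, tier1_policy)]  # Pattern 1 chain
print(run_tiers({"name": "get_balance"}, rule_only))
```

A Max Quality deployment would extend the list with Tier 2 and Tier 3 entries; cheap checks run first so that expensive tiers only see actions the earlier tiers did not already block.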


Capability Negotiation

Agents and stewards negotiate capabilities during the SYNC handshake:

// Agent announces capabilities
{
  "type": "SYNC",
  "agent_id": "customer-bot",
  "capabilities": {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": true,
    "max_eval_tier": 2
  }
}

// Steward responds
{
  "type": "SYNC_ACK",
  "capabilities": {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": true,
    "available_eval_tiers": [0, 1, 2, 3]
  }
}

// Agreed: Use Tier 0-2 (agent's max)
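
The agreed tier set in this handshake can be computed as the steward's available tiers capped at the agent's maximum. A minimal sketch, assuming a hypothetical `negotiate_tiers` helper and treating an empty result as "fall back to default behavior":

```python
def negotiate_tiers(sync, sync_ack):
    """Return the eval tiers both sides can use, or [] if contracts are unsupported."""
    agent_caps = sync["capabilities"]
    steward_caps = sync_ack["capabilities"]
    if not (agent_caps.get("supports_governance_contracts")
            and steward_caps.get("supports_governance_contracts")):
        return []  # no contracts: uniform default evaluation applies
    max_tier = agent_caps["max_eval_tier"]
    return [t for t in sorted(steward_caps["available_eval_tiers"]) if t <= max_tier]


sync = {"type": "SYNC", "agent_id": "customer-bot",
        "capabilities": {"protocol_version": "1.1.0",
                         "supports_governance_contracts": True,
                         "max_eval_tier": 2}}
sync_ack = {"type": "SYNC_ACK",
            "capabilities": {"protocol_version": "1.1.0",
                             "supports_governance_contracts": True,
                             "available_eval_tiers": [0, 1, 2, 3]}}
print(negotiate_tiers(sync, sync_ack))  # [0, 1, 2]
```

Protocol version compatibility is not checked here; how versions are compared is left to the specification.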

Benefits

For Agents

Faster routine actions: 100ms for simple reads vs. 300ms universal
Deeper critical evaluation: 5s for financial transactions vs. 300ms universal
Explicit performance contracts: Know exactly what latency to expect
Graceful degradation: Fallback behaviors prevent silent failures

For Stewards

Cost optimization: Expensive Tier 2/3 only for high-risk actions
Flexible architecture: Choose pattern (Rule-Only, Hybrid, Max Quality) per deployment
Better resource allocation: Model inference budget saved for critical decisions
SLA compliance: Explicit budgets enable enforceable latency SLAs

For Organizations

Risk-proportionate governance: Safety where it matters, speed where it doesn't
Transparent trade-offs: Performance vs. quality decisions are explicit
Audit trail: Every contract and fallback decision is logged
Backward compatibility: Agents without contracts continue to work (no contracts = default behavior)


When to Use Governance Contracts

Use Governance Contracts When:

  • Deploying agents with mixed action types (reads, writes, transactions)
  • Operating in high-volume environments where latency matters (>100 actions/sec)
  • Meeting strict SLAs (e.g., 95th percentile <200ms)
  • Optimizing costs by reserving expensive evaluation for critical actions
  • Requiring explicit fallback behaviors for regulatory compliance

[WARNING] Consider Alternatives When:

  • All actions are similar risk: One eval tier may suffice
  • Low volume (<10 actions/min): Contract negotiation overhead not worth it
  • Uniform latency acceptable: Don't need per-action tuning
  • Early development: Complexity of contracts may slow iteration

Getting Started with Governance Contracts

Phase 1: Add Capability Flag

# Announce support during SYNC
capabilities = {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": True
}

Phase 2: Classify High-Volume Actions

# Start with low-risk reads
if action.type == "read":
    contract = {
        "risk_level": "low_risk",
        "eval_tier": 0,
        "performance_budget": {"latency_budget_ms": 100}
    }

Phase 3: Add Critical Action Contracts

# Financial transactions get deep eval
if action.type == "transfer" and action.amount > 10000:
    contract = {
        "risk_level": "critical_risk",
        "eval_tier": 3,
        "performance_budget": {
            "latency_budget_ms": 10000,
            "fallback_behavior": "escalate"
        }
    }

Phase 4: Optimize Fallbacks

# Tune fallbacks based on production data
contract["performance_budget"]["fallback_behavior"] = (
    "cached_decision" if action.is_repetitive else "deny"
)


Security Considerations

Agent Risk Misclassification

Threat: Agent classifies high-risk action as low_risk to bypass governance

Mitigation:

  • Stewards MUST validate risk classification against action semantics
  • Tripwires (Tier 0) ALWAYS run regardless of contract
  • Stewards MAY override agent's declared eval_tier if risk assessment disagrees

Fallback Abuse

Threat: Agent sets ultra-low budgets to force permissive fallback

Mitigation:

  • Stewards MUST enforce minimum budgets per tier (e.g., Tier 0 ≥50ms)
  • Suspicious patterns (repeated timeouts) trigger FLAG interventions
  • Trust debt accumulates for fallback overuse
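
A steward-side check for fallback abuse might look like the sketch below. Only the Tier 0 floor of 50ms comes from the text above; the other floors, the timeout threshold, and the names `check_budget` and `should_flag` are placeholder assumptions.

```python
# Only the Tier 0 floor (50ms) appears in the text; the other floors are
# placeholder values chosen for illustration.
MIN_BUDGET_MS = {0: 50, 1: 150, 2: 1000}
TIMEOUT_FLAG_THRESHOLD = 3  # hypothetical: timeouts in a window before a FLAG


def check_budget(contract):
    """Return True if the requested budget meets the floor for its tier."""
    floor = MIN_BUDGET_MS.get(contract["eval_tier"], 0)
    return contract["performance_budget"]["latency_budget_ms"] >= floor


def should_flag(recent_timeouts):
    """Return True when repeated timeouts look like fallback abuse."""
    return recent_timeouts >= TIMEOUT_FLAG_THRESHOLD
```

This sketch rejects under-budget contracts outright; clamping the budget up to the floor would be an equally plausible reading of "enforce".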

Capability Spoofing

Threat: Malicious agent claims max_eval_tier: 0 to avoid deep evaluation

Mitigation:

  • ACL tier determines minimum eval tier requirements
  • ACL-4+ MUST use Tier 2+ for critical actions regardless of contract
  • Stewards MAY reject SYNC with insufficient capabilities
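
The override rules from the misclassification and capability-spoofing mitigations can be combined into one "effective tier" calculation. In this sketch the steward uses its own risk assessment, takes the lower bound of the risk-level table as a floor, and applies the ACL-4+ rule on top; `effective_eval_tier` and `RISK_MIN_TIER` are hypothetical names, and using the table's lower bound as a hard floor is an assumption.

```python
# Lower bounds of the Eval Tier column in the risk-level table.
RISK_MIN_TIER = {"low_risk": 0, "elevated_risk": 1, "critical_risk": 2}


def effective_eval_tier(declared_tier, assessed_risk, acl_tier):
    """Return the tier the steward actually applies, never below the floors."""
    floor = RISK_MIN_TIER[assessed_risk]
    # ACL-4+ MUST use Tier 2+ for critical actions regardless of contract.
    # With the table values above this coincides with the risk floor, but it
    # is kept separate because it is a hard requirement, not a default.
    if acl_tier >= 4 and assessed_risk == "critical_risk":
        floor = max(floor, 2)
    return max(declared_tier, floor)


# An agent declares Tier 0, but the steward assesses the action as critical.
print(effective_eval_tier(declared_tier=0, assessed_risk="critical_risk", acl_tier=4))  # 2
```

The key design point is that the agent's declaration can only raise the tier, never lower it below what the steward's own assessment requires.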


Real-World Example

Customer Service Agent (Hybrid Pattern)

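class CustomerServiceAgent:
    def handle_request(self, request):
        # Classify risk by request type
        if request.type == "get_balance":
            contract = self.low_risk_contract()  # Tier 0, 100ms
        elif request.type == "update_address":
            contract = self.elevated_risk_contract()  # Tier 1, 300ms
        elif request.type == "refund":
            contract = self.critical_risk_contract()  # Tier 2, 5s
        else:
            # Assumption: unknown request types get the strictest contract,
            # so `contract` is never left unbound
            contract = self.critical_risk_contract()

        # Submit with contract
        response = self.steward.evaluate(request, contract)

        # Handle timeouts using the contract's negotiated fallback
        # (contracts are dicts, as in the earlier examples)
        if response.governance_status.timeout:
            fallback = contract["performance_budget"]["fallback_behavior"]
            if fallback == "escalate":
                return self.queue_for_human_review(request)
            elif fallback == "cached_decision":
                return self.use_cached_decision(request)

        # No timeout, or the steward already applied deny/allow_and_log
        return response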

Results:

  • 70% of actions (balance checks): <100ms (Tier 0)
  • 25% of actions (updates): <300ms (Tier 1)
  • 5% of actions (refunds): <2s (Tier 2, 99th percentile)
  • Overall cost: 60% reduction vs. universal Tier 2 evaluation
  • Overall latency (95th percentile): 200ms vs. 1000ms without contracts


FAQ

Q: Are governance contracts required?
A: No. They're an optional extension. Default behavior (uniform evaluation) still works.

Q: Can I use contracts with MCP/A2A integrations?
A: Yes, if both sides support governance contracts. Capability negotiation happens during SYNC.

Q: What if the steward doesn't support my requested tier?
A: The steward returns its available tiers in SYNC_ACK. The agent must adapt or fail gracefully.

Q: Do contracts override ACL tier rules?
A: No. Contracts work within ACL constraints. For example, an ACL-5 agent can't request Tier 0 for critical actions.

Q: How do I measure if contracts are helping?
A: Monitor governance_status.actual_latency_ms and fallback_used fields in ReflectionDB audit logs.
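
As a rough illustration of that monitoring, the sketch below summarizes a batch of audit records into a 95th-percentile latency and a fallback rate. It assumes the records have already been fetched as dictionaries carrying `actual_latency_ms` and `fallback_used`; how they are queried from ReflectionDB is not covered here, and `summarize_audit` is a hypothetical helper.

```python
import math


def summarize_audit(records):
    """Return p95 latency (nearest-rank) and fallback rate, or None if empty."""
    if not records:
        return None
    latencies = sorted(r["actual_latency_ms"] for r in records)
    rank = math.ceil(0.95 * len(latencies))  # nearest-rank percentile
    fallbacks = sum(1 for r in records if r["fallback_used"])
    return {
        "p95_latency_ms": latencies[rank - 1],
        "fallback_rate": fallbacks / len(records),
    }
```

Comparing these two numbers before and after enabling contracts, ideally broken down by risk level, gives a first indication of whether the contracts are paying off.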