Governance Contracts¶

Status: Optional Extension
Prerequisites: Understanding of Reflection Blueprints, ACL Tiers, and Tripwires
Related: ACGP-1010 Specification, ACGP-1004 Blueprints

TL;DR (30 Seconds)¶

Governance Contracts let agents and stewards negotiate performance vs. quality trade-offs on a per-action basis. Agents classify actions by risk level (low_risk, elevated_risk, critical_risk), and stewards apply proportionate evaluation depth (Eval Tier 0-3) with explicit latency budgets and fallback behaviors.

Why it matters: Without contracts, all actions get the same governance depth. With contracts, routine actions get fast approval, while critical decisions get deep evaluation.

Blueprints Come First (Quick Reminder)¶

Governance contracts don’t define policy—they define runtime constraints (budget/tier/fallback) on how a policy is applied.

Start here for the policy mental model: Reflection Blueprints
Authoritative schema/semantics: ACGP-1004 Blueprint Specification

The Problem: One-Size-Fits-All Governance¶

Traditional Approach¶

Without governance contracts, every action from an ACL-3 agent gets the same governance treatment, regardless of risk:

  "What's the weather?" → 300ms governance evaluation
 "Transfer $50,000" → 300ms governance evaluation

Issues:
- Slow routine actions: Simple queries suffer unnecessary latency
- Rushed critical decisions: High-stakes actions need deeper evaluation than time allows
- Fixed budgets: Can't adapt to action-specific risk profiles

The Solution: Risk-Based Contracts¶

Governance Contracts introduce per-action risk classification and negotiated evaluation depth.

Three Risk Levels¶

Risk Level	Description	Default Budget	Eval Tier	Example Actions
`low_risk`	Routine, low-impact	100ms	0-1	Read queries, status checks, safe searches
`elevated_risk`	Sensitive operations	300ms	1-2	Write operations, API calls, data updates
`critical_risk`	High-stakes decisions	5000ms (5s)	2-3	Financial transactions, access grants, irreversible changes
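The defaults in this table can be captured in a small lookup. The sketch below is illustrative only: `RISK_DEFAULTS` and `default_contract` are hypothetical names, not part of the ACGP specification. Starting at the lowest tier of each range and defaulting the fallback to `deny` are both assumptions. The resulting dict follows the contract shape used in this page's other examples.

```python
# Hypothetical helper: maps each risk level to the table's default budget
# and eval-tier range. Names and structure are illustrative assumptions.
RISK_DEFAULTS = {
    "low_risk": {"latency_budget_ms": 100, "eval_tiers": (0, 1)},
    "elevated_risk": {"latency_budget_ms": 300, "eval_tiers": (1, 2)},
    "critical_risk": {"latency_budget_ms": 5000, "eval_tiers": (2, 3)},
}


def default_contract(risk_level, fallback_behavior="deny"):
    """Build a contract dict using the default budget for a risk level."""
    defaults = RISK_DEFAULTS[risk_level]  # KeyError on an unknown risk level
    lowest_tier, _highest_tier = defaults["eval_tiers"]
    return {
        "risk_level": risk_level,
        "eval_tier": lowest_tier,  # assumption: start at the cheapest tier in range
        "performance_budget": {
            "latency_budget_ms": defaults["latency_budget_ms"],
            "fallback_behavior": fallback_behavior,
        },
    }
```

For example, `default_contract("critical_risk", "escalate")` would yield a Tier 2 contract with a 5000ms budget that escalates on timeout.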

Four Evaluation Tiers¶

Tier	Strategy	Latency	Examples
0	In-memory rules	<100ms	Rate limits, regex patterns, hardcoded blocklists
1	DB/cache lookups	<300ms	Policy table queries, indexed tripwires, cached decisions
2	Model inference	<5s	LLM-based reasoning, semantic analysis, anomaly detection
3	Human review	On-demand	Manual approval, expert consultation, ethical review

How It Works¶

1. Agent Declares Risk¶

# Agent classifies action before submission
governance_contract = {
    "risk_level": "low_risk",     # Routine read operation
    "eval_tier": 0,                # In-memory rules only
    "performance_budget": {
        "latency_budget_ms": 100,  # Must respond in 100ms
        "fallback_behavior": "deny" # Conservative fallback
    }
}

response = steward.evaluate(action, governance_contract)

2. Steward Applies Proportionate Evaluation¶

flowchart TD
    A["Action: Get customer balance<br/>Risk: low_risk, Tier: 0"]
    B["Steward: Eval-0 Only (in-memory)<br/>• Rate limit check (15ms)<br/>• Regex blocklist (5ms)<br/>Total: 20ms (budget: 100ms)"]
    C["✓ ALLOW<br/>(75ms total E2E)"]

    A --> B --> C

    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style C fill:#e8f5e9,stroke:#388e3c,stroke-width:2px

3. Fallback on Timeout¶

If evaluation exceeds the budget, the steward applies the negotiated fallback behavior:

Fallback	Behavior	Use Case
`deny`	Conservative block	Safety-critical systems where risk of bad action > risk of blocking good action
`allow_and_log`	Permissive + async audit	High-volume workflows where availability > perfect governance
`cached_decision`	Reuse similar past decision	Repetitive actions with stable risk profiles
`escalate`	Require human override	Ambiguous high-stakes decisions needing expert judgment
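One way a steward might enforce the budget is to run the evaluation in a worker thread and wait only as long as the contract allows. This is a minimal sketch under stated assumptions, not the reference implementation: `evaluate_with_budget`, `_POOL`, and the outcome strings (`"DENY"`, `"ALLOW"`, `"CACHED"`, `"ESCALATE"`) are hypothetical, and the contract shape follows the example in step 1.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Shared worker pool, so the caller waits only up to the latency budget.
_POOL = ThreadPoolExecutor(max_workers=4)

# Assumed mapping from fallback name to an immediate outcome label.
_FALLBACK_OUTCOMES = {
    "deny": "DENY",
    "allow_and_log": "ALLOW",     # caller should audit asynchronously
    "cached_decision": "CACHED",  # caller looks up a similar past decision
    "escalate": "ESCALATE",       # caller queues the action for human override
}


def evaluate_with_budget(evaluate, action, contract):
    """Run evaluate(action) within the contract's latency budget.

    Returns (decision, fallback_used). On timeout, the decision comes from
    the contract's fallback_behavior rather than from the evaluator.
    """
    budget = contract["performance_budget"]
    future = _POOL.submit(evaluate, action)
    try:
        decision = future.result(timeout=budget["latency_budget_ms"] / 1000)
        return decision, False
    except FutureTimeout:
        future.cancel()  # best effort; an already-running call is not interrupted
        fallback = budget.get("fallback_behavior", "deny")
        return _FALLBACK_OUTCOMES[fallback], True
```

Note that a timed-out evaluation keeps occupying a worker until it finishes, so a production steward would also need a way to bound or abort slow evaluators.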

Architecture Patterns¶

Stewards can adopt different architectures based on cost/latency/quality needs:

Pattern 1: Rule-Only (Tier 0-1)¶

Target: High-volume, low-cost, predictable latency
Cost: ~$500/month
Latency: 100-300ms

Agent → [Tier 0: Rules] → [Tier 1: Cache] → DECISION
         <50ms            <250ms

Trade-offs:
- Pro: Fast, cheap, deterministic
- Con: No LLM reasoning, can't handle novel actions

Pattern 2: Hybrid (Tier 0-1 sync, Tier 2 async)¶

Target: Balanced performance and quality
Cost: ~$2,000/month
Latency: 100-300ms sync, 5s async audit

Agent → [Tier 0/1] → DECISION (fast)
         ↓
    [Tier 2 async] → Post-action review

Trade-offs:
- Pro: Fast sync path, deep async reasoning
- Warning: Risk window during async evaluation, since the action may already have run when the Tier 2 review completes
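The hybrid flow can be sketched as a fast synchronous decision followed by a hand-off to a background reviewer. Everything named here (`review_queue`, `decide_hybrid`, the `"ALLOW"`/`"DENY"` strings, and the shape of the checks) is a hypothetical illustration; a real steward would likely use a durable queue rather than an in-process one.

```python
import queue

# Hypothetical in-process queue feeding an asynchronous Tier 2 reviewer.
review_queue = queue.Queue()


def decide_hybrid(action, sync_checks):
    """Return a fast decision from Tier 0/1 checks, then enqueue for Tier 2.

    sync_checks is a list of callables that return True when the action passes.
    """
    allowed = all(check(action) for check in sync_checks)
    decision = "ALLOW" if allowed else "DENY"
    if allowed:
        # The action proceeds immediately; Tier 2 reviews it after the fact.
        # The time between this point and the review is the risk window.
        review_queue.put(action)
    return decision
```

Only allowed actions are queued in this sketch, because a denied action never runs and so has nothing to review after the fact.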

Pattern 3: Max Quality (All Tiers Sync)¶

Target: Safety-critical, low-volume
Cost: ~$20,000/month
Latency: Up to 5s + human review

Agent → [Tier 0] → [Tier 1] → [Tier 2] → [Tier 3] → DECISION
         <50ms      <250ms      <4.5s      on-demand

Trade-offs:
- Pro: Maximum safety and oversight
- Con: High cost and latency

Capability Negotiation¶

Agents and stewards negotiate capabilities during the SYNC handshake:

// Agent announces capabilities
{
  "type": "SYNC",
  "agent_id": "customer-bot",
  "capabilities": {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": true,
    "max_eval_tier": 2
  }
}

// Steward responds
{
  "type": "SYNC_ACK",
  "capabilities": {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": true,
    "available_eval_tiers": [0, 1, 2, 3]
  }
}

// Agreed: Use Tier 0-2 (agent's max)
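The agreed tier set in this exchange looks like the steward's available tiers capped at the agent's maximum. The sketch below encodes that reading; the function name is hypothetical, and the authoritative negotiation rules are those in ACGP-1010.

```python
def negotiate_eval_tiers(agent_max_eval_tier, steward_available_tiers):
    """Return the sorted tiers both sides can use; empty if none overlap."""
    return sorted(
        tier
        for tier in set(steward_available_tiers)
        if tier <= agent_max_eval_tier
    )
```

With the values from the handshake above, `negotiate_eval_tiers(2, [0, 1, 2, 3])` gives `[0, 1, 2]`. An empty result would signal that the two sides share no usable tier.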

Benefits¶

For Agents¶

Faster routine actions: 100ms for simple reads vs. 300ms universal
Deeper critical evaluation: 5s for financial transactions vs. 300ms universal
Explicit performance contracts: Know exactly what latency to expect
Graceful degradation: Fallback behaviors prevent silent failures

For Stewards¶

Cost optimization: Expensive Tier 2/3 only for high-risk actions
Flexible architecture: Choose pattern (Rule-Only, Hybrid, Max Quality) per deployment
Better resource allocation: Model inference budget saved for critical decisions
SLA compliance: Explicit budgets enable enforceable latency SLAs

For Organizations¶

Risk-proportionate governance: Safety where it matters, speed where it doesn't
Transparent trade-offs: Performance vs. quality decisions are explicit
Audit trail: Every contract and fallback decision is logged
Backward compatibility: Agents without contracts continue to work (no contracts = default behavior)

When to Use Governance Contracts¶

Use Governance Contracts When:¶

Deploying agents with mixed action types (reads, writes, transactions)
Operating in high-volume environments where latency matters (>100 actions/sec)
Meeting strict SLAs (e.g., 95th percentile <200ms)
Optimizing costs by reserving expensive evaluation for critical actions
Requiring explicit fallback behaviors for regulatory compliance

Consider Alternatives When:¶

All actions are similar risk: One eval tier may suffice
Low volume (<10 actions/min): Contract negotiation overhead not worth it
Uniform latency acceptable: Don't need per-action tuning
Early development: Complexity of contracts may slow iteration

Getting Started with Governance Contracts¶

Phase 1: Add Capability Flag

# Announce support during SYNC
capabilities = {
    "protocol_version": "1.1.0",
    "supports_governance_contracts": True
}

Phase 2: Classify High-Volume Actions

# Start with low-risk reads
if action.type == "read":
    contract = {
        "risk_level": "low_risk",
        "eval_tier": 0,
        "performance_budget": {"latency_budget_ms": 100}
    }

Phase 3: Add Critical Action Contracts

# Financial transactions get deep eval
if action.type == "transfer" and action.amount > 10000:
    contract = {
        "risk_level": "critical_risk",
        "eval_tier": 3,
        "performance_budget": {
            "latency_budget_ms": 10000,
            "fallback_behavior": "escalate"
        }
    }

Phase 4: Optimize Fallbacks

# Tune fallbacks based on production data
contract["performance_budget"]["fallback_behavior"] = (
    "cached_decision" if action.is_repetitive else "deny"
)

Security Considerations¶

Agent Risk Misclassification¶

Threat: Agent classifies high-risk action as low_risk to bypass governance

Mitigation:
- Stewards MUST validate risk classification against action semantics
- Tripwires (Tier 0) ALWAYS run regardless of contract
- Stewards MAY override agent's declared eval_tier if risk assessment disagrees
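A steward-side check for this threat could treat the agent's declared risk as a lower bound and raise it when the action type implies more. This is a sketch under assumptions: `MIN_RISK_BY_ACTION` and its entries are hypothetical, and real validation against action semantics would be richer than a lookup by type.

```python
RISK_ORDER = ["low_risk", "elevated_risk", "critical_risk"]

# Hypothetical steward-side floor per action type (illustrative values).
MIN_RISK_BY_ACTION = {
    "read": "low_risk",
    "write": "elevated_risk",
    "transfer": "critical_risk",
}


def effective_risk(declared_risk, action_type):
    """Return the stricter of the agent's declared risk and the steward's floor."""
    # Unknown action types default to the strictest level (conservative assumption).
    floor = MIN_RISK_BY_ACTION.get(action_type, "critical_risk")
    return max(declared_risk, floor, key=RISK_ORDER.index)
```

An agent that declares `low_risk` for a `transfer` would therefore still be evaluated as `critical_risk`.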

Fallback Abuse¶

Threat: Agent sets ultra-low budgets to force permissive fallback

Mitigation:
- Stewards MUST enforce minimum budgets per tier (e.g., Tier 0 ≥50ms)
- Suspicious patterns (repeated timeouts) trigger FLAG interventions
- Trust debt accumulates for fallback overuse
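Minimum budgets could be enforced by raising any undersized budget to a per-tier floor before evaluation starts. In this sketch only the Tier 0 floor of 50ms comes from the text; the other floors are placeholders, and raising the budget (rather than rejecting the contract) is an assumed policy choice.

```python
# Tier 0 >= 50 ms comes from the mitigation list; the other floors are
# placeholder assumptions that a deployment would tune.
MIN_BUDGET_MS = {0: 50, 1: 150, 2: 1000, 3: 1000}


def enforce_min_budget(contract):
    """Return a copy of the contract with the budget raised to the tier floor."""
    budget = dict(contract["performance_budget"])
    floor = MIN_BUDGET_MS[contract["eval_tier"]]
    budget["latency_budget_ms"] = max(budget["latency_budget_ms"], floor)
    return {**contract, "performance_budget": budget}
```

Returning a copy keeps the agent's original request intact, which may be useful when logging the difference between requested and enforced budgets.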

Capability Spoofing¶

Threat: Malicious agent claims max_eval_tier: 0 to avoid deep evaluation

Mitigation:
- ACL tier determines minimum eval tier requirements
- ACL-4+ MUST use Tier 2+ for critical actions regardless of contract
- Stewards MAY reject SYNC with insufficient capabilities
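The ACL-derived floor can be expressed as a small function that the steward applies on top of whatever tier the agent requests. Only the ACL-4+ rule for critical actions is taken from the text; the function names and the zero floor for all other cases are assumptions made for illustration.

```python
def required_min_eval_tier(acl_tier, risk_level):
    """Minimum eval tier the steward insists on for this agent and action."""
    if acl_tier >= 4 and risk_level == "critical_risk":
        return 2  # from the rule above: ACL-4+ MUST use Tier 2+ for critical actions
    return 0  # assumption: no additional floor in other cases


def resolve_eval_tier(requested_tier, acl_tier, risk_level):
    """Raise the agent's requested tier to the ACL-derived floor if needed."""
    return max(requested_tier, required_min_eval_tier(acl_tier, risk_level))
```

Under these assumptions, an ACL-5 agent requesting Tier 0 for a critical action would be evaluated at Tier 2.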

Real-World Example¶

Customer Service Agent (Hybrid Pattern)¶

class CustomerServiceAgent:
    def handle_request(self, request):
        # Classify risk by request type
        if request.type == "get_balance":
            contract = self.low_risk_contract()  # Tier 0, 100ms
        elif request.type == "update_address":
            contract = self.elevated_risk_contract()  # Tier 1, 300ms
        elif request.type == "refund":
            contract = self.critical_risk_contract()  # Tier 2, 5s
        else:
            # Assumption: unknown request types get the strictest contract
            contract = self.critical_risk_contract()

        # Submit with contract
        response = self.steward.evaluate(request, contract)

        # Handle timeouts according to the negotiated fallback
        if response.governance_status.timeout:
            if contract.fallback_behavior == "escalate":
                return self.queue_for_human_review(request)
            elif contract.fallback_behavior == "cached_decision":
                return self.use_cached_decision(request)

        # No timeout, or a fallback the steward resolved itself (deny / allow_and_log)
        return response

Results:
- 70% of actions (balance checks): <100ms (Tier 0)
- 25% of actions (updates): <300ms (Tier 1)
- 5% of actions (refunds): <2s (Tier 2, 99th percentile)
- Overall cost: 60% reduction vs. universal Tier 2 evaluation
- Overall latency (95th percentile): 200ms vs. 1000ms without contracts

Next Steps¶

Read the spec: ACGP-1010: Governance Contracts
Interactive tool: Latency Calculator
Architecture reference: ACGP-1002, Section 8.6: Architecture Patterns

FAQ¶

Q: Are governance contracts required?
A: No. They're an optional extension. Default behavior (uniform evaluation) still works.

Q: Can I use contracts with MCP/A2A integrations?
A: Yes, if both sides support governance contracts. Capability negotiation happens during SYNC.

Q: What if the steward doesn't support my requested tier?
A: The steward returns its available tiers in SYNC_ACK. The agent must adapt or fail gracefully.

Q: Do contracts override ACL tier rules?
A: No. Contracts work within ACL constraints. An ACL-5 agent can't request Tier 0 for critical actions.

Q: How do I measure if contracts are helping?
A: Monitor the governance_status.actual_latency_ms and fallback_used fields in the ReflectionDB audit logs.
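As a rough illustration of that monitoring, the sketch below summarizes a batch of audit records into a 95th percentile latency and a fallback rate. The record layout is an assumption (in particular, that `fallback_used` sits under `governance_status`), and `summarize_audit` is a hypothetical helper, not a ReflectionDB API.

```python
import statistics


def summarize_audit(records):
    """Compute p95 latency and fallback rate from audit records.

    Each record is assumed to look like:
    {"governance_status": {"actual_latency_ms": 42, "fallback_used": False}}
    At least two records are needed for the percentile calculation.
    """
    latencies = [r["governance_status"]["actual_latency_ms"] for r in records]
    fallbacks = sum(1 for r in records if r["governance_status"]["fallback_used"])
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(latencies, n=100)[94]
    return {"p95_latency_ms": p95, "fallback_rate": fallbacks / len(records)}
```

Comparing these two numbers before and after enabling contracts, ideally broken down by risk level, gives a first indication of whether the contracts are paying off.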