ACGP-1012: Blueprint Authoring Guide

Status: Draft
Last Updated: 2026-01-08
Spec ID: ACGP-1012
Normative Keywords: MUST, SHOULD, MAY (per RFC 2119)

Abstract

This document provides practical guidance for authoring ACGP Reflection Blueprints in regulated and multi-agent environments. It maps common governance requirements to ACGP constructs, demonstrates recommended patterns for single-agent and multi-agent systems, and clarifies the separation between policy (blueprints), runtime negotiation (governance contracts), and infrastructure (stewards, registry, ReflectionDB).

Table of Contents

  1. Introduction
  2. Mapping Common Requirements to ACGP
  3. Single-Agent Patterns (Regulated Baseline)
  4. Multi-Agent Patterns
  5. Regulatory Framework Integration
  6. Governance Contract Alignment
  7. Complete Examples
  8. Best Practices
  9. References

1. Introduction

1.1 Audience

This guide is intended for:

  • Governance architects designing policy frameworks
  • Compliance officers translating regulations into enforceable rules
  • System architects deploying ACGP in production
  • Multi-agent system designers coordinating autonomous agents

1.2 Key Principles

Separation of Concerns:

  • Blueprints (ACGP-1004): Define WHAT to enforce (checks, tripwires, thresholds)
  • Governance Contracts (ACGP-1010): Define HOW to apply per request (latency, eval tiers, fallbacks)
  • Registry (ACGP-1006): Define WHERE knowledge comes from (certified sources)
  • ReflectionDB (ACGP-1002): Records EVERYTHING for audit

Mental Model:

Blueprint = Policy Document (rules, metrics, thresholds)
Governance Contract = Runtime SLA (risk level, latency budget, fallback)
Registry = Trusted Source Catalog
ReflectionDB = Immutable Audit Trail

1.3 Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119.


2. Mapping Common Requirements to ACGP

2.1 Blueprint Component Mapping Table

This table maps common governance concepts to their ACGP implementation:

| Governance Concept | ACGP Construct | Specification | Notes |
|---|---|---|---|
| Metadata & Identity | id, version, description, scope | ACGP-1004 | Blueprint identification and versioning |
| Regulatory Framework | annotations.regulatory_refs (optional) | ACGP-1012 | Non-normative documentation; enforcement via checks/tripwires |
| Business Policies | checks (rules + metrics) | ACGP-1004 | Declarative policy rules and quality metrics |
| Hard Safety Limits | tripwires | ACGP-1004 | Pre-CTQ checks that can halt; Eval Tier 0/1 |
| Decision Criteria | scoring.thresholds | ACGP-1004 | Risk Score boundaries (0.0-1.0) |
| Quantitative Thresholds | Tripwire condition expressions | ACGP-1004 .5 | DSL for numeric/pattern checks |
| Human Oversight Triggers | Rules/tripwires yielding escalate | ACGP-1004 .2, ACGP-1010 .2.4 | Combined with Eval Tier 3 in governance contract |
| Quality Metrics | CTQ metrics in checks | ACGP-1004 .3 | Weighted scoring contributing to final CTQ |
| Evidence Requirements | evidence block | ACGP-1004 | Min sources, categories, trust scores |
| Tool & Source Registry | Certified Source Registry | ACGP-1006 | Authoritative source catalog |
| Audit Requirements | ReflectionDB retention | ACGP-1002 .2.5, ACGP-1003 | Immutable trace logging |
| SLA Targets | Governance contracts | ACGP-1010 | Latency budgets, fallback strategies |
| Response Time Limits | latency_budget_ms | ACGP-1010 | Per-request or per-tripwire |
| Cost Constraints | Profile selection | ACGP-1011 | Choose eval tiers based on budget |
| Cross-Agent Policies | ReflectionDB analytics + tripwires | ACGP-1012 | Query recent traces for coordination |

2.2 Common Scenarios

Scenario: "Never allow X without human approval"

Implementation:

tripwires:
  - id: require_human_approval_for_x
    when:
      hook: "tool_call"
      tool: "dangerous_operation"
    condition: "false"  # Pass condition never holds, so on_fail always fires
    eval_tier: 0
    on_fail:
      decision: "escalate"
      reason: "Human approval required for dangerous_operation"

Combined with governance contract:

{
  "risk_level": "critical_risk",
  "eval_tier": 3,
  "latency_budget_ms": 5000
}

Scenario: "Validate claims against certified sources"

Implementation:

evidence:
  min_certified_sources: 2
  source_categories: ["regulatory", "peer_reviewed"]
  min_trust_score: 0.8

checks:
  - id: knowledge_grounding
    when:
      hook: "output"
    metric:
      name: "source_validation"
      weight: 0.3
      check:
        type: "source-match"
        args:
          required_sources: ["regulatory"]
          min_citation_ratio: 0.9
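
As a rough illustration of how a steward might apply the evidence block above, the following sketch checks a trace's resolved sources against the three fields. The `Source` record and the `meets_evidence_requirements` helper are hypothetical names used only for illustration; they are not defined by ACGP-1004 or ACGP-1006. The rule that a source counts only when it is certified, in a listed category, and at or above the trust floor is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical registry lookup result for one source_ref."""
    ref: str
    category: str
    trust_score: float
    certified: bool


def meets_evidence_requirements(sources, evidence):
    """Return True if enough qualifying certified sources are present.

    Assumption: a source qualifies when it is certified, belongs to one of
    the listed categories, and meets the minimum trust score.
    """
    qualifying = [
        s for s in sources
        if s.certified
        and s.category in evidence["source_categories"]
        and s.trust_score >= evidence["min_trust_score"]
    ]
    return len(qualifying) >= evidence["min_certified_sources"]


# Values copied from the evidence block above.
evidence = {
    "min_certified_sources": 2,
    "source_categories": ["regulatory", "peer_reviewed"],
    "min_trust_score": 0.8,
}
# Illustrative sources; the refs are placeholders.
sources = [
    Source("reg:example:1", "regulatory", 0.95, True),
    Source("journal:example:2", "peer_reviewed", 0.85, True),
    Source("blog:example:3", "news", 0.40, False),
]
print(meets_evidence_requirements(sources, evidence))
```

With the two qualifying sources the requirement is met; dropping either one would leave the trace below min_certified_sources.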

Scenario: "Block if transaction exceeds $10K"

Implementation:

tripwires:
  - id: transaction_limit
    when:
      hook: "tool_call"
      tool: "execute_transaction"
    condition: "args.amount <= 10000"
    eval_tier: 0
    latency_budget_ms: 10
    on_fail:
      decision: "halt"
      reason: "Transaction exceeds $10,000 limit"
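
In the tripwire examples in this section, condition states what must hold for the action to pass, and on_fail applies when it does not. The sketch below shows that reading for the transaction limit. The `Tripwire` class and `evaluate_tripwire` function are hypothetical, and a Python callable stands in for the condition DSL, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Tripwire:
    """Hypothetical in-memory form of a tripwire entry."""
    id: str
    tool: str
    condition: Callable[[dict], bool]  # pass condition; stands in for the DSL
    decision: str
    reason: str


def evaluate_tripwire(tripwire: Tripwire, tool: str, args: dict) -> Optional[dict]:
    """Return the on_fail intervention, or None when the tripwire passes."""
    if tool != tripwire.tool:
        return None  # tripwire does not apply to this tool call
    if tripwire.condition(args):
        return None  # pass condition holds
    return {
        "tripwire": tripwire.id,
        "decision": tripwire.decision,
        "reason": tripwire.reason,
    }


transaction_limit = Tripwire(
    id="transaction_limit",
    tool="execute_transaction",
    condition=lambda args: args["amount"] <= 10000,
    decision="halt",
    reason="Transaction exceeds $10,000 limit",
)

# A transaction under the limit passes; one over the limit yields the halt.
print(evaluate_tripwire(transaction_limit, "execute_transaction", {"amount": 9500}))
print(evaluate_tripwire(transaction_limit, "execute_transaction", {"amount": 25000}))
```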

Scenario: "Aggregate daily limits"

Implementation:

tripwires:
  - id: daily_transaction_count
    when:
      hook: "tool_call"
      tool: "execute_transaction"
    condition: "storage.get('tx_count_today') < 50"
    eval_tier: 1
    requires_state: true
    latency_budget_ms: 100
    on_fail:
      decision: "block"
      reason: "Daily transaction limit (50) exceeded"
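
A stateful tripwire like this one needs a counter that resets each day. The following is a minimal sketch of one way to back it, assuming the steward records each permitted transaction itself. `DailyCounterStore` and `check_and_record` are hypothetical names, and keying the counter by calendar date is an assumption about how `tx_count_today` would be maintained.

```python
from datetime import date


class DailyCounterStore:
    """Hypothetical stand-in for the steward's state store."""

    def __init__(self):
        self._counts = {}

    def get(self, key: str, day: date) -> int:
        return self._counts.get((key, day), 0)

    def increment(self, key: str, day: date) -> int:
        self._counts[(key, day)] = self.get(key, day) + 1
        return self._counts[(key, day)]


DAILY_LIMIT = 50  # same limit as the tripwire condition above


def check_and_record(store: DailyCounterStore, day: date) -> str:
    """Apply the pass condition count < DAILY_LIMIT, then record the call."""
    if store.get("tx_count", day) < DAILY_LIMIT:
        store.increment("tx_count", day)
        return "ok"
    return "block"


store = DailyCounterStore()
today = date(2026, 1, 8)
decisions = [check_and_record(store, today) for _ in range(51)]
print(decisions.count("ok"), decisions[-1])
```

Because the key includes the date, the first transaction on the next day starts from a count of zero without an explicit reset job.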

3. Single-Agent Patterns (Regulated Baseline)

3.1 Architecture Overview

Regulated environments (finance, healthcare, legal) require:

  • Evidence-based reasoning (certified sources)
  • Pre-action safety checks (tripwires)
  • Quality scoring (CTQ metrics)
  • Human oversight for critical decisions
  • Complete audit trails

Data Flow:

sequenceDiagram
    participant PA as OperatingAgent
    participant GS as GovernanceSteward
    participant PE as PolicyEngine
    participant CSR as CertifiedSourceRegistry
    participant RDB as ReflectionDB
    participant HITL as HumanReviewer

    PA->>GS: TRACE(with source_refs, tool_calls)
    GS->>PE: Evaluate(blueprint + contract)

    Note over PE: Phase 1: Tripwires (Eval 0/1)
    PE->>PE: Check hard limits
    alt Tripwire Triggered
        PE-->>GS: INTERVENTION(halt)
        GS->>RDB: Log(TRACE, EVAL, INTERVENTION)
        GS-->>PA: HALT
    else Tripwires Pass
        Note over PE: Phase 2: Evidence Validation
        PE->>CSR: Validate(source_refs)
        CSR-->>PE: TrustScores

        Note over PE: Phase 3: CTQ Calculation
        PE->>PE: Calculate weighted CTQ
        PE->>PE: Apply thresholds
        PE-->>GS: EVAL + INTERVENTION

        alt Intervention = ESCALATE
            GS->>HITL: Request Review
            HITL-->>GS: Human Decision
            GS->>RDB: Log(complete flow)
            GS-->>PA: Forward Decision
        else Intervention = OK/NUDGE/FLAG/BLOCK
            GS->>RDB: Log(TRACE, EVAL, INTERVENTION)
            GS-->>PA: INTERVENTION
        end
    end

3.2 Blueprint Example

id: finance/trading_regulated@1.0
version: "1.0.0"
description: "Regulated trading agent blueprint with evidence + tripwires"
inherits: clarity.baseline@1.0

# Optional: Document regulatory context (non-normative)
annotations:
  regulatory_refs:
    - "MiFID II Directive 2014/65/EU"
    - "MAR Regulation 596/2014"
    - "ESMA Algo-Trading Guidelines"
  policy_owner: "compliance@example.com"
  retention_period: "7y"
  jurisdiction: ["EU", "US"]

scope:
  agent_tier: [ACL-3, ACL-4]
  tools: ["execute_trade", "market_data_api"]

# Hard safety limits (Eval Tier 0/1, run before CTQ)
tripwires:
  # Tier 0: In-memory check
  - id: single_trade_limit
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "args.trade_value <= 100000"
    eval_tier: 0
    latency_budget_ms: 10
    severity: critical
    on_fail:
      decision: "halt"
      reason: "Trade exceeds $100K hard limit"

  # Tier 1: Stateful check
  - id: daily_trade_volume
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "storage.get('daily_volume') + args.trade_value <= 5000000"
    eval_tier: 1
    requires_state: true
    latency_budget_ms: 100
    severity: critical
    on_fail:
      decision: "block"
      reason: "Daily trade volume limit exceeded"

# Evidence requirements
evidence:
  min_certified_sources: 2
  source_categories: ["regulatory", "market_data"]
  min_trust_score: 0.85

# Quality metrics (contribute to CTQ)
checks:
  - id: trade_rationale_quality
    when:
      hook: "tool_call"
      tool: "execute_trade"
    metric:
      name: "risk_assessment_clarity"
      weight: 0.4
      check:
        type: "llm-judge"
        args:
          model: "gpt-4"
          prompt_template: "evaluate_trading_rationale"
          threshold_warn: 0.6

  - id: market_data_recency
    when:
      hook: "tool_call"
      tool: "execute_trade"
    metric:
      name: "data_freshness"
      weight: 0.3
      check:
        type: "source-match"
        args:
          required_sources: ["market_data"]
          max_age_minutes: 5

  - id: compliance_alignment
    when:
      hook: "tool_call"
      tool: "execute_trade"
    metric:
      name: "regulatory_compliance"
      weight: 0.3
      check:
        type: "rule-based"
        args:
          rules: ["no_insider_trading", "position_limits"]

# Conservative thresholds for regulated domain
scoring:
  thresholds:
    ok: 0.15        # Risk <= 15% -> OK
    nudge: 0.25     # Risk <= 25% -> NUDGE
    escalate: 0.40  # Risk <= 40% -> ESCALATE
    block: 0.60     # Risk > 60% -> BLOCK

# Trust debt configuration
trust_debt:
  enabled: true
  accumulation:
    flag: 0.05
    nudge: 0.02
    block: 0.20
    halt: 0.50
  decay:
    rate: 0.95
    period_hours: 24
  thresholds:
    elevated_monitoring: 0.30
    restricted_mode: 0.50
    re_tiering_review: 0.75
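
The trust_debt block can be read as a running balance: interventions add to it and time reduces it. The sketch below is one possible interpretation, not the normative algorithm. It assumes that each intervention adds its configured amount, that the balance is multiplied by rate once per full elapsed period, and that a threshold is reached when the balance is greater than or equal to it. The function names are hypothetical.

```python
# Values copied from the trust_debt block above.
TRUST_DEBT = {
    "accumulation": {"flag": 0.05, "nudge": 0.02, "block": 0.20, "halt": 0.50},
    "decay": {"rate": 0.95, "period_hours": 24},
    "thresholds": {
        "elevated_monitoring": 0.30,
        "restricted_mode": 0.50,
        "re_tiering_review": 0.75,
    },
}


def accumulate(debt: float, decision: str, config: dict = TRUST_DEBT) -> float:
    """Add the configured debt for a decision; unlisted decisions add nothing."""
    return debt + config["accumulation"].get(decision, 0.0)


def decay(debt: float, elapsed_hours: float, config: dict = TRUST_DEBT) -> float:
    """Assumption: multiply by `rate` once per full elapsed period."""
    periods = int(elapsed_hours // config["decay"]["period_hours"])
    return debt * config["decay"]["rate"] ** periods


def status(debt: float, config: dict = TRUST_DEBT) -> str:
    """Return the highest threshold reached, or 'normal' if none is."""
    thresholds = config["thresholds"]
    reached = [name for name, limit in thresholds.items() if debt >= limit]
    return max(reached, key=thresholds.get) if reached else "normal"


debt = 0.0
for decision in ["block", "flag", "nudge", "flag"]:
    debt = accumulate(debt, decision)
print(round(debt, 2), status(debt))

after_two_days = decay(debt, 48)
print(round(after_two_days, 4), status(after_two_days))
```

Under these assumptions, one block, two flags, and one nudge push the agent into elevated monitoring, and two quiet days bring it back under that threshold.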

3.3 Governance Contract Example

For critical trades, the agent requests higher evaluation depth:

{
  "governance_contract": {
    "risk_level": "critical_risk",
    "eval_tier": 2,
    "latency_budget_ms": 5000,
    "fallback": "escalate"
  }
}

This tells the steward:

  • Use deep LLM analysis (Eval Tier 2)
  • Budget 5 seconds for evaluation
  • On timeout, escalate to a human reviewer rather than deny
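
The timeout behaviour in the last bullet can be sketched as follows. This is an illustrative sketch rather than steward code from the specification: `evaluate_with_contract` and the two evaluators are hypothetical, and a worker thread is only one way to enforce a latency budget. The demo uses a 50 ms budget so it finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def evaluate_with_contract(evaluator, trace: dict, contract: dict) -> dict:
    """Run `evaluator` within the latency budget; apply the fallback on timeout."""
    budget_s = contract["latency_budget_ms"] / 1000
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(evaluator, trace)
    try:
        return {"decision": future.result(timeout=budget_s), "timed_out": False}
    except FutureTimeout:
        # Budget exhausted: use the contract's fallback instead of the result.
        return {"decision": contract["fallback"], "timed_out": True}
    finally:
        # Do not wait for a slow evaluator; the worker finishes in the background.
        pool.shutdown(wait=False)


def fast_evaluator(trace: dict) -> str:
    return "ok"


def slow_evaluator(trace: dict) -> str:
    time.sleep(0.2)  # simulates a deep evaluation that overruns the budget
    return "ok"


contract = {
    "risk_level": "critical_risk",
    "eval_tier": 2,
    "latency_budget_ms": 50,  # shortened for the demo
    "fallback": "escalate",
}
print(evaluate_with_contract(fast_evaluator, {}, contract))
print(evaluate_with_contract(slow_evaluator, {}, contract))
```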

4. Multi-Agent Patterns

4.1 Design Philosophy

Multi-agent governance does NOT turn blueprints into workflows. Instead:

  1. Agents remain autonomous: Each has its own governance steward
  2. Coordination via shared state: ReflectionDB provides cross-agent visibility
  3. Cross-agent policies: Enforced by steward-side analytics, not blueprint orchestration

4.2 Pattern A: Shared Blueprint, Separate Agents

Use Case: Multiple agents performing similar tasks in the same domain (e.g., trading desk with signal analysts, execution agents, risk monitors).

Architecture:

graph TB
    subgraph Agents
        A1[SignalAgent<br/>ACL-2]
        A2[ExecutionAgent<br/>ACL-4]
        A3[RiskAgent<br/>ACL-3]
    end

    subgraph Governance
        BP[Shared Blueprint:<br/>trading_regulated@1.0]
        GS1[Steward-1]
        GS2[Steward-2]
        GS3[Steward-3]
    end

    subgraph Infrastructure
        RDB[(ReflectionDB)]
        CSR[(Registry)]
    end

    A1 --> GS1
    A2 --> GS2
    A3 --> GS3

    GS1 -.inherits.-> BP
    GS2 -.inherits.-> BP
    GS3 -.inherits.-> BP

    GS1 --> RDB
    GS2 --> RDB
    GS3 --> RDB

    GS1 --> CSR
    GS2 --> CSR
    GS3 --> CSR

    RDB -.queries.-> GS1
    RDB -.queries.-> GS2
    RDB -.queries.-> GS3

Blueprint Structure:

id: finance/trading_team_shared@1.0
version: "1.0.0"
description: "Shared governance for coordinated trading agents"
inherits: clarity.baseline@1.0

annotations:
  multi_agent: true
  coordination_model: "shared_blueprint"
  participant_roles: ["signal_analysis", "trade_execution", "risk_monitoring"]

scope:
  agent_tier: [ACL-2, ACL-3, ACL-4]
  tools: ["execute_trade", "analyze_signal", "calculate_risk"]

# Shared tripwires apply to all agents
tripwires:
  - id: system_wide_exposure_limit
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "query_reflectiondb('sum(trade_value) from last 1h') < 10000000"
    eval_tier: 1
    requires_state: true
    on_fail:
      decision: "halt"
      reason: "System-wide hourly exposure limit exceeded"

  - id: conflicting_signals
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "NOT query_reflectiondb('exists opposite_signal from SignalAgent in last 5m')"
    eval_tier: 1
    requires_state: true
    on_fail:
      decision: "block"
      reason: "Conflicting signal detected from signal analyst"

# Role-specific checks via scope filtering
checks:
  - id: signal_quality
    when:
      hook: "output"
      agent_role: "signal_analysis"
    metric:
      name: "signal_confidence"
      weight: 0.5
      check:
        type: "llm-judge"

  - id: execution_efficiency
    when:
      hook: "tool_call"
      tool: "execute_trade"
      agent_role: "trade_execution"
    metric:
      name: "slippage_prediction"
      weight: 0.4
      check:
        type: "pattern-match"

Cross-Agent Coordination Logic:

The steward implements cross-agent queries against ReflectionDB:

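# Pseudo-code for steward-side cross-agent tripwire evaluation.
# `reflectiondb` is a placeholder client; its query API is not defined here.
EXPOSURE_LIMIT = 10_000_000  # matches system_wide_exposure_limit

def evaluate_system_wide_exposure(trace, reflectiondb):
    """
    Cross-agent tripwire: check total exposure across all agents.

    Returns True when the tripwire passes (exposure is under the limit).
    """
    # Sum recent trades from every agent governed by the shared blueprint
    result = reflectiondb.query("""
        SELECT COALESCE(SUM(trade_value), 0) AS total_exposure
        FROM traces
        WHERE tool = 'execute_trade'
          AND timestamp > NOW() - INTERVAL '1 hour'
          AND session_id IN (
            SELECT DISTINCT session_id
            FROM agent_metadata
            WHERE blueprint_id = 'finance/trading_team_shared@1.0'
          )
    """)

    # Treat an empty window as zero exposure rather than comparing None
    total_exposure = result.total_exposure or 0
    return total_exposure < EXPOSURE_LIMIT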
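
For readers who want to try the idea locally, here is a self-contained variant using Python's built-in sqlite3 module in place of ReflectionDB. The table layout, column names, and the `within_exposure_limit` helper are assumptions made for this sketch; they do not reflect the ReflectionDB schema.

```python
import sqlite3

EXPOSURE_LIMIT = 10_000_000


def within_exposure_limit(conn: sqlite3.Connection, blueprint_id: str, now: int) -> bool:
    """Check summed trade value over the last hour for one blueprint."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(trade_value), 0)
        FROM traces
        WHERE tool = 'execute_trade'
          AND blueprint_id = ?
          AND ts > ?
        """,
        (blueprint_id, now - 3600),
    ).fetchone()
    return row[0] < EXPOSURE_LIMIT


# Hypothetical schema: one row per trace, with a Unix timestamp in `ts`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE traces "
    "(agent_id TEXT, tool TEXT, blueprint_id TEXT, trade_value REAL, ts INTEGER)"
)
now = 1_700_000_000
bp = "finance/trading_team_shared@1.0"
conn.executemany(
    "INSERT INTO traces VALUES (?, ?, ?, ?, ?)",
    [
        ("exec-agent-1", "execute_trade", bp, 4_000_000, now - 600),
        ("exec-agent-2", "execute_trade", bp, 5_000_000, now - 1200),
        ("exec-agent-1", "execute_trade", bp, 9_000_000, now - 7200),  # outside window
    ],
)
print(within_exposure_limit(conn, bp, now))
```

The two trades inside the one-hour window come from different agents, yet they count toward the same limit because both run under the shared blueprint.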

4.3 Pattern B: System Blueprint + Role Blueprints

Use Case: Hierarchical governance where system-wide policies apply to all agents, but each role has specific requirements.

Architecture:

graph TB
    subgraph Blueprints
        SYS[System Blueprint:<br/>trading_system@1.0]
        BP1[SignalAgent Blueprint<br/>inherits system]
        BP2[ExecAgent Blueprint<br/>inherits system]
        BP3[RiskAgent Blueprint<br/>inherits system]
    end

    subgraph Agents
        A1[SignalAgent]
        A2[ExecAgent]
        A3[RiskAgent]
    end

    SYS -.inherited by.-> BP1
    SYS -.inherited by.-> BP2
    SYS -.inherited by.-> BP3

    BP1 -.applied to.-> A1
    BP2 -.applied to.-> A2
    BP3 -.applied to.-> A3

System Blueprint:

id: finance/trading_system@1.0
version: "1.0.0"
description: "System-wide governance for trading ecosystem"
inherits: clarity.baseline@1.0

annotations:
  multi_agent: true
  coordination_model: "hierarchical"

# System-wide tripwires (inherited by all roles)
tripwires:
  - id: market_circuit_breaker
    when:
      hook: "tool_call"
    condition: "NOT query_external('market_halted')"
    eval_tier: 0
    on_fail:
      decision: "halt"
      reason: "Market-wide circuit breaker active"

  - id: regulatory_freeze
    when:
      hook: "tool_call"
    condition: "NOT query_external('trading_suspended')"
    eval_tier: 0
    on_fail:
      decision: "halt"
      reason: "Regulatory trading suspension"

# Shared evidence requirements
evidence:
  min_certified_sources: 1
  source_categories: ["regulatory"]
  min_trust_score: 0.8

scoring:
  thresholds:
    ok: 0.20
    nudge: 0.35
    escalate: 0.50
    block: 0.65

Role-Specific Blueprints:

# Signal Analyst Blueprint
id: finance/trading_signal_analyst@1.0
version: "1.0.0"
description: "Signal analysis agent governance"
inherits: finance/trading_system@1.0

scope:
  agent_tier: [ACL-2]
  tools: ["analyze_signal", "market_data_api"]

checks:
  - id: signal_confidence
    when:
      hook: "output"
    metric:
      name: "signal_quality"
      weight: 0.6
      check:
        type: "llm-judge"
        args:
          prompt_template: "evaluate_signal_confidence"

  - id: data_source_diversity
    when:
      hook: "output"
    metric:
      name: "source_diversity"
      weight: 0.4
      check:
        type: "source-match"
        args:
          min_source_count: 3
          required_categories: ["market_data", "news"]

scoring:
  thresholds:
    ok: 0.15      # Stricter than system baseline
    nudge: 0.25
    escalate: 0.40
    block: 0.60

# Execution Agent Blueprint
id: finance/trading_exec_agent@1.0
version: "1.0.0"
description: "Trade execution agent governance"
inherits: finance/trading_system@1.0

scope:
  agent_tier: [ACL-4]
  tools: ["execute_trade"]

tripwires:
  # Role-specific tripwire (adds to inherited system tripwires)
  - id: execution_requires_signal
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "query_reflectiondb('exists signal from SignalAgent in last 10m for session_id')"
    eval_tier: 1
    requires_state: true
    on_fail:
      decision: "block"
      reason: "No valid signal from analyst within 10 minutes"

checks:
  - id: execution_timing
    when:
      hook: "tool_call"
      tool: "execute_trade"
    metric:
      name: "market_timing_quality"
      weight: 0.5
      check:
        type: "llm-judge"

  - id: slippage_prediction
    when:
      hook: "tool_call"
      tool: "execute_trade"
    metric:
      name: "slippage_risk"
      weight: 0.5
      check:
        type: "rule-based"
        args:
          rules: ["check_liquidity", "check_volatility"]
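
The role blueprints above add tripwires to the inherited ones and override the inherited thresholds. One way a steward could resolve inherits is sketched below. `resolve_blueprint` is a hypothetical helper; treating tripwires and checks as additive lists and merging other mappings key by key is an assumption, since the normative inheritance rules are defined in ACGP-1004.

```python
def resolve_blueprint(parent: dict, child: dict) -> dict:
    """Merge a role blueprint over the blueprint it inherits from."""
    merged = dict(parent)
    for key, value in child.items():
        if key in ("tripwires", "checks"):
            # List sections are additive: inherited entries first, then the role's own.
            merged[key] = list(parent.get(key, [])) + list(value)
        elif isinstance(value, dict) and isinstance(parent.get(key), dict):
            # Mapping sections are merged recursively; child keys win.
            merged[key] = resolve_blueprint(parent[key], value)
        else:
            merged[key] = value
    return merged


# Trimmed-down versions of the blueprints above (ids and thresholds only).
system = {
    "id": "finance/trading_system@1.0",
    "tripwires": [{"id": "market_circuit_breaker"}, {"id": "regulatory_freeze"}],
    "scoring": {"thresholds": {"ok": 0.20, "nudge": 0.35, "escalate": 0.50, "block": 0.65}},
}
exec_agent = {
    "id": "finance/trading_exec_agent@1.0",
    "tripwires": [{"id": "execution_requires_signal"}],
}
signal_analyst = {
    "id": "finance/trading_signal_analyst@1.0",
    "scoring": {"thresholds": {"ok": 0.15, "nudge": 0.25, "escalate": 0.40, "block": 0.60}},
}

resolved_exec = resolve_blueprint(system, exec_agent)
resolved_signal = resolve_blueprint(system, signal_analyst)
print([t["id"] for t in resolved_exec["tripwires"]])
print(resolved_signal["scoring"]["thresholds"]["ok"])
```

The execution agent ends up with three tripwires and the system thresholds, while the signal analyst keeps the two system tripwires and its own stricter thresholds.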

4.4 Coordination Mechanisms

4.4.1 Session/Trace Linking

Agents in a multi-agent system SHOULD use a shared session_id to correlate their activities:

// SignalAgent trace
{
  "trace_id": "trace-001",
  "session_id": "trading-session-abc",
  "agent_id": "signal-agent-1",
  "meta": {
    "agent_role": "signal_analysis"
  }
}

// ExecAgent trace (same session)
{
  "trace_id": "trace-002",
  "session_id": "trading-session-abc",
  "agent_id": "exec-agent-1",
  "meta": {
    "agent_role": "trade_execution",
    "triggered_by": "trace-001"  // Optional: explicit dependency
  }
}
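
Once traces share a session_id, a consumer can reconstruct the chain of activity. The helpers below are a small sketch of that, operating on plain dictionaries shaped like the two traces above; `group_by_session` and `triggering_trace` are hypothetical names, not part of the protocol.

```python
from collections import defaultdict


def group_by_session(traces):
    """Index traces by session_id so related agent activity can be read together."""
    sessions = defaultdict(list)
    for trace in traces:
        sessions[trace["session_id"]].append(trace)
    return dict(sessions)


def triggering_trace(trace, traces):
    """Resolve the optional meta.triggered_by link, or return None."""
    wanted = trace.get("meta", {}).get("triggered_by")
    if wanted is None:
        return None
    return next((t for t in traces if t["trace_id"] == wanted), None)


traces = [
    {"trace_id": "trace-001", "session_id": "trading-session-abc",
     "agent_id": "signal-agent-1", "meta": {"agent_role": "signal_analysis"}},
    {"trace_id": "trace-002", "session_id": "trading-session-abc",
     "agent_id": "exec-agent-1",
     "meta": {"agent_role": "trade_execution", "triggered_by": "trace-001"}},
]

sessions = group_by_session(traces)
print([t["agent_id"] for t in sessions["trading-session-abc"]])
print(triggering_trace(traces[1], traces)["agent_id"])
```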

4.4.2 Cross-Agent Tripwire Queries

Stewards can query ReflectionDB to enforce cross-agent constraints:

-- Check if conflicting signals exist
SELECT COUNT(*) FROM traces
WHERE session_id = :current_session
  AND tool = 'analyze_signal'
  AND timestamp > NOW() - INTERVAL '5 minutes'
  AND outputs->>'recommendation' != :current_recommendation;

-- Check system-wide exposure
SELECT SUM(CAST(tool_calls->0->'args'->>'trade_value' AS NUMERIC))
FROM traces
WHERE tool = 'execute_trade'
  AND timestamp > NOW() - INTERVAL '1 hour'
  AND meta->>'blueprint_id' LIKE 'finance/trading_%';

4.4.3 Agent-to-Agent Obligations

Multi-agent systems MAY document obligations in blueprint annotations (non-normative):

annotations:
  multi_agent_obligations:
    SignalAgent:
      - "MUST publish signal provenance to ReflectionDB before handoff"
      - "MUST tag outputs with confidence score"
    ExecAgent:
      - "MUST verify signal exists in ReflectionDB within 10m"
      - "MUST stream execution metrics to RiskAgent steward"
    RiskAgent:
      - "MUST update cross-agent risk summary every 60s"
      - "MAY trigger autonomy downgrade for ExecAgent if thresholds breached"

These obligations are documentation only. Enforcement happens via:

  • Tripwires checking ReflectionDB state
  • Steward-to-steward communication (out of scope for blueprints)
  • Monitoring dashboards alerting on missing data

5. Regulatory Framework Integration

5.1 Non-Normative Annotations

Blueprints MAY include an annotations block for documentation purposes. These annotations do NOT affect enforcement logic.

Purpose:

  • Document regulatory context for auditors
  • Link to external compliance frameworks
  • Specify retention/jurisdiction requirements
  • Identify policy owners

Schema:

annotations:
  # Regulatory references (documentation only)
  regulatory_refs:
    - "EU AI Act Articles 9-15"
    - "MiFID II Directive 2014/65/EU"
    - "GDPR Regulation 2016/679"
    - "FDA 21 CFR Part 11"

  # Jurisdiction and scope
  jurisdiction: ["EU", "US", "UK"]
  geographic_scope: "EEA + US cross-border"

  # Organizational metadata
  policy_owner: "compliance@example.com"
  approval_board: "Enterprise_Governance_Committee"
  approved_by: ["user-uuid-1", "user-uuid-2"]
  approval_date: "2025-01-15"

  # Audit and retention
  retention_period: "7y"
  retention_jurisdiction: "EU"
  audit_frequency: "quarterly"
  external_auditor: "KPMG_EU"

  # Multi-agent metadata (if applicable)
  multi_agent: true
  coordination_model: "hierarchical"
  participant_roles: ["signal", "execution", "risk", "audit"]

5.2 Mapping Regulations to Enforcement

| Regulation | Enforcement Mechanism | Blueprint Location |
|---|---|---|
| Data minimization (GDPR Art. 5) | PII detection tripwire | tripwires (pattern-match) |
| Right to explanation (GDPR Art. 22) | Reasoning transparency metric | checks (CTQ metric) |
| Position limits (MiFID II Art. 57) | Numeric threshold tripwire | tripwires (condition) |
| Record-keeping (MiFID II Art. 25) | ReflectionDB retention | Architecture (ACGP-1002) |
| Transparency (EU AI Act Art. 13) | Source citation requirement | evidence block |
| Human oversight (EU AI Act Art. 14) | Escalation rules | checks -> escalate |
| Risk management (EU AI Act Art. 9) | CTQ scoring + thresholds | checks + scoring |

Example: GDPR Article 5 (Data Minimization)

tripwires:
  - id: pii_exposure_check
    when:
      hook: "output"
    condition:
      all:  # every pattern must be absent for the output to pass
        - "NOT matches_regex(content, '\\b\\d{3}-\\d{2}-\\d{4}\\b')"  # SSN
        - "NOT contains_entity(content, 'credit_card')"
        - "NOT contains_entity(content, 'bank_account')"
    eval_tier: 0
    latency_budget_ms: 50
    on_fail:
      decision: "block"
      reason: "PII detected in output (GDPR Art. 5 violation risk)"

annotations:
  regulatory_refs:
    - "GDPR Regulation 2016/679 Article 5(1)(c)"
  compliance_note: "Implements data minimization principle"
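
The SSN pattern in the tripwire can be exercised directly with Python's re module. This sketch covers only the regular-expression sub-condition; `contains_entity` is not reproduced because its behaviour is not described here. `passes_ssn_check` is a hypothetical helper name.

```python
import re

# Same pattern as the matches_regex sub-condition above.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def passes_ssn_check(content: str) -> bool:
    """Pass condition: no SSN-shaped token appears in the output."""
    return SSN_PATTERN.search(content) is None


print(passes_ssn_check("Your order has shipped."))
print(passes_ssn_check("Customer SSN: 123-45-6789"))
```

The word boundaries keep the pattern from firing on longer digit runs such as an order number like 2024-11-0042, although a production detector would normally need more than one regular expression.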

6. Governance Contract Alignment

6.1 When to Use Governance Contracts

Governance contracts (ACGP-1010) specify runtime parameters for blueprint application. Use them when:

  1. Different actions have different risk levels (low/elevated/critical)
  2. Latency budgets vary per request (fast queries vs. critical decisions)
  3. Fallback strategies differ (deny vs. escalate on timeout)
  4. Evaluation depth needs negotiation (Eval Tier 0-3)

6.2 Risk Level Selection

| Risk Level | Typical Latency Budget | Eval Tiers | Fallback | Use Cases |
|---|---|---|---|---|
| low_risk | 100ms | 0, 1 | deny | Read queries, UI updates, logs |
| elevated_risk | 300ms | 0, 1, (2 async) | allow_and_log | State mutations, notifications |
| critical_risk | 5000ms | 0, 1, 2, (3 if needed) | escalate | Financial transactions, data deletion |
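
An agent-side helper could turn the table into default contracts. The sketch below is illustrative only: `RISK_PROFILES` and `build_contract` are hypothetical, the optional tiers shown in parentheses in the table are left out of the defaults, and choosing the highest default tier as eval_tier is an assumption.

```python
# Typical values from the table; parenthesised (optional) tiers are omitted.
RISK_PROFILES = {
    "low_risk": {"latency_budget_ms": 100, "eval_tiers": [0, 1], "fallback": "deny"},
    "elevated_risk": {"latency_budget_ms": 300, "eval_tiers": [0, 1], "fallback": "allow_and_log"},
    "critical_risk": {"latency_budget_ms": 5000, "eval_tiers": [0, 1, 2], "fallback": "escalate"},
}


def build_contract(risk_level: str, **overrides) -> dict:
    """Start from the typical values for a risk level, then apply overrides."""
    profile = RISK_PROFILES[risk_level]
    contract = {
        "risk_level": risk_level,
        "eval_tier": max(profile["eval_tiers"]),
        "latency_budget_ms": profile["latency_budget_ms"],
        "fallback": profile["fallback"],
    }
    contract.update(overrides)
    return contract


print(build_contract("critical_risk"))
print(build_contract("critical_risk", eval_tier=3, latency_budget_ms=10000))
```

The second call shows how a request that needs human review could ask for Eval Tier 3 and a longer budget while keeping the escalate fallback.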

6.3 Complete Example: Critical Trade

Agent Request with Contract:

{
  "protocol": "acgp",
  "protocol_version": "1.1.0",
  "message_type": "TRACE",
  "message_id": "uuid-v7",
  "timestamp": "2025-12-24T10:30:00Z",
  "sender_id": "trading-agent-1",
  "receiver_id": "governance-steward-1",

  "payload": {
    "trace_id": "trace-001",
    "session_id": "trading-session-abc",
    "step": 1,
    "tool_calls": [
      {
        "name": "execute_trade",
        "args": {
          "symbol": "AAPL",
          "quantity": 1000,
          "trade_value": 175000,
          "order_type": "market"
        }
      }
    ],
    "source_refs": [
      "reg:sec:10K:AAPL:2024",
      "market:bloomberg:realtime:2025-12-24"
    ],
    "meta": {
      "model": "gpt-5-pro",
      "agent_acl_tier": "ACL-4"
    },

    "governance_contract": {
      "risk_level": "critical_risk",
      "eval_tier": 2,
      "latency_budget_ms": 5000,
      "fallback": "escalate",
      "justification": "Trade exceeds $100K threshold requiring deep analysis"
    }
  }
}

Steward Processing:

  1. Tripwire check (Eval Tier 0, <10ms):
       • trade_value <= 100000 -> FAIL -> Would halt
       • BUT contract specifies critical_risk, so escalate to Tier 1/2

  2. Evidence validation (Eval Tier 1, ~100ms):
       • Check source_refs against Registry
       • Verify SEC filing is certified
       • Verify Bloomberg feed is real-time and trusted

  3. CTQ calculation (Eval Tier 2, ~3000ms):
       • LLM-judge evaluates trading rationale
       • Pattern-match checks for manipulation signals
       • Source-match validates data freshness

  4. Decision:
       • CTQ = 0.88 -> Risk = 0.12
       • Threshold: ok: 0.15 -> OK (just within bounds)
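
The arithmetic behind the decision step can be sketched as a weighted average followed by a threshold comparison. The weights and metric names come from the regulated trading blueprint in Section 3; the individual metric scores are hypothetical values chosen so that the result rounds to the CTQ above, and risk = 1 - CTQ is inferred from the "CTQ = 0.88 -> Risk = 0.12" line rather than quoted from ACGP-1005.

```python
# Weights from the trading blueprint's checks.
WEIGHTS = {
    "risk_assessment_clarity": 0.4,
    "data_freshness": 0.3,
    "regulatory_compliance": 0.3,
}
# Hypothetical per-metric scores in [0, 1], for illustration only.
SCORES = {
    "risk_assessment_clarity": 0.89,
    "data_freshness": 0.90,
    "regulatory_compliance": 0.85,
}
OK_THRESHOLD = 0.15  # scoring.thresholds.ok


def weighted_ctq(scores: dict, weights: dict) -> float:
    """Weighted average of metric scores, normalised by the total weight."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight


def risk_from_ctq(ctq: float) -> float:
    """Assumption: risk is the complement of the CTQ score."""
    return 1.0 - ctq


ctq = weighted_ctq(SCORES, WEIGHTS)
risk = risk_from_ctq(ctq)
print(round(ctq, 2), round(risk, 2), risk <= OK_THRESHOLD)
```

Normalising by the total weight keeps the score in the 0 to 1 range even if a blueprint's weights do not sum exactly to 1.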

Response:

{
  "message_type": "INTERVENTION",
  "payload": {
    "trace_id": "trace-001",
    "decision": "ok",
    "flagged": false,
    "risk_score": 0.12,
    "ctq_score": 0.88,
    "reasons": [
      "Trade value above normal limits but within critical-risk acceptable range",
      "Evidence validation passed (2 certified sources)",
      "Rationale quality score: 0.89"
    ],
    "actions": [],
    "governance_contract_applied": {
      "risk_level": "critical_risk",
      "eval_tier": 2,
      "actual_latency_ms": 3245
    }
  }
}

6.4 SLA-Like Requirements

Map "SLA" concepts to ACGP constructs:

| SLA Requirement | ACGP Implementation | Notes |
|---|---|---|
| "Response time < 100ms" | latency_budget_ms: 100 | In governance contract |
| "99.9% uptime" | Profile selection + HA architecture | ACGP-1011 + deployment |
| "Human review within 4 hours" | HITL SLA (out of scope) | External to ACGP protocol |
| "Audit retention 7 years" | ReflectionDB config | Architecture layer |
| "Cost per decision < $0.01" | Profile selection (rule-only) | ACGP-1011 .5 |

7. Complete Examples

7.1 Example 1: Regulated Healthcare Agent

id: healthcare/clinical_decision_support@1.0
version: "1.0.0"
description: "Clinical decision support agent with FDA/EMA compliance"
inherits: clarity.baseline@1.0

annotations:
  regulatory_refs:
    - "FDA 21 CFR Part 11"
    - "EU MDR 2017/745"
    - "HIPAA Privacy Rule"
  policy_owner: "clinical.governance@hospital.com"
  retention_period: "10y"
  jurisdiction: ["US", "EU"]

scope:
  agent_tier: [ACL-4, ACL-5]
  tools: ["diagnose", "prescribe", "access_patient_records"]

tripwires:
  - id: contraindication_check
    when:
      hook: "tool_call"
      tool: "prescribe"
    condition: "NOT query_external('contraindication_detected', args.medication, patient.allergies)"
    eval_tier: 1
    requires_state: true
    on_fail:
      decision: "halt"
      reason: "Contraindication detected (patient safety)"

  - id: off_label_use
    when:
      hook: "tool_call"
      tool: "prescribe"
    condition: "query_external('fda_approved', args.medication, args.indication)"
    eval_tier: 1
    on_fail:
      decision: "escalate"
      reason: "Off-label use requires physician approval"

evidence:
  min_certified_sources: 2
  source_categories: ["clinical_guidelines", "peer_reviewed"]
  min_trust_score: 0.9

checks:
  - id: clinical_reasoning_quality
    when:
      hook: "output"
    metric:
      name: "diagnostic_soundness"
      weight: 0.5
      check:
        type: "llm-judge"
        args:
          model: "gpt-4"
          prompt_template: "evaluate_clinical_reasoning"

  - id: evidence_currency
    when:
      hook: "output"
    metric:
      name: "guideline_recency"
      weight: 0.3
      check:
        type: "source-match"
        args:
          max_age_years: 5

  - id: bias_check
    when:
      hook: "output"
    metric:
      name: "demographic_bias"
      weight: 0.2
      check:
        type: "llm-judge"
        args:
          prompt_template: "detect_demographic_bias"

scoring:
  thresholds:
    ok: 0.10
    nudge: 0.20
    escalate: 0.35
    block: 0.50

trust_debt:
  enabled: true
  accumulation:
    flag: 0.10
    block: 0.30
    halt: 0.60
  thresholds:
    re_tiering_review: 0.70

Governance Contract for Critical Decision:

{
  "risk_level": "critical_risk",
  "eval_tier": 3,
  "latency_budget_ms": 10000,
  "fallback": "escalate",
  "justification": "Life-critical prescription decision"
}

7.2 Example 2: Multi-Agent Trading System

System Blueprint:

id: finance/multi_agent_trading_system@1.0
version: "1.0.0"
description: "System-wide governance for coordinated trading agents"
inherits: clarity.baseline@1.0

annotations:
  regulatory_refs:
    - "MiFID II Directive 2014/65/EU"
    - "MAR Regulation 596/2014"
  multi_agent: true
  coordination_model: "hierarchical"
  participant_roles: ["signal_analysis", "trade_execution", "risk_monitoring", "audit"]

tripwires:
  - id: market_wide_halt
    when:
      hook: "tool_call"
    condition: "NOT query_external('market_halted')"
    eval_tier: 0
    on_fail:
      decision: "halt"
      reason: "Market circuit breaker active"

  - id: system_exposure_limit
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "query_reflectiondb('sum(trade_value) < 10000000 from last 1h where blueprint_id like finance/multi_agent_%')"
    eval_tier: 1
    requires_state: true
    on_fail:
      decision: "halt"
      reason: "System-wide hourly exposure limit"

evidence:
  min_certified_sources: 1
  source_categories: ["regulatory"]
  min_trust_score: 0.8

scoring:
  thresholds:
    ok: 0.20
    nudge: 0.35
    escalate: 0.50
    block: 0.65

Signal Agent Blueprint:

id: finance/signal_agent@1.0
version: "1.0.0"
description: "Signal analysis agent"
inherits: finance/multi_agent_trading_system@1.0

scope:
  agent_tier: [ACL-2]
  tools: ["analyze_signal"]

checks:
  - id: signal_confidence
    when:
      hook: "output"
    metric:
      name: "analysis_quality"
      weight: 0.7
      check:
        type: "llm-judge"

  - id: source_diversity
    when:
      hook: "output"
    metric:
      name: "data_source_count"
      weight: 0.3
      check:
        type: "source-match"
        args:
          min_source_count: 3

scoring:
  thresholds:
    ok: 0.15
    nudge: 0.25
    escalate: 0.40
    block: 0.60

Execution Agent Blueprint:

id: finance/execution_agent@1.0
version: "1.0.0"
description: "Trade execution agent"
inherits: finance/multi_agent_trading_system@1.0

scope:
  agent_tier: [ACL-4]
  tools: ["execute_trade"]

tripwires:
  - id: requires_signal
    when:
      hook: "tool_call"
      tool: "execute_trade"
    condition: "query_reflectiondb('exists signal from SignalAgent in last 10m for session_id')"
    eval_tier: 1
    requires_state: true
    on_fail:
      decision: "block"
      reason: "No valid signal within 10 minutes"

checks:
  - id: execution_timing
    when:
      hook: "tool_call"
    metric:
      name: "market_timing"
      weight: 0.6
      check:
        type: "llm-judge"

  - id: slippage_risk
    when:
      hook: "tool_call"
    metric:
      name: "execution_quality"
      weight: 0.4
      check:
        type: "rule-based"

8. Best Practices

8.1 Blueprint Design

  1. Start with clarity.baseline: Always inherit from the universal baseline
  2. Layer policies hierarchically: System -> Domain -> Role
  3. Use tripwires for safety: Hard limits belong in tripwires, not checks
  4. Conservative thresholds: Start strict, relax based on calibration
  5. Document intent: Use annotations for context and regulatory links

8.2 Multi-Agent Coordination

  1. Shared session_id: Link related agent activities
  2. ReflectionDB queries: Use the database for cross-agent state, not agent-to-agent messages
  3. Avoid workflow logic in blueprints: Coordination is steward-side, not policy-side
  4. Clear role boundaries: Each agent has distinct tools and responsibilities

8.3 Evidence and Sources

  1. Specify minimum sources: Use evidence.min_certified_sources
  2. Category requirements: Require appropriate source types (regulatory, peer-reviewed, etc.)
  3. Trust score thresholds: Set appropriate minimums for domain risk level
  4. Keep registry updated: Source certification is continuous, not one-time

8.4 Performance Optimization

  1. Eval Tier 0 for critical checks: Fast, in-memory, no dependencies
  2. Async Eval Tier 2: Use for quality checks that don't block action
  3. Profile selection: Choose appropriate profile for cost/latency/quality trade-off
  4. Cache policies: Governance contracts support cached decisions

8.5 Testing and Validation

  1. Conformance suite: Run ACGP conformance tests before production
  2. Calibration datasets: Create ground-truth examples for scorer tuning
  3. Staged rollout: Use canary deployments for blueprint updates
  4. Monitor metrics: Track intervention rates, latency, trust debt

9. References

Normative References

  • ACGP-1000: Core Protocol Specification
  • ACGP-1001: Terminology and Definitions
  • ACGP-1002: Architecture Specification
  • ACGP-1003: Message Formats & Wire Protocol
  • ACGP-1004: Reflection Blueprint Specification
  • ACGP-1005: ARS-CTQ-ACL Integration Framework
  • ACGP-1006: Certified Source Registry Specification
  • ACGP-1009: Conformance Levels
  • ACGP-1010: Governance Contracts
  • ACGP-1011: Implementation Profiles

Informative References

  • RFC 2119: Key words for use in RFCs
  • MiFID II: Directive 2014/65/EU
  • GDPR: Regulation 2016/679
  • EU AI Act: Regulation 2024/1689

End of ACGP-1012