ACGP-1012: Blueprint Authoring Guide
Status: Draft
Last Updated: 2026-01-08
Spec ID: ACGP-1012
Normative Keywords: MUST, SHOULD, MAY (per RFC 2119)
Abstract
This document provides practical guidance for authoring ACGP Reflection Blueprints in regulated and multi-agent environments. It maps common governance requirements to ACGP constructs, demonstrates recommended patterns for single-agent and multi-agent systems, and clarifies the separation between policy (blueprints), runtime negotiation (governance contracts), and infrastructure (stewards, registry, ReflectionDB).
Table of Contents
- Introduction
- Mapping Common Requirements to ACGP
- Single-Agent Patterns (Regulated Baseline)
- Multi-Agent Patterns
- Regulatory Framework Integration
- Governance Contract Alignment
- Complete Examples
- Best Practices
- References
1. Introduction
1.1 Audience
This guide is intended for:
- Governance architects designing policy frameworks
- Compliance officers translating regulations into enforceable rules
- System architects deploying ACGP in production
- Multi-agent system designers coordinating autonomous agents
1.2 Key Principles
Separation of Concerns:
- Blueprints (ACGP-1004): Define WHAT to enforce (checks, tripwires, thresholds)
- Governance Contracts (ACGP-1010): Define HOW to apply per request (latency, eval tiers, fallbacks)
- Registry (ACGP-1006): Define WHERE knowledge comes from (certified sources)
- ReflectionDB (ACGP-1002): Records EVERYTHING for audit
Mental Model:
Blueprint = Policy Document (rules, metrics, thresholds)
Governance Contract = Runtime SLA (risk level, latency budget, fallback)
Registry = Trusted Source Catalog
ReflectionDB = Immutable Audit Trail
1.3 Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119.
2. Mapping Common Requirements to ACGP
2.1 Blueprint Component Mapping Table
This table maps common governance concepts to their ACGP implementation:
| Governance Concept | ACGP Construct | Specification | Notes |
|---|---|---|---|
| Metadata & Identity | id, version, description, scope | ACGP-1004 | Blueprint identification and versioning |
| Regulatory Framework | annotations.regulatory_refs (optional) | ACGP-1012 | Non-normative documentation; enforcement via checks/tripwires |
| Business Policies | checks (rules + metrics) | ACGP-1004 | Declarative policy rules and quality metrics |
| Hard Safety Limits | tripwires | ACGP-1004 | Pre-CTQ checks that can halt; Eval Tier 0/1 |
| Decision Criteria | scoring.thresholds | ACGP-1004 | Risk Score boundaries (0.0-1.0) |
| Quantitative Thresholds | Tripwire condition expressions | ACGP-1004 .5 | DSL for numeric/pattern checks |
| Human Oversight Triggers | Rules/tripwires yielding escalate | ACGP-1004 .2, ACGP-1010 .2.4 | Combined with Eval Tier 3 in governance contract |
| Quality Metrics | CTQ metrics in checks | ACGP-1004 .3 | Weighted scoring contributing to final CTQ |
| Evidence Requirements | evidence block | ACGP-1004 | Min sources, categories, trust scores |
| Tool & Source Registry | Certified Source Registry | ACGP-1006 | Authoritative source catalog |
| Audit Requirements | ReflectionDB retention | ACGP-1002 .2.5, ACGP-1003 | Immutable trace logging |
| SLA Targets | Governance contracts | ACGP-1010 | Latency budgets, fallback strategies |
| Response Time Limits | latency_budget_ms | ACGP-1010 | Per-request or per-tripwire |
| Cost Constraints | Profile selection | ACGP-1011 | Choose eval tiers based on budget |
| Cross-Agent Policies | ReflectionDB analytics + tripwires | ACGP-1012 | Query recent traces for coordination |
2.2 Common Scenarios
Scenario: "Never allow X without human approval"
Implementation:
```yaml
tripwires:
  - id: require_human_approval_for_x
    when:
      hook: "tool_call"
      tool: "dangerous_operation"
    condition: "false"  # Always trigger
    eval_tier: 0
    on_fail:
      decision: "escalate"
      reason: "Human approval required for dangerous_operation"
```
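Combine this tripwire with a governance contract that requests Eval Tier 3, as noted for Human Oversight Triggers in Section 2.1; Section 3.3 shows the contract format.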
Scenario: "Validate claims against certified sources"
Implementation:
evidence:
min_certified_sources: 2
source_categories: ["regulatory", "peer_reviewed"]
min_trust_score: 0.8
checks:
- id: knowledge_grounding
when:
hook: "output"
metric:
name: "source_validation"
weight: 0.3
check:
type: "source-match"
args:
required_sources: ["regulatory"]
min_citation_ratio: 0.9
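The evidence block above can be read as a filter over the sources cited in a trace. The following is a minimal sketch of that reading, not the normative algorithm: the `SourceRef` shape, the `certified` flag, and the rule that a source counts only when it meets all three criteria are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    """A cited source as resolved by the registry (hypothetical shape)."""
    ref: str
    category: str
    trust_score: float
    certified: bool


def evidence_satisfied(sources, policy):
    """Return True when enough qualifying sources are cited.

    Assumption: a source qualifies only if it is certified, its category is
    listed in source_categories, and its trust score meets min_trust_score.
    """
    qualifying = [
        s for s in sources
        if s.certified
        and s.category in policy["source_categories"]
        and s.trust_score >= policy["min_trust_score"]
    ]
    return len(qualifying) >= policy["min_certified_sources"]


# Values copied from the evidence block above
policy = {
    "min_certified_sources": 2,
    "source_categories": ["regulatory", "peer_reviewed"],
    "min_trust_score": 0.8,
}

# Hypothetical source references
sources = [
    SourceRef("reg:example:1", "regulatory", 0.95, True),
    SourceRef("journal:example:2", "peer_reviewed", 0.85, True),
    SourceRef("blog:example:3", "news", 0.99, True),  # wrong category
]
```

With these inputs, `evidence_satisfied(sources, policy)` holds because two sources qualify; dropping either of the first two makes it fail.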
Scenario: "Block if transaction exceeds $10K"
Implementation:
```yaml
tripwires:
  - id: transaction_limit
    when:
      hook: "tool_call"
      tool: "execute_transaction"
    condition: "args.amount <= 10000"
    eval_tier: 0
    latency_budget_ms: 10
    on_fail:
      decision: "halt"
      reason: "Transaction exceeds $10,000 limit"
```
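Note that `condition` states what must hold for the action to pass; `on_fail` applies when it does not. A minimal sketch of that evaluation order follows, assuming the condition has already been compiled into a Python callable (the real DSL is defined in ACGP-1004, and `evaluate_tripwire` is a hypothetical helper):

```python
def evaluate_tripwire(tripwire, args):
    """Return None when the tripwire passes, otherwise its on_fail action."""
    passed = tripwire["condition"](args)
    return None if passed else tripwire["on_fail"]


# The transaction_limit tripwire from above, with the condition
# "args.amount <= 10000" expressed as a callable (an assumption).
transaction_limit = {
    "id": "transaction_limit",
    "condition": lambda args: args["amount"] <= 10_000,
    "on_fail": {
        "decision": "halt",
        "reason": "Transaction exceeds $10,000 limit",
    },
}

within_limit = evaluate_tripwire(transaction_limit, {"amount": 9_500})
over_limit = evaluate_tripwire(transaction_limit, {"amount": 10_001})
```

Here `within_limit` is `None` (no intervention) and `over_limit` carries the `halt` decision. The same reading explains the earlier `condition: "false"` pattern: a condition that never holds always yields its `on_fail` action.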
Scenario: "Aggregate daily limits"
Implementation:
```yaml
tripwires:
  - id: daily_transaction_count
    when:
      hook: "tool_call"
      tool: "execute_transaction"
    condition: "storage.get('tx_count_today') < 50"
    eval_tier: 1
    requires_state: true
    latency_budget_ms: 100
    on_fail:
      decision: "block"
      reason: "Daily transaction limit (50) exceeded"
```
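A stateful tripwire needs a counter that something maintains between requests. The sketch below uses an in-memory stand-in for the steward's state store; the `DailyCounter` class, and the choice to increment the counter only after the tripwire passes, are assumptions for illustration rather than behavior defined by the specification.

```python
import datetime as dt

DAILY_LIMIT = 50


class DailyCounter:
    """In-memory stand-in for the steward's state store (hypothetical)."""

    def __init__(self):
        self._counts = {}

    def get(self, day):
        return self._counts.get(day, 0)

    def increment(self, day):
        self._counts[day] = self.get(day) + 1


def check_daily_transaction_count(store, today):
    """Mirror the condition storage.get('tx_count_today') < 50."""
    return store.get(today) < DAILY_LIMIT


def record_transaction(store, today):
    """Run the tripwire, then count the transaction only if it passed."""
    if not check_daily_transaction_count(store, today):
        return "block"
    store.increment(today)
    return "ok"


store = DailyCounter()
day = dt.date(2026, 1, 8)
results = [record_transaction(store, day) for _ in range(51)]
```

The first 50 calls return `"ok"` and the 51st returns `"block"`; a new date starts from zero because the counter is keyed by day.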
3. Single-Agent Patterns (Regulated Baseline)
3.1 Architecture Overview
Regulated environments (finance, healthcare, legal) require:
- Evidence-based reasoning (certified sources)
- Pre-action safety checks (tripwires)
- Quality scoring (CTQ metrics)
- Human oversight for critical decisions
- Complete audit trails
Data Flow:
sequenceDiagram
participant PA as OperatingAgent
participant GS as GovernanceSteward
participant PE as PolicyEngine
participant CSR as CertifiedSourceRegistry
participant RDB as ReflectionDB
participant HITL as HumanReviewer
PA->>GS: TRACE(with source_refs, tool_calls)
GS->>PE: Evaluate(blueprint + contract)
Note over PE: Phase 1: Tripwires (Eval 0/1)
PE->>PE: Check hard limits
alt Tripwire Triggered
PE-->>GS: INTERVENTION(halt)
GS->>RDB: Log(TRACE, EVAL, INTERVENTION)
GS-->>PA: HALT
else Tripwires Pass
Note over PE: Phase 2: Evidence Validation
PE->>CSR: Validate(source_refs)
CSR-->>PE: TrustScores
Note over PE: Phase 3: CTQ Calculation
PE->>PE: Calculate weighted CTQ
PE->>PE: Apply thresholds
PE-->>GS: EVAL + INTERVENTION
alt Intervention = ESCALATE
GS->>HITL: Request Review
HITL-->>GS: Human Decision
GS->>RDB: Log(complete flow)
GS-->>PA: Forward Decision
else Intervention = OK/NUDGE/FLAG/BLOCK
GS->>RDB: Log(TRACE, EVAL, INTERVENTION)
GS-->>PA: INTERVENTION
end
end
3.2 Recommended Blueprint Structure
id: finance/trading_regulated@1.0
version: "1.0.0"
description: "Regulated trading agent blueprint with evidence + tripwires"
inherits: clarity.baseline@1.0
# Optional: Document regulatory context (non-normative)
annotations:
regulatory_refs:
- "MiFID II Directive 2014/65/EU"
- "MAR Regulation 596/2014"
- "ESMA Algo-Trading Guidelines"
policy_owner: "compliance@example.com"
retention_period: "7y"
jurisdiction: ["EU", "US"]
scope:
agent_tier: [ACL-3, ACL-4]
tools: ["execute_trade", "market_data_api"]
# Hard safety limits (Eval Tier 0/1, run before CTQ)
tripwires:
# Tier 0: In-memory check
- id: single_trade_limit
when:
hook: "tool_call"
tool: "execute_trade"
condition: "args.trade_value <= 100000"
eval_tier: 0
latency_budget_ms: 10
severity: critical
on_fail:
decision: "halt"
reason: "Trade exceeds $100K hard limit"
# Tier 1: Stateful check
- id: daily_trade_volume
when:
hook: "tool_call"
tool: "execute_trade"
condition: "storage.get('daily_volume') + args.trade_value <= 5000000"
eval_tier: 1
requires_state: true
latency_budget_ms: 100
severity: critical
on_fail:
decision: "block"
reason: "Daily trade volume limit exceeded"
# Evidence requirements
evidence:
min_certified_sources: 2
source_categories: ["regulatory", "market_data"]
min_trust_score: 0.85
# Quality metrics (contribute to CTQ)
checks:
- id: trade_rationale_quality
when:
hook: "tool_call"
tool: "execute_trade"
metric:
name: "risk_assessment_clarity"
weight: 0.4
check:
type: "llm-judge"
args:
model: "gpt-4"
prompt_template: "evaluate_trading_rationale"
threshold_warn: 0.6
- id: market_data_recency
when:
hook: "tool_call"
tool: "execute_trade"
metric:
name: "data_freshness"
weight: 0.3
check:
type: "source-match"
args:
required_sources: ["market_data"]
max_age_minutes: 5
- id: compliance_alignment
when:
hook: "tool_call"
tool: "execute_trade"
metric:
name: "regulatory_compliance"
weight: 0.3
check:
type: "rule-based"
args:
rules: ["no_insider_trading", "position_limits"]
# Conservative thresholds for regulated domain
scoring:
thresholds:
ok: 0.15 # Risk <= 15% -> OK
nudge: 0.25 # Risk <= 25% -> NUDGE
escalate: 0.40 # Risk <= 40% -> ESCALATE
block: 0.60 # Risk > 60% -> BLOCK
# Trust debt configuration
trust_debt:
enabled: true
accumulation:
flag: 0.05
nudge: 0.02
block: 0.20
halt: 0.50
decay:
rate: 0.95
period_hours: 24
thresholds:
elevated_monitoring: 0.30
restricted_mode: 0.50
re_tiering_review: 0.75
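To make the trust debt settings above concrete, here is a minimal sketch of how the numbers could interact. It assumes that each intervention adds its configured amount, that the debt is multiplied by `rate` once per `period_hours`, and that decisions without an accumulation entry add nothing; the `TrustDebt` class and the `"normal"` label are hypothetical.

```python
# Values copied from the trust_debt block above
ACCUMULATION = {"flag": 0.05, "nudge": 0.02, "block": 0.20, "halt": 0.50}
DECAY_RATE = 0.95  # applied once per 24-hour period (assumed semantics)
THRESHOLDS = [
    (0.75, "re_tiering_review"),
    (0.50, "restricted_mode"),
    (0.30, "elevated_monitoring"),
]


class TrustDebt:
    """Running trust debt for one agent (illustrative only)."""

    def __init__(self):
        self.value = 0.0

    def record(self, decision):
        # Decisions without an entry (for example "ok") add no debt
        self.value += ACCUMULATION.get(decision, 0.0)

    def decay(self, periods=1):
        self.value *= DECAY_RATE ** periods

    def level(self):
        # Thresholds are checked from highest to lowest
        for threshold, name in THRESHOLDS:
            if self.value >= threshold:
                return name
        return "normal"


debt = TrustDebt()
debt.record("block")
debt.record("nudge")
level_after_minor = debt.level()   # 0.22, below every threshold
debt.record("halt")
level_after_halt = debt.level()    # 0.72, inside restricted_mode
debt.decay(periods=30)
level_after_month = debt.level()   # decayed well below 0.30
```

Under these assumptions a single `halt` pushes the agent into `restricted_mode`, and roughly a month of clean operation is needed before the debt falls back below `elevated_monitoring`.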
3.3 Governance Contract Example
For critical trades, the agent requests higher evaluation depth:
```json
{
  "governance_contract": {
    "risk_level": "critical_risk",
    "eval_tier": 2,
    "latency_budget_ms": 5000,
    "fallback": "escalate"
  }
}
```
This tells the steward:
- Use deep LLM analysis (Eval Tier 2)
- Budget 5 seconds for evaluation
- If evaluation times out, escalate to a human reviewer rather than deny
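The timeout behavior in the list above can be sketched as a small decision function. This is an illustration under assumptions: `resolve_decision` is a hypothetical helper, and the elapsed time is passed in rather than measured so the logic stays easy to follow.

```python
def resolve_decision(contract, evaluated_decision, elapsed_ms):
    """Return the evaluated decision, or the contract fallback on timeout."""
    if elapsed_ms > contract["latency_budget_ms"]:
        return {"decision": contract["fallback"], "timed_out": True}
    return {"decision": evaluated_decision, "timed_out": False}


# The governance contract from this section
contract = {
    "risk_level": "critical_risk",
    "eval_tier": 2,
    "latency_budget_ms": 5000,
    "fallback": "escalate",
}

on_time = resolve_decision(contract, "ok", elapsed_ms=3245)
late = resolve_decision(contract, "ok", elapsed_ms=6000)
```

`on_time` keeps the evaluated `ok`, while `late` falls back to `escalate` because the 5000 ms budget was exceeded.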
4. Multi-Agent Patterns
4.1 Design Philosophy
Multi-agent governance does NOT turn blueprints into workflows. Instead:
- Agents remain autonomous: Each has its own governance steward
- Coordination via shared state: ReflectionDB provides cross-agent visibility
- Cross-agent policies: Enforced by steward-side analytics, not blueprint orchestration
4.2 Pattern A: Shared Blueprint, Separate Agents
Use Case: Multiple agents performing similar tasks in the same domain (e.g., trading desk with signal analysts, execution agents, risk monitors).
Architecture:
graph TB
subgraph Agents
A1[SignalAgent<br/>ACL-2]
A2[ExecutionAgent<br/>ACL-4]
A3[RiskAgent<br/>ACL-3]
end
subgraph Governance
BP[Shared Blueprint:<br/>trading_regulated@1.0]
GS1[Steward-1]
GS2[Steward-2]
GS3[Steward-3]
end
subgraph Infrastructure
RDB[(ReflectionDB)]
CSR[(Registry)]
end
A1 --> GS1
A2 --> GS2
A3 --> GS3
GS1 -.inherits.-> BP
GS2 -.inherits.-> BP
GS3 -.inherits.-> BP
GS1 --> RDB
GS2 --> RDB
GS3 --> RDB
GS1 --> CSR
GS2 --> CSR
GS3 --> CSR
RDB -.queries.-> GS1
RDB -.queries.-> GS2
RDB -.queries.-> GS3
Blueprint Structure:
id: finance/trading_team_shared@1.0
version: "1.0.0"
description: "Shared governance for coordinated trading agents"
inherits: clarity.baseline@1.0
annotations:
multi_agent: true
coordination_model: "shared_blueprint"
participant_roles: ["signal_analysis", "trade_execution", "risk_monitoring"]
scope:
agent_tier: [ACL-2, ACL-3, ACL-4]
tools: ["execute_trade", "analyze_signal", "calculate_risk"]
# Shared tripwires apply to all agents
tripwires:
- id: system_wide_exposure_limit
when:
hook: "tool_call"
tool: "execute_trade"
condition: "query_reflectiondb('sum(trade_value) from last 1h') < 10000000"
eval_tier: 1
requires_state: true
on_fail:
decision: "halt"
reason: "System-wide hourly exposure limit exceeded"
- id: conflicting_signals
when:
hook: "tool_call"
tool: "execute_trade"
condition: "NOT query_reflectiondb('exists opposite_signal from SignalAgent in last 5m')"
eval_tier: 1
requires_state: true
on_fail:
decision: "block"
reason: "Conflicting signal detected from signal analyst"
# Role-specific checks via scope filtering
checks:
- id: signal_quality
when:
hook: "output"
agent_role: "signal_analysis"
metric:
name: "signal_confidence"
weight: 0.5
check:
type: "llm-judge"
- id: execution_efficiency
when:
hook: "tool_call"
tool: "execute_trade"
agent_role: "trade_execution"
metric:
name: "slippage_prediction"
weight: 0.4
check:
type: "pattern-match"
Cross-Agent Coordination Logic:
The steward implements cross-agent queries against ReflectionDB:
```python
# Pseudo-code for steward-side cross-agent tripwire evaluation.
# `reflectiondb` is assumed to expose a query() method that returns one row.
EXPOSURE_LIMIT = 10_000_000


def evaluate_system_wide_exposure(trace, reflectiondb):
    """Cross-agent tripwire: check total exposure across all agents.

    Returns True when the tripwire passes (exposure is under the limit).
    """
    # Sum recent trades from any agent governed by the shared blueprint.
    # COALESCE guards against SUM returning NULL when no trades match.
    row = reflectiondb.query("""
        SELECT COALESCE(SUM(trade_value), 0) AS total_exposure
        FROM traces
        WHERE tool = 'execute_trade'
          AND timestamp > NOW() - INTERVAL '1 hour'
          AND session_id IN (
              SELECT DISTINCT session_id
              FROM agent_metadata
              WHERE blueprint_id = 'finance/trading_team_shared@1.0'
          )
    """)
    return row.total_exposure < EXPOSURE_LIMIT
```
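To experiment with the same cross-agent check without a running ReflectionDB, the sketch below uses Python's built-in `sqlite3` module as a stand-in. The `traces` table layout, the epoch-seconds `ts` column, and the agent IDs are assumptions for illustration, not the ReflectionDB schema.

```python
import sqlite3
import time

EXPOSURE_LIMIT = 10_000_000


def total_exposure_last_hour(conn, now):
    """Sum trade_value over execute_trade traces from the past hour."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(trade_value), 0)
        FROM traces
        WHERE tool = 'execute_trade' AND ts > ?
        """,
        (now - 3600,),
    ).fetchone()
    return row[0]


def exposure_tripwire_passes(conn, now):
    """Pass while the system-wide hourly exposure is under the limit."""
    return total_exposure_last_hour(conn, now) < EXPOSURE_LIMIT


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE traces (agent_id TEXT, tool TEXT, trade_value REAL, ts REAL)"
)
now = time.time()
conn.executemany(
    "INSERT INTO traces VALUES (?, ?, ?, ?)",
    [
        ("exec-agent-1", "execute_trade", 6_000_000, now - 600),
        ("exec-agent-2", "execute_trade", 3_000_000, now - 1200),
        ("exec-agent-1", "execute_trade", 5_000_000, now - 7200),  # outside window
        ("signal-agent-1", "analyze_signal", None, now - 60),      # not a trade
    ],
)
```

Only the two trades inside the one-hour window count, so the exposure is 9,000,000 and the tripwire passes; one more 2,000,000 trade from any agent would trip it.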
4.3 Pattern B: System Blueprint + Role Blueprints
Use Case: Hierarchical governance where system-wide policies apply to all agents, but each role has specific requirements.
Architecture:
graph TB
subgraph Blueprints
SYS[System Blueprint:<br/>trading_system@1.0]
BP1[SignalAgent Blueprint<br/>inherits system]
BP2[ExecAgent Blueprint<br/>inherits system]
BP3[RiskAgent Blueprint<br/>inherits system]
end
subgraph Agents
A1[SignalAgent]
A2[ExecAgent]
A3[RiskAgent]
end
SYS -.inherited by.-> BP1
SYS -.inherited by.-> BP2
SYS -.inherited by.-> BP3
BP1 -.applied to.-> A1
BP2 -.applied to.-> A2
BP3 -.applied to.-> A3
System Blueprint:
id: finance/trading_system@1.0
version: "1.0.0"
description: "System-wide governance for trading ecosystem"
inherits: clarity.baseline@1.0
annotations:
multi_agent: true
coordination_model: "hierarchical"
# System-wide tripwires (inherited by all roles)
tripwires:
- id: market_circuit_breaker
when:
hook: "tool_call"
condition: "NOT query_external('market_halted')"
eval_tier: 0
on_fail:
decision: "halt"
reason: "Market-wide circuit breaker active"
- id: regulatory_freeze
when:
hook: "tool_call"
condition: "NOT query_external('trading_suspended')"
eval_tier: 0
on_fail:
decision: "halt"
reason: "Regulatory trading suspension"
# Shared evidence requirements
evidence:
min_certified_sources: 1
source_categories: ["regulatory"]
min_trust_score: 0.8
scoring:
thresholds:
ok: 0.20
nudge: 0.35
escalate: 0.50
block: 0.65
Role-Specific Blueprints:
# Signal Analyst Blueprint
id: finance/trading_signal_analyst@1.0
version: "1.0.0"
description: "Signal analysis agent governance"
inherits: finance/trading_system@1.0
scope:
agent_tier: [ACL-2]
tools: ["analyze_signal", "market_data_api"]
checks:
- id: signal_confidence
when:
hook: "output"
metric:
name: "signal_quality"
weight: 0.6
check:
type: "llm-judge"
args:
prompt_template: "evaluate_signal_confidence"
- id: data_source_diversity
when:
hook: "output"
metric:
name: "source_diversity"
weight: 0.4
check:
type: "source-match"
args:
min_source_count: 3
required_categories: ["market_data", "news"]
scoring:
thresholds:
ok: 0.15 # Stricter than system baseline
nudge: 0.25
escalate: 0.40
block: 0.60
# Execution Agent Blueprint
id: finance/trading_exec_agent@1.0
version: "1.0.0"
description: "Trade execution agent governance"
inherits: finance/trading_system@1.0
scope:
agent_tier: [ACL-4]
tools: ["execute_trade"]
tripwires:
# Role-specific tripwire (adds to inherited system tripwires)
- id: execution_requires_signal
when:
hook: "tool_call"
tool: "execute_trade"
condition: "query_reflectiondb('exists signal from SignalAgent in last 10m for session_id')"
eval_tier: 1
requires_state: true
on_fail:
decision: "block"
reason: "No valid signal from analyst within 10 minutes"
checks:
- id: execution_timing
when:
hook: "tool_call"
tool: "execute_trade"
metric:
name: "market_timing_quality"
weight: 0.5
check:
type: "llm-judge"
- id: slippage_prediction
when:
hook: "tool_call"
tool: "execute_trade"
metric:
name: "slippage_risk"
weight: 0.5
check:
type: "rule-based"
args:
rules: ["check_liquidity", "check_volatility"]
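The role blueprints above rely on inheritance: role tripwires add to the system tripwires, and role thresholds replace the system baseline. The sketch below illustrates one plausible merge under those two assumptions; the authoritative inheritance rules are in ACGP-1004, and `merge_blueprints` is a hypothetical helper.

```python
def merge_blueprints(parent, child):
    """Merge a child blueprint over its parent (assumed semantics)."""
    merged = dict(parent)
    for key, value in child.items():
        if key in ("tripwires", "checks"):
            # Lists are additive: inherited entries first, then the child's
            merged[key] = list(parent.get(key, [])) + list(value)
        elif key == "scoring":
            # Child thresholds override the parent's, key by key
            merged[key] = {
                "thresholds": {
                    **parent.get("scoring", {}).get("thresholds", {}),
                    **value.get("thresholds", {}),
                }
            }
        else:
            merged[key] = value
    return merged


# Abbreviated versions of the blueprints in this section
system = {
    "id": "finance/trading_system@1.0",
    "tripwires": [{"id": "market_circuit_breaker"}, {"id": "regulatory_freeze"}],
    "scoring": {
        "thresholds": {"ok": 0.20, "nudge": 0.35, "escalate": 0.50, "block": 0.65}
    },
}
exec_agent = {
    "id": "finance/trading_exec_agent@1.0",
    "inherits": "finance/trading_system@1.0",
    "tripwires": [{"id": "execution_requires_signal"}],
}
signal_agent = {
    "id": "finance/trading_signal_analyst@1.0",
    "inherits": "finance/trading_system@1.0",
    "scoring": {
        "thresholds": {"ok": 0.15, "nudge": 0.25, "escalate": 0.40, "block": 0.60}
    },
}

effective_exec = merge_blueprints(system, exec_agent)
effective_signal = merge_blueprints(system, signal_agent)
```

Under this reading the execution agent ends up with three tripwires and the system thresholds, while the signal analyst keeps the two system tripwires but applies its stricter thresholds.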
4.4 Coordination Mechanisms
4.4.1 Session/Trace Linking
Agents in a multi-agent system SHOULD use a shared session_id to correlate their activities:
```json
// SignalAgent trace
{
  "trace_id": "trace-001",
  "session_id": "trading-session-abc",
  "agent_id": "signal-agent-1",
  "meta": {
    "agent_role": "signal_analysis"
  }
}

// ExecAgent trace (same session)
{
  "trace_id": "trace-002",
  "session_id": "trading-session-abc",
  "agent_id": "exec-agent-1",
  "meta": {
    "agent_role": "trade_execution",
    "triggered_by": "trace-001"  // Optional: explicit dependency
  }
}
```
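Once traces share a session_id and carry an optional triggered_by link, a steward or audit tool can reconstruct who acted on whose output. The sketch below shows one way to do that over plain dictionaries; the helper names are hypothetical and the traces are the two examples from this section.

```python
from collections import defaultdict


def group_by_session(traces):
    """Bucket traces by their shared session_id."""
    sessions = defaultdict(list)
    for trace in traces:
        sessions[trace["session_id"]].append(trace)
    return dict(sessions)


def causal_chain(traces, trace_id):
    """Follow meta.triggered_by links back to the originating trace."""
    by_id = {t["trace_id"]: t for t in traces}
    chain, seen = [], set()
    current = trace_id
    # Stop on a missing link, an unknown trace, or a cycle
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        chain.append(current)
        current = by_id[current].get("meta", {}).get("triggered_by")
    return chain


traces = [
    {
        "trace_id": "trace-001",
        "session_id": "trading-session-abc",
        "agent_id": "signal-agent-1",
        "meta": {"agent_role": "signal_analysis"},
    },
    {
        "trace_id": "trace-002",
        "session_id": "trading-session-abc",
        "agent_id": "exec-agent-1",
        "meta": {"agent_role": "trade_execution", "triggered_by": "trace-001"},
    },
]

chain = causal_chain(traces, "trace-002")
```

For the execution trace, `chain` walks from `trace-002` back to `trace-001`, which is the signal that triggered it.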
4.4.2 Cross-Agent Tripwire Queries
Stewards can query ReflectionDB to enforce cross-agent constraints:
```sql
-- Check if conflicting signals exist
SELECT COUNT(*) FROM traces
WHERE session_id = :current_session
  AND tool = 'analyze_signal'
  AND timestamp > NOW() - INTERVAL '5 minutes'
  AND outputs->>'recommendation' != :current_recommendation;

-- Check system-wide exposure
SELECT SUM(CAST(tool_calls->0->'args'->>'trade_value' AS NUMERIC))
FROM traces
WHERE tool = 'execute_trade'
  AND timestamp > NOW() - INTERVAL '1 hour'
  AND meta->>'blueprint_id' LIKE 'finance/trading_%';
```
4.4.3 Agent-to-Agent Obligations
Multi-agent systems MAY document obligations in blueprint annotations (non-normative):
annotations:
multi_agent_obligations:
SignalAgent:
- "MUST publish signal provenance to ReflectionDB before handoff"
- "MUST tag outputs with confidence score"
ExecAgent:
- "MUST verify signal exists in ReflectionDB within 10m"
- "MUST stream execution metrics to RiskAgent steward"
RiskAgent:
- "MUST update cross-agent risk summary every 60s"
- "MAY trigger autonomy downgrade for ExecAgent if thresholds breached"
These obligations are documentation only. Enforcement happens via:
- Tripwires checking ReflectionDB state
- Steward-to-steward communication (out of scope for blueprints)
- Monitoring dashboards alerting on missing data
5. Regulatory Framework Integration
5.1 Non-Normative Annotations
Blueprints MAY include an annotations block for documentation purposes. These annotations do NOT affect enforcement logic.
Purpose:
- Document regulatory context for auditors
- Link to external compliance frameworks
- Specify retention/jurisdiction requirements
- Identify policy owners
Schema:
annotations:
# Regulatory references (documentation only)
regulatory_refs:
- "EU AI Act Article 9-15"
- "MiFID II Directive 2014/65/EU"
- "GDPR Regulation 2016/679"
- "FDA 21 CFR Part 11"
# Jurisdiction and scope
jurisdiction: ["EU", "US", "UK"]
geographic_scope: "EEA + US cross-border"
# Organizational metadata
policy_owner: "compliance@example.com"
approval_board: "Enterprise_Governance_Committee"
approved_by: ["user-uuid-1", "user-uuid-2"]
approval_date: "2025-01-15"
# Audit and retention
retention_period: "7y"
retention_jurisdiction: "EU"
audit_frequency: "quarterly"
external_auditor: "KPMG_EU"
# Multi-agent metadata (if applicable)
multi_agent: true
coordination_model: "hierarchical"
participant_roles: ["signal", "execution", "risk", "audit"]
5.2 Mapping Regulations to Enforcement
| Regulation | Enforcement Mechanism | Blueprint Location |
|---|---|---|
| Data minimization (GDPR Art. 5) | PII detection tripwire | tripwires (pattern-match) |
| Right to explanation (GDPR Art. 22) | Reasoning transparency metric | checks (CTQ metric) |
| Position limits (MiFID II Art. 57) | Numeric threshold tripwire | tripwires (condition) |
| Record-keeping (MiFID II Art. 16(6), MiFIR Art. 25) | ReflectionDB retention | Architecture (ACGP-1002) |
| Transparency (EU AI Act Art. 13) | Source citation requirement | evidence block |
| Human oversight (EU AI Act Art. 14) | Escalation rules | checks -> escalate |
| Risk management (EU AI Act Art. 9) | CTQ scoring + thresholds | checks + scoring |
Example: GDPR Article 5 (Data Minimization)
```yaml
tripwires:
  - id: pii_exposure_check
    when:
      hook: "output"
    condition:
      # Every NOT-condition must hold; with "any", output containing
      # only one kind of PII would still pass.
      all:
        - "NOT matches_regex(content, '\\b\\d{3}-\\d{2}-\\d{4}\\b')"  # SSN
        - "NOT contains_entity(content, 'credit_card')"
        - "NOT contains_entity(content, 'bank_account')"
    eval_tier: 0
    latency_budget_ms: 50
    on_fail:
      decision: "block"
      reason: "PII detected in output (GDPR Art. 5 violation risk)"

annotations:
  regulatory_refs:
    - "GDPR Regulation 2016/679 Article 5(1)(c)"
  compliance_note: "Implements data minimization principle"
```
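The intent of this tripwire is to block output when any PII pattern is present, which means it passes only when no detector fires. The sketch below shows that logic with the SSN pattern from the example; `contains_entity` has no known implementation here, so only the regex detector is included, and the helper names are hypothetical.

```python
import re

# Same pattern as the tripwire's matches_regex expression
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def contains_ssn(content):
    """Detect a US Social Security number in NNN-NN-NNNN form."""
    return SSN_PATTERN.search(content) is not None


def pii_tripwire_passes(content, detectors):
    """Pass only when no detector finds PII in the content."""
    return not any(detect(content) for detect in detectors)


# Further detectors (credit card, bank account) would be appended here
detectors = [contains_ssn]

clean = pii_tripwire_passes("Order 12345 shipped", detectors)
leaky = pii_tripwire_passes("SSN 123-45-6789 on file", detectors)
```

`clean` passes and `leaky` fails. The word boundaries in the pattern keep longer digit runs such as `1234-56-78901` from matching.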
6. Governance Contract Alignment
6.1 When to Use Governance Contracts
Governance contracts (ACGP-1010) specify runtime parameters for blueprint application. Use them when:
- Different actions have different risk levels (low/elevated/critical)
- Latency budgets vary per request (fast queries vs. critical decisions)
- Fallback strategies differ (deny vs. escalate on timeout)
- Evaluation depth needs negotiation (Eval Tier 0-3)
6.2 Risk Level Selection
| Risk Level | Typical Latency Budget | Eval Tiers | Fallback | Use Cases |
|---|---|---|---|---|
| low_risk | 100ms | 0, 1 | deny | Read queries, UI updates, logs |
| elevated_risk | 300ms | 0, 1, (2 async) | allow_and_log | State mutations, notifications |
| critical_risk | 5000ms | 0, 1, 2, (3 if needed) | escalate | Financial transactions, data deletion |
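The risk level table can be encoded as defaults that an agent-side helper uses to build a governance contract. This is a sketch under assumptions: `build_contract` is hypothetical, and rejecting an eval tier not listed for the risk level is an illustrative policy choice rather than a rule stated in ACGP-1010.

```python
# Defaults transcribed from the risk level table
RISK_PROFILES = {
    "low_risk": {
        "latency_budget_ms": 100, "eval_tiers": {0, 1}, "fallback": "deny",
    },
    "elevated_risk": {
        # Tier 2 is listed as asynchronous for this level
        "latency_budget_ms": 300, "eval_tiers": {0, 1, 2}, "fallback": "allow_and_log",
    },
    "critical_risk": {
        "latency_budget_ms": 5000, "eval_tiers": {0, 1, 2, 3}, "fallback": "escalate",
    },
}


def build_contract(risk_level, eval_tier, **overrides):
    """Build a governance contract from the table defaults."""
    profile = RISK_PROFILES[risk_level]
    if eval_tier not in profile["eval_tiers"]:
        raise ValueError(f"eval_tier {eval_tier} is not listed for {risk_level}")
    contract = {
        "risk_level": risk_level,
        "eval_tier": eval_tier,
        "latency_budget_ms": profile["latency_budget_ms"],
        "fallback": profile["fallback"],
    }
    contract.update(overrides)  # per-request adjustments, e.g. a longer budget
    return contract


critical = build_contract("critical_risk", 2)
extended = build_contract("critical_risk", 3, latency_budget_ms=10000)
```

`critical` reproduces the contract from Section 3.3, and `extended` shows a per-request override of the latency budget.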
6.3 Complete Example: Critical Trade
Agent Request with Contract:
```json
{
  "protocol": "acgp",
  "protocol_version": "1.1.0",
  "message_type": "TRACE",
  "message_id": "uuid-v7",
  "timestamp": "2025-12-24T10:30:00Z",
  "sender_id": "trading-agent-1",
  "receiver_id": "governance-steward-1",
  "payload": {
    "trace_id": "trace-001",
    "session_id": "trading-session-abc",
    "step": 1,
    "tool_calls": [
      {
        "name": "execute_trade",
        "args": {
          "symbol": "AAPL",
          "quantity": 1000,
          "trade_value": 175000,
          "order_type": "market"
        }
      }
    ],
    "source_refs": [
      "reg:sec:10K:AAPL:2024",
      "market:bloomberg:realtime:2025-12-24"
    ],
    "meta": {
      "model": "gpt-5-pro",
      "agent_acl_tier": "ACL-4"
    },
    "governance_contract": {
      "risk_level": "critical_risk",
      "eval_tier": 2,
      "latency_budget_ms": 5000,
      "fallback": "escalate",
      "justification": "Trade exceeds $100K threshold requiring deep analysis"
    }
  }
}
```
Steward Processing:
- Tripwire check (Eval Tier 0, <10ms):
  - trade_value <= 100000 -> FAIL -> Would halt
  - BUT contract specifies critical_risk, so escalate to Tier 1/2
- Evidence validation (Eval Tier 1, ~100ms):
  - Check source_refs against Registry
  - Verify SEC filing is certified
  - Verify Bloomberg feed is real-time and trusted
- CTQ calculation (Eval Tier 2, ~3000ms):
  - LLM-judge evaluates trading rationale
  - Pattern-match checks for manipulation signals
  - Source-match validates data freshness
- Decision:
  - CTQ = 0.88 -> Risk = 0.12
  - Threshold: ok: 0.15 -> OK (just within bounds)
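The final step can be sketched numerically. The sketch assumes that CTQ is the weighted average of the metric scores, that Risk = 1 - CTQ (inferred from the figures in this example), and that the thresholds are upper bounds as annotated in Section 3.2. The per-metric scores are hypothetical, and because the band between `escalate` and `block` is not described, the sketch conservatively escalates there.

```python
def weighted_ctq(scores, weights):
    """Weighted average of metric scores (assumed CTQ aggregation)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def decide(risk, thresholds):
    """Map a risk score to a decision using the blueprint thresholds."""
    if risk <= thresholds["ok"]:
        return "ok"
    if risk <= thresholds["nudge"]:
        return "nudge"
    if risk <= thresholds["escalate"]:
        return "escalate"
    if risk > thresholds["block"]:
        return "block"
    # Assumption: the band between escalate and block is unspecified,
    # so this sketch escalates rather than guessing a softer outcome.
    return "escalate"


# Weights and thresholds from the Section 3.2 blueprint
weights = {
    "risk_assessment_clarity": 0.4,
    "data_freshness": 0.3,
    "regulatory_compliance": 0.3,
}
thresholds = {"ok": 0.15, "nudge": 0.25, "escalate": 0.40, "block": 0.60}

# Hypothetical metric scores for this trade
scores = {
    "risk_assessment_clarity": 0.89,
    "data_freshness": 0.90,
    "regulatory_compliance": 0.85,
}

ctq = weighted_ctq(scores, weights)
risk = 1.0 - ctq
decision = decide(risk, thresholds)
```

With these scores the CTQ is about 0.88, the risk is about 0.12, and the decision is `ok`, matching the outcome described in the list above.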
Response:
```json
{
  "message_type": "INTERVENTION",
  "payload": {
    "trace_id": "trace-001",
    "decision": "ok",
    "flagged": false,
    "risk_score": 0.12,
    "ctq_score": 0.88,
    "reasons": [
      "Trade value above normal limits but within critical-risk acceptable range",
      "Evidence validation passed (2 certified sources)",
      "Rationale quality score: 0.89"
    ],
    "actions": [],
    "governance_contract_applied": {
      "risk_level": "critical_risk",
      "eval_tier": 2,
      "actual_latency_ms": 3245
    }
  }
}
```
6.4 SLA-Like Requirements
Map "SLA" concepts to ACGP constructs:
| SLA Requirement | ACGP Implementation | Notes |
|---|---|---|
| "Response time < 100ms" | latency_budget_ms: 100 | In governance contract |
| "99.9% uptime" | Profile selection + HA architecture | ACGP-1011 + deployment |
| "Human review within 4 hours" | HITL SLA (out of scope) | External to ACGP protocol |
| "Audit retention 7 years" | ReflectionDB config | Architecture layer |
| "Cost per decision < $0.01" | Profile selection (rule-only) | ACGP-1011 .5 |
7. Complete Examples
7.1 Example 1: Regulated Healthcare Agent
id: healthcare/clinical_decision_support@1.0
version: "1.0.0"
description: "Clinical decision support agent with FDA/EMA compliance"
inherits: clarity.baseline@1.0
annotations:
regulatory_refs:
- "FDA 21 CFR Part 11"
- "EU MDR 2017/745"
- "HIPAA Privacy Rule"
policy_owner: "clinical.governance@hospital.com"
retention_period: "10y"
jurisdiction: ["US", "EU"]
scope:
agent_tier: [ACL-4, ACL-5]
tools: ["diagnose", "prescribe", "access_patient_records"]
tripwires:
- id: contraindication_check
when:
hook: "tool_call"
tool: "prescribe"
condition: "NOT query_external('contraindication_detected', args.medication, patient.allergies)"
eval_tier: 1
requires_state: true
on_fail:
decision: "halt"
reason: "Contraindication detected (patient safety)"
- id: off_label_use
when:
hook: "tool_call"
tool: "prescribe"
condition: "query_external('fda_approved', args.medication, args.indication)"
eval_tier: 1
on_fail:
decision: "escalate"
reason: "Off-label use requires physician approval"
evidence:
min_certified_sources: 2
source_categories: ["clinical_guidelines", "peer_reviewed"]
min_trust_score: 0.9
checks:
- id: clinical_reasoning_quality
when:
hook: "output"
metric:
name: "diagnostic_soundness"
weight: 0.5
check:
type: "llm-judge"
args:
model: "gpt-4"
prompt_template: "evaluate_clinical_reasoning"
- id: evidence_currency
when:
hook: "output"
metric:
name: "guideline_recency"
weight: 0.3
check:
type: "source-match"
args:
max_age_years: 5
- id: bias_check
when:
hook: "output"
metric:
name: "demographic_bias"
weight: 0.2
check:
type: "llm-judge"
args:
prompt_template: "detect_demographic_bias"
scoring:
thresholds:
ok: 0.10
nudge: 0.20
escalate: 0.35
block: 0.50
trust_debt:
enabled: true
accumulation:
flag: 0.10
block: 0.30
halt: 0.60
thresholds:
re_tiering_review: 0.70
Governance Contract for Critical Decision:
```json
{
  "risk_level": "critical_risk",
  "eval_tier": 3,
  "latency_budget_ms": 10000,
  "fallback": "escalate",
  "justification": "Life-critical prescription decision"
}
```
7.2 Example 2: Multi-Agent Trading System
System Blueprint:
id: finance/multi_agent_trading_system@1.0
version: "1.0.0"
description: "System-wide governance for coordinated trading agents"
inherits: clarity.baseline@1.0
annotations:
regulatory_refs:
- "MiFID II Directive 2014/65/EU"
- "MAR Regulation 596/2014"
multi_agent: true
coordination_model: "hierarchical"
participant_roles: ["signal_analysis", "trade_execution", "risk_monitoring", "audit"]
tripwires:
- id: market_wide_halt
when:
hook: "tool_call"
condition: "NOT query_external('market_halted')"
eval_tier: 0
on_fail:
decision: "halt"
reason: "Market circuit breaker active"
- id: system_exposure_limit
when:
hook: "tool_call"
tool: "execute_trade"
condition: "query_reflectiondb('sum(trade_value) < 10000000 from last 1h where blueprint_id like finance/multi_agent_%')"
eval_tier: 1
requires_state: true
on_fail:
decision: "halt"
reason: "System-wide hourly exposure limit"
evidence:
min_certified_sources: 1
source_categories: ["regulatory"]
min_trust_score: 0.8
scoring:
thresholds:
ok: 0.20
nudge: 0.35
escalate: 0.50
block: 0.65
Signal Agent Blueprint:
id: finance/signal_agent@1.0
version: "1.0.0"
description: "Signal analysis agent"
inherits: finance/multi_agent_trading_system@1.0
scope:
agent_tier: [ACL-2]
tools: ["analyze_signal"]
checks:
- id: signal_confidence
when:
hook: "output"
metric:
name: "analysis_quality"
weight: 0.7
check:
type: "llm-judge"
- id: source_diversity
when:
hook: "output"
metric:
name: "data_source_count"
weight: 0.3
check:
type: "source-match"
args:
min_source_count: 3
scoring:
thresholds:
ok: 0.15
nudge: 0.25
escalate: 0.40
block: 0.60
Execution Agent Blueprint:
id: finance/execution_agent@1.0
version: "1.0.0"
description: "Trade execution agent"
inherits: finance/multi_agent_trading_system@1.0
scope:
agent_tier: [ACL-4]
tools: ["execute_trade"]
tripwires:
- id: requires_signal
when:
hook: "tool_call"
tool: "execute_trade"
condition: "query_reflectiondb('exists signal from SignalAgent in last 10m for session_id')"
eval_tier: 1
requires_state: true
on_fail:
decision: "block"
reason: "No valid signal within 10 minutes"
checks:
- id: execution_timing
when:
hook: "tool_call"
metric:
name: "market_timing"
weight: 0.6
check:
type: "llm-judge"
- id: slippage_risk
when:
hook: "tool_call"
metric:
name: "execution_quality"
weight: 0.4
check:
type: "rule-based"
8. Best Practices
8.1 Blueprint Design
- Start with clarity.baseline: Always inherit from the universal baseline
- Layer policies hierarchically: System -> Domain -> Role
- Use tripwires for safety: Hard limits belong in tripwires, not checks
- Conservative thresholds: Start strict, relax based on calibration
- Document intent: Use annotations for context and regulatory links
8.2 Multi-Agent Coordination
- Shared session_id: Link related agent activities
- ReflectionDB queries: Use database for cross-agent state, not agent-to-agent messages
- Avoid workflow logic in blueprints: Coordination is steward-side, not policy-side
- Clear role boundaries: Each agent has distinct tools and responsibilities
8.3 Evidence and Sources
- Specify minimum sources: Use evidence.min_certified_sources
- Category requirements: Require appropriate source types (regulatory, peer-reviewed, etc.)
- Trust score thresholds: Set appropriate minimums for domain risk level
- Keep registry updated: Source certification is continuous, not one-time
8.4 Performance Optimization
- Eval Tier 0 for critical checks: Fast, in-memory, no dependencies
- Async Eval Tier 2: Use for quality checks that don't block action
- Profile selection: Choose appropriate profile for cost/latency/quality trade-off
- Cache policies: Governance contracts support cached decisions
8.5 Testing and Validation
- Conformance suite: Run ACGP conformance tests before production
- Calibration datasets: Create ground-truth examples for scorer tuning
- Staged rollout: Use canary deployments for blueprint updates
- Monitor metrics: Track intervention rates, latency, trust debt
9. References
Normative References
- ACGP-1000: Core Protocol Specification
- ACGP-1001: Terminology and Definitions
- ACGP-1002: Architecture Specification
- ACGP-1003: Message Formats & Wire Protocol
- ACGP-1004: Reflection Blueprint Specification
- ACGP-1005: ARS-CTQ-ACL Integration Framework
- ACGP-1006: Certified Source Registry Specification
- ACGP-1009: Conformance Levels
- ACGP-1010: Governance Contracts
- ACGP-1011: Implementation Profiles
Informative References
- RFC 2119: Key words for use in RFCs
- MiFID II: Directive 2014/65/EU
- GDPR: Regulation 2016/679
- EU AI Act: Regulation 2024/1689
End of ACGP-1012