Integration Walkthrough: Zero to Production¶
End-to-end implementation path for integrating ACGP into an existing agent system using the canonical v1.0 model.
Python is the canonical alpha runtime evaluator. TypeScript and other non-Python applications should treat ACGP as a protocol and client surface locally, then call a supported evaluator service for execution-time governance.
The shipped reference deployment uses PostgreSQL-compatible runtime state for durable, centralized governance state and audit persistence. SQLite remains useful for local development, tests, and single-node fallback workflows, but it is not the reference deployment backend. Advanced multi-region deployments may use a distributed SQL backend. Large evidence artifacts may be stored in object storage with database-backed metadata and retention controls.
Recommended Cross-Language Execution Path¶
Use this sequence when your application runtime is not Python:
- The application emits a trace plus any required evidence.
- An evaluator service receives the request.
- The Python runtime applies tripwires, deterministic checks, CTQ, thresholds, flags, trust debt, and governance-tier review.
- The service returns the intervention decision and supporting evidence, and persists audit records.
sequenceDiagram
participant App as TS or non-Python app
participant Service as Evaluator service
participant Runtime as Python runtime
participant State as PostgreSQL runtime state
participant Audit as PostgreSQL audit store
App->>Service: TRACE + evidence
Service->>Runtime: Evaluate request
Runtime->>State: Read/write runtime governance state
Runtime->>Audit: Persist decision and evidence
Runtime-->>Service: INTERVENTION + evidence
Service-->>App: Decision + audit identifiers
0) Prerequisites¶
- Python 3.10+
- A reachable Governance Steward endpoint
- A blueprint in canonical Blueprint schema (artifact_type/schema_version/tripwires/checks/intervention_policy)
For production-oriented walkthroughs, assume the evaluator service persists governance state and audit records in PostgreSQL rather than local SQLite.
1) Author a Real Blueprint (Canonical Schema)¶
artifact_type: acgp.blueprint
schema_version: "2.0.0"
id: finance/refunds@2.0
version: "2.0.0"
title: "Refund governance policy"
description: "Refund governance policy"
applicability:
governance_tiers: [GT-2, GT-3]
tools: ["issue_refund"]
tripwires:
- id: max_refund_hard_stop
when:
hook: tool_call
tool: issue_refund
condition: "args.amount > 5000"
eval_tier: 0
on_fail:
decision: block
reason: "Refund exceeds hard cap"
checks:
- id: reasoning_quality
kind: metric
when:
hook: output
metric:
name: reasoning_quality
weight: 0.25
evaluator:
kind: rule-based
args:
mode: all
rules:
- id: reasoning_present
field: reasoning
operator: exists
- id: grounding_check
kind: metric
when:
hook: output
metric:
name: knowledge_grounding
weight: 0.20
evaluator:
kind: source-match
args:
require_order_match: true
intervention_policy:
thresholds:
ok: 0.25
nudge: 0.40
escalate: 0.55
# Risk > escalate -> block (implicit; halt is tripwire-only)
trust_policy:
enabled: true
accumulation:
ok: 0.0
flag: 0.1
nudge: 0.5
escalate: 1.0
block: 2.0
halt: 5.0
decay:
decay_fraction: 0.05
period_hours: 24
min_debt: 0.0
thresholds:
elevated_monitoring: 0.30
restricted_mode: 0.50
re_tiering_review: 0.75
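To make the trust_policy numbers concrete, here is a minimal sketch of how trust debt could accumulate per intervention and decay over time. The constants are copied from the blueprint above. The update rule is an assumption for illustration only: one multiplicative decay step per fully elapsed period, floored at min_debt. The function names `accumulate` and `decay` are hypothetical and are not part of the acgp package.

```python
# Values copied from the blueprint's trust_policy block.
ACCUMULATION = {"ok": 0.0, "flag": 0.1, "nudge": 0.5, "escalate": 1.0, "block": 2.0, "halt": 5.0}
DECAY_FRACTION = 0.05
PERIOD_HOURS = 24
MIN_DEBT = 0.0


def accumulate(debt: float, intervention: str) -> float:
    """Add the configured debt increment for a single intervention."""
    return debt + ACCUMULATION[intervention.lower()]


def decay(debt: float, elapsed_hours: float) -> float:
    """Assumed rule: shrink debt by DECAY_FRACTION once per fully elapsed period."""
    periods = int(elapsed_hours // PERIOD_HOURS)
    return max(MIN_DEBT, debt * (1.0 - DECAY_FRACTION) ** periods)


debt = 0.0
for intervention in ["ok", "nudge", "escalate"]:
    debt = accumulate(debt, intervention)

# Two full 24-hour periods have elapsed, so two decay steps are applied.
debt_after_two_days = decay(debt, elapsed_hours=48)
```

How accumulated debt is compared against the elevated_monitoring, restricted_mode, and re_tiering_review thresholds is left out on purpose, because the walkthrough does not state whether debt is normalized first.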
2) Initialize Steward and Enable Two-Stage Runtime¶
from acgp import GovernanceSteward, PostgresStateStorage

steward = GovernanceSteward.production(
    blueprint_file="blueprint.yaml",
    state_storage=PostgresStateStorage(connection_string="postgresql://runtime/acgp"),
    two_stage_enabled=True,
)
If your deployment performs explicit protocol negotiation, runtime starts with:
- VERSION_NEGOTIATION
- VERSION_SELECTED
- SESSION_INIT (contains local evaluation bundle)
3) Handle SESSION_INIT and Validate Bundle Load¶
def on_session_init(bundle: dict) -> None:
    required = [
        "tripwires_tier0",
        "scorers_rule_based",
        "scorers_pattern_match",
        "thresholds",
        "fallback_on_timeout",
        "blueprint_version_hash",
    ]
    missing = [k for k in required if k not in bundle]
    if missing:
        raise RuntimeError(f"SESSION_INIT bundle incomplete: {missing}")
    print("Local evaluation bundle loaded", bundle["blueprint_version_hash"])
Example SESSION_INIT.payload:
{
"bundle_hash": "sha256:6cb4...",
"effective_at": "2026-02-26T12:30:00Z",
"tripwires_tier0": [{"id": "max_refund_hard_stop"}],
"scorers_rule_based": [{"id": "reasoning_quality"}],
"scorers_pattern_match": [],
"thresholds": {"ok": 0.25, "nudge": 0.40, "escalate": 0.55},
"fallback_on_timeout": "deny",
"blueprint_version_hash": "sha256:6cb4..."
}
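The thresholds in the bundle drive the local decision. A minimal sketch of that mapping follows, assuming each boundary is inclusive (a risk score equal to a threshold stays in the lower band) and that anything above escalate becomes a block, as the blueprint comment states. The function name `decide` is hypothetical.

```python
def decide(risk: float, thresholds: dict[str, float]) -> str:
    """Map a risk score to an intervention using ascending thresholds."""
    if risk <= thresholds["ok"]:
        return "OK"
    if risk <= thresholds["nudge"]:
        return "NUDGE"
    if risk <= thresholds["escalate"]:
        return "ESCALATE"
    # HALT is never produced here: the blueprint reserves it for tripwires.
    return "BLOCK"


thresholds = {"ok": 0.25, "nudge": 0.40, "escalate": 0.55}
decisions = [decide(r, thresholds) for r in (0.10, 0.30, 0.50, 0.80)]
```

With the example thresholds, the four sample scores fall into the OK, NUDGE, ESCALATE, and BLOCK bands respectively.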
4) Submit a Trace (Canonical Wire Shape)¶
from acgp import CognitiveTrace

trace = CognitiveTrace(
    reasoning="Customer provided valid receipt and defect evidence.",
    action="issue_refund",
    parameters={"order_id": "A-1007", "amount": 240.0},
    confidence=0.93,
)
result = steward.evaluate(trace)
Approximate on-wire TRACE.payload:
{
"trace_id": "uuid",
"agent_id": "urn:acgp:agent:refund-review:prod:7f4c9d2a",
"governance_tier": "GT-3",
"reasoning": "Customer provided valid receipt and defect evidence.",
"action": {"name": "issue_refund", "parameters": {"order_id": "A-1007", "amount": 240.0}},
"local_eval_results": {
"bundle_version_hash": "sha256:6cb4...",
"tripwires_checked": ["max_refund_hard_stop"],
"local_decision": "ok"
}
}
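If you need to assemble this payload by hand, for example in a test harness, a minimal sketch looks like the following. The helper name `build_trace_payload` is hypothetical, and the field set only mirrors the approximate example above; it is not a statement of the full wire schema.

```python
import json
import uuid


def build_trace_payload(
    reasoning: str,
    action: str,
    parameters: dict,
    *,
    agent_id: str,
    governance_tier: str,
    local_eval_results: dict,
) -> dict:
    """Assemble a dict shaped like the approximate TRACE.payload example."""
    return {
        "trace_id": str(uuid.uuid4()),  # fresh identifier per trace
        "agent_id": agent_id,
        "governance_tier": governance_tier,
        "reasoning": reasoning,
        "action": {"name": action, "parameters": parameters},
        "local_eval_results": local_eval_results,
    }


payload = build_trace_payload(
    "Customer provided valid receipt and defect evidence.",
    "issue_refund",
    {"order_id": "A-1007", "amount": 240.0},
    agent_id="urn:acgp:agent:refund-review:prod:7f4c9d2a",
    governance_tier="GT-3",
    local_eval_results={
        "tripwires_checked": ["max_refund_hard_stop"],
        "local_decision": "ok",
    },
)
wire_body = json.dumps(payload)  # serialized form sent to the evaluator
```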
Example service-bound request from a non-Python app:
import { createEvaluatorServiceClient } from '@acgp-protocol/sdk';

const client = createEvaluatorServiceClient({
  baseUrl: 'https://governance.example.com',
  apiToken: process.env.ACGP_EVALUATOR_TOKEN,
  timeoutMs: 3000,
  retry: {
    maxAttempts: 3,
    baseDelayMs: 200,
  },
});

const response = await client.evaluateWithDetails(trace, {
  metadata: {
    orderRecordId: 'A-1007',
    receiptDigest: 'sha256:...'
  }
});
const decision = response.decision;
For one stitched request-to-audit chain with concrete artifacts, see the Trace to Decision Walkthrough.
4.1) Recommended Production Integration¶
Treat the evaluator service as the first-class product path for TypeScript and other non-Python applications.
- use a bearer token or internal gateway identity on every request
- perform a /healthz check during startup and before declaring the dependency ready
- set explicit client timeouts and keep retries limited to transient transport or gateway failures
- propagate request metadata such as order IDs, receipt digests, and request IDs for audit correlation
- handle EvaluatorServiceError explicitly so retryable transport failures and non-retryable policy failures are separated
import {
  EvaluatorServiceError,
  createEvaluatorServiceClient,
} from '@acgp-protocol/sdk';

// trace, queueForRetry, and routeToOperator are provided by the application.
const baseUrl = process.env.ACGP_EVALUATOR_URL!;
const apiToken = process.env.ACGP_EVALUATOR_TOKEN!;

async function assertEvaluatorHealthy(): Promise<void> {
  const response = await fetch(`${baseUrl}/healthz`, {
    headers: {
      Authorization: `Bearer ${apiToken}`,
    },
  });
  if (!response.ok) {
    throw new Error(`Evaluator health check failed with ${response.status}`);
  }
}

const client = createEvaluatorServiceClient({
  baseUrl,
  apiToken,
  timeoutMs: 3000,
  retry: {
    maxAttempts: 3,
    baseDelayMs: 200,
  },
});

await assertEvaluatorHealthy();

try {
  const response = await client.evaluateWithDetails(trace, {
    metadata: {
      orderRecordId: 'A-1007',
      receiptDigest: 'sha256:...',
      requestId: crypto.randomUUID(),
    },
  });
  console.log(response.auditId, response.decision.intervention);
} catch (error) {
  // Rethrow only unexpected errors; service errors are handled below.
  if (!(error instanceof EvaluatorServiceError)) {
    throw error;
  }
  if (error.retryable) {
    queueForRetry(trace);
  } else {
    routeToOperator(error.message, error.status);
  }
}
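The retry guidance above is language-neutral. The following Python sketch shows one way to retry only transient failures with a bounded number of attempts. `TransientError`, `PolicyError`, and the `send` callable are hypothetical stand-ins, not SDK names, and the exponential backoff schedule is an assumption: the walkthrough does not say how the SDK spaces its retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical: timeouts, connection resets, or gateway 502/503/504."""


class PolicyError(Exception):
    """Hypothetical: the service rejected the request; retrying will not help."""


def call_with_retry(
    send: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying only on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_attempts:
                raise  # retry budget exhausted: surface the failure
            sleep(base_delay_s * 2 ** (attempt - 1))  # 0.2s, then 0.4s, ...
    raise AssertionError("unreachable")
```

Because `PolicyError` is not caught, a non-retryable rejection propagates on the first attempt, which keeps policy failures out of the retry path. The `sleep` parameter exists so tests can inject a fake clock.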
4.2) Recommended Production Storage Layout¶
Use this deployment baseline unless you have a clear reason to do otherwise:
- evaluator service and Python runtime behind one service boundary
- PostgreSQL for governance state, trust debt, and audit metadata
- object storage for large evidence blobs when payloads outgrow database-friendly sizes
- SQLite only for local development, tests, and single-node reference runs
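The split between database rows and object storage can be sketched as a size-based routing decision. Everything in this example is an assumption for illustration: the 256 KiB cutoff, the function name `plan_evidence_storage`, and the shape of the returned metadata record. Pick a cutoff that matches your own database and payload profile.

```python
import hashlib

INLINE_LIMIT_BYTES = 256 * 1024  # assumed cutoff; tune per deployment


def plan_evidence_storage(evidence: bytes, inline_limit: int = INLINE_LIMIT_BYTES) -> dict:
    """Decide where an evidence payload goes and build its metadata record."""
    digest = "sha256:" + hashlib.sha256(evidence).hexdigest()
    location = "database" if len(evidence) <= inline_limit else "object_storage"
    # The metadata record always lives in the database, even when the blob does not.
    return {"digest": digest, "size_bytes": len(evidence), "location": location}


small = plan_evidence_storage(b'{"receipt": "ok"}')
large = plan_evidence_storage(b"x" * (INLINE_LIMIT_BYTES + 1))
```

Keeping the digest in the database-backed metadata lets an auditor verify that the blob fetched from object storage is the one the decision referenced.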
5) Handle All Intervention Types¶
match result.intervention:
    case "OK":
        execute_refund()
    case "NUDGE":
        execute_refund(modifications=result.metadata.get("modifications"))
    case "ESCALATE":
        route_to_human_review(result)
    case "BLOCK":
        reject_request(result.message)
    case "HALT":
        stop_agent_runtime(result.message)

if result.flags and result.flags.flagged:
    enqueue_compliance_review(result.flags.reason)
6) Handle Steward Disconnect and Fallback¶
Use profile-appropriate fallback defaults:
- Safety-Critical: fail-closed (halt/block)
- Standard: default block; explicit policy may define stricter or approved alternate behavior
try:
    result = steward.evaluate(trace)
except TimeoutError:
    decision = apply_profile_fallback(conformance_level="standard", mode="deny")
    buffer_for_replay(trace, decision)
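The snippet above calls apply_profile_fallback and buffer_for_replay without showing them. One possible shape is sketched below. The profile-to-decision mapping is an assumption (the profile defaults allow either halt or block for Safety-Critical; this sketch picks HALT), the in-memory buffer and its size limit are illustrative, and `drain_replay_buffer` is a hypothetical helper for the reconnect path.

```python
import time
from collections import deque

# Assumed mapping from conformance profile to a fail-closed decision.
PROFILE_FALLBACK = {"safety-critical": "HALT", "standard": "BLOCK"}

# Bounded in-memory buffer. When full, the oldest entry is dropped silently,
# so a production deployment would likely want durable storage instead.
_replay_buffer: deque = deque(maxlen=10_000)


def apply_profile_fallback(conformance_level: str, mode: str = "deny") -> str:
    """Return the fail-closed decision for a profile when the steward is unreachable."""
    if mode != "deny":
        raise ValueError(f"unsupported fallback mode in this sketch: {mode!r}")
    # Unknown profiles also fail closed.
    return PROFILE_FALLBACK.get(conformance_level.lower(), "BLOCK")


def buffer_for_replay(trace, decision: str) -> None:
    """Remember a trace and its local fallback decision for later replay."""
    _replay_buffer.append(
        {"trace": trace, "decision": decision, "buffered_at": time.time()}
    )


def drain_replay_buffer() -> list:
    """Return all buffered entries in arrival order and clear the buffer."""
    drained = list(_replay_buffer)
    _replay_buffer.clear()
    return drained
```

After the steward reconnects, the entries returned by `drain_replay_buffer()` can be resubmitted so the audit trail records both the fallback decision and the eventual evaluated one.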
7) Enable Async Audit Logging¶
def log_governance_event(trace_id: str, result) -> None:
    audit_event = {
        "trace_id": trace_id,
        "intervention": result.intervention,
        "message": result.message,
        "metadata": result.metadata,
    }
    append_to_audit_queue(audit_event)
Recommended: flush queue to the Governance Store on an interval and on graceful shutdown.
When you run ACGP behind an evaluator service, return the intervention, evidence references, and audit identifiers together so the calling application can correlate the governance decision with its own request lifecycle.
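The flush-on-interval and flush-on-shutdown recommendation can be sketched with the standard library alone. `AuditFlusher` is a hypothetical class, and the `sink` callable stands in for whatever writes batches to the Governance Store; neither name comes from the acgp package.

```python
import queue
import threading
from typing import Callable


class AuditFlusher:
    """Buffer audit events and hand them to a sink in batches."""

    def __init__(self, sink: Callable[[list[dict]], None], interval_s: float = 5.0) -> None:
        self._queue = queue.Queue()
        self._sink = sink
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def append(self, event: dict) -> None:
        """Enqueue one audit event without blocking the caller on I/O."""
        self._queue.put(event)

    def _drain(self) -> None:
        batch = []
        while True:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if batch:
            self._sink(batch)

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once close() sets the event.
        while not self._stop.wait(self._interval_s):
            self._drain()

    def close(self) -> None:
        """Stop the background thread, then flush whatever is still queued."""
        self._stop.set()
        self._thread.join()
        self._drain()
```

A typical wiring would call `append` from `log_governance_event` and register `close` with the process's graceful-shutdown hook. This sketch does not retry a failing sink; a real deployment would need to decide whether to requeue or spill the batch.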
8) Run Conformance Vectors¶
Use the conformance runner in this repository:
Minimum expectations for conformance claims:
- Required vectors pass at 100%
- Fallback behavior matches profile defaults
- All intervention types handled
See: Conformance Spec
9) Production Readiness Checklist¶
- Canonical blueprint schema in use
- SESSION_INIT bundle integrity checked
- BUNDLE_UPDATE handled at runtime
- All six interventions implemented in code paths
- Profile fallback behavior tested under disconnect
- Async audit replay tested after reconnect
- Conformance vectors passing