Error Handling Guide

Practical handling of interventions, protocol failures, and SDK exceptions for production ACGP integrations.


Intervention Handling (All 6 Types)

from acgp import GovernanceSteward, CognitiveTrace, PostgresStateStorage

steward = GovernanceSteward.production(
    blueprint_file="blueprint.yaml",
    state_storage=PostgresStateStorage(connection_string="postgresql://runtime/acgp"),
    two_stage_enabled=True,
)

trace = CognitiveTrace(
    reasoning="Refund requested with receipt.",
    action="issue_refund",
    parameters={"amount": 120.0},
)

result = steward.evaluate(trace)

match result.intervention:
    case "OK":
        execute_action()
    case "NUDGE":
        execute_action_with_modifications(result.metadata.get("modifications"))
    case "FLAG":
        execute_action()
        log_for_review(result)
    case "ESCALATE":
        queue_for_human_review(result)
    case "BLOCK":
        reject_action(result.message)
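    case "HALT":
        shutdown_agent(result.message)
    case _:
        # Unrecognized intervention value: fail closed instead of silently doing nothing.
        reject_action(f"Unhandled intervention: {result.intervention}")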
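
The exception-handling pattern later in this guide calls handle_intervention(result). A minimal sketch of how the dispatch above might be packaged as that function follows. The FakeResult class and the callback parameters (execute, log_review, queue_review, reject, halt) are hypothetical stand-ins so the example runs without the SDK; only the intervention, message, and metadata attributes come from the code above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResult:
    """Stand-in for the SDK result; models only the attributes used in this guide."""
    intervention: str
    message: str = ""
    metadata: dict = field(default_factory=dict)


def handle_intervention(result, *, execute, log_review, queue_review, reject, halt):
    """Dispatch on the intervention type. Returns True if the action was executed."""
    kind = result.intervention
    if kind == "OK":
        execute(None)
        return True
    if kind == "NUDGE":
        # Run the action with the steward's suggested modifications, if any.
        execute(result.metadata.get("modifications"))
        return True
    if kind == "FLAG":
        # Proceed, but leave a record for later review.
        execute(None)
        log_review(result)
        return True
    if kind == "ESCALATE":
        queue_review(result)
        return False
    if kind == "HALT":
        halt(result.message)
        return False
    # BLOCK and any unrecognized value: fail closed.
    reject(result.message or f"Unhandled intervention: {kind}")
    return False
```

Returning a boolean lets the caller tell whether the action actually ran, which is convenient when the same code path also has to record an audit entry.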

Exception Handling Pattern

from acgp.exceptions import (
    ACGPError,
    PolicyError,
    TripwireViolation,
    BudgetExceeded,
    AgentHalted,
)

try:
    result = steward.evaluate(trace)
    handle_intervention(result)

except AgentHalted as err:
    emergency_shutdown(str(err))

except BudgetExceeded as err:
    apply_fallback_policy(conformance_level="standard", reason=str(err))

except TripwireViolation as err:
    reject_action(f"Tripwire violation: {err}")

except PolicyError as err:
    log_critical(f"Policy configuration error: {err}")
    apply_safe_defaults()

except ACGPError as err:
    log_error(f"Generic governance error: {err}")
    apply_fallback_policy(conformance_level="standard", reason=str(err))

Fallback Behavior by Profile

Profile | Default on Unavailable Steward | Notes
Standard | block | May use an explicit policy override if documented and audited
Safety-Critical | halt | Must remain fail-closed
Dev Mode | allow_and_log / observe_only | Non-conformant mode
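
The defaults in the table above can be encoded as a lookup so that fallback decisions are made in one place. The following is a sketch only: the Fallback enum, the profile keys ("standard", "safety_critical", "dev"), and resolve_fallback are hypothetical names, not part of the SDK. Dev Mode is mapped to allow_and_log here; observe_only would be an equally valid choice per the table.

```python
from enum import Enum


class Fallback(Enum):
    BLOCK = "block"
    HALT = "halt"
    ALLOW_AND_LOG = "allow_and_log"
    OBSERVE_ONLY = "observe_only"


# Defaults taken from the table; the dictionary keys are illustrative identifiers.
DEFAULT_FALLBACK = {
    "standard": Fallback.BLOCK,
    "safety_critical": Fallback.HALT,
    "dev": Fallback.ALLOW_AND_LOG,
}


def resolve_fallback(profile, override=None, override_audited=False):
    """Return the behavior to apply when the steward is unavailable."""
    default = DEFAULT_FALLBACK[profile]
    if override is None:
        return default
    # Only Standard may deviate, and only with a documented, audited override.
    if profile == "standard" and override_audited:
        return override
    # Safety-Critical stays fail-closed; unaudited overrides are ignored.
    return default
```

For example, resolve_fallback("safety_critical", override=Fallback.ALLOW_AND_LOG, override_audited=True) still yields Fallback.HALT, which keeps the fail-closed requirement from being bypassed by configuration.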

Error-Code Aware Retry

Map wire errors (see Error Codes) to actions:

  • Timeout, ServiceUnavailable, RateLimitExceeded: bounded retry with jitter
  • IntegrityCheckFailed, InvalidMessageType, SchemaValidationFailed: fail fast
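  • Retries exhausted: apply profile fallback and buffer for replay

import random
import time

def evaluate_with_retry(trace, max_attempts=3, base_delay=0.5):
    # base_delay (seconds) is an illustrative default; tune it for your deployment.
    for attempt in range(1, max_attempts + 1):
        try:
            return steward.evaluate(trace)
        except TimeoutError:
            if attempt == max_attempts:
                # Retries exhausted: apply the profile fallback and buffer the trace for replay.
                decision = apply_profile_fallback(conformance_level="standard", mode="deny")
                buffer_for_replay(trace, decision)
                return decision
            # Exponential backoff with full jitter before the next attempt.
            time.sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))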
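
The error-code mapping in the bullets above can be expressed as a small classifier plus a jittered delay function. This is a sketch under assumptions: the error names are taken from the list above and treated as plain strings, while classify, backoff_delay, and the base and cap values are hypothetical. Treating unknown codes as non-retryable is also an assumption, chosen here because it fails closed.

```python
import random

RETRYABLE = {"Timeout", "ServiceUnavailable", "RateLimitExceeded"}
FAIL_FAST = {"IntegrityCheckFailed", "InvalidMessageType", "SchemaValidationFailed"}


def classify(error_code):
    """Map a wire error code to 'retry' or 'fail_fast'."""
    if error_code in RETRYABLE:
        return "retry"
    # Known fail-fast codes and unknown codes are both treated as non-retryable.
    return "fail_fast"


def backoff_delay(attempt, base=0.5, cap=8.0, rng=random):
    """Full-jitter delay in seconds for a 1-based attempt number.

    The ceiling doubles with each attempt and is bounded by `cap`;
    the actual delay is drawn uniformly from [0, ceiling].
    """
    ceiling = min(cap, base * 2 ** (attempt - 1))
    return rng.uniform(0, ceiling)
```

A retry loop would call classify on each failure, sleep for backoff_delay(attempt) when the result is "retry", and go straight to the profile fallback otherwise.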

Debugging Tips

  • Inspect result.metadata first (tripwire IDs, thresholds, scorer evidence)
  • Log local vs remote stage outcomes separately
  • Include trace_id, session_id, and blueprint hash in every error log
  • Track intervention-rate spikes as signals of policy drift
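
One way to follow the logging tip above is to build the identifying fields once and attach them to every error record. The sketch below uses only the Python standard library; the logger name, the helper names, and the choice of a truncated SHA-256 digest as the blueprint hash are assumptions rather than SDK conventions.

```python
import hashlib
import json
import logging

logger = logging.getLogger("acgp.integration")  # hypothetical logger name


def blueprint_hash(blueprint_bytes):
    """Short SHA-256 digest identifying the blueprint version in logs."""
    return hashlib.sha256(blueprint_bytes).hexdigest()[:12]


def error_context(trace_id, session_id, blueprint_bytes, **extra):
    """Collect the fields recommended for every error log entry."""
    return {
        "trace_id": trace_id,
        "session_id": session_id,
        "blueprint_hash": blueprint_hash(blueprint_bytes),
        **extra,
    }


def log_governance_error(message, context):
    """Emit the message with its context serialized as sorted JSON."""
    logger.error("%s %s", message, json.dumps(context, sort_keys=True))
```

Extra keyword arguments such as stage="local" or stage="remote" can carry the per-stage outcome, which keeps local and remote results distinguishable in the same log stream.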