Conformance
Status: Standard-only Alpha (v1.0.0-alpha.2)
Last Updated: 2026-03-12
Spec ID: ACGP-6
Normative Keywords: MUST, SHOULD, MAY (per RFC 2119 and RFC 8174)
Abstract
This specification defines the profile-based conformance model, test suites, badge suites, test runner requirements, reporting format, certification process, and non-conformance handling for ACGP v1.0.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
1. Scope [NORMATIVE]
ACGP-6 defines how v1.0 conformance is tested, claimed, reported, and certified.
ACGP-6 does not define:
- Protocol semantics (→ ACGP-1 through ACGP-5)
- Implementation strategies or deployment guidance
2. Conformance Model (v1.0) [NORMATIVE]
2.1 Active Alpha Claim Surface
For v1.0.0-alpha.2, ACGP has one active external conformance story:
- Standard is the active claimable profile
- Regulated Controls Badge is additive
- Dev Mode is non-conformant
Safety-Critical control material remains in the repository for future-track continuity, but Safety-Critical conformance claims are not available in v1.0.0-alpha.2.
| Claim | Required Suite |
|---|---|
| ACGP v1.0 Standard Conformant | Standard suite — 100% pass |
2.2 Badge Claims
Badge suites certify additive capabilities orthogonal to profiles. v1.0 defines one badge:
- Regulated Controls Badge — validates audit durability, privacy, retention, integrity controls (ACGP-5)
Badge claims combine with the active profile claim:
ACGP v1.0 Standard Conformant + Regulated Controls Badge
2.3 Non-Conformant Runtime Modes
Dev Mode is a local development/runtime convenience mode. It is not a conformance profile and MUST NOT be used in a conformance claim.
2.4 Future Profile Track
Safety-Critical profile content MAY be implemented for evaluation, internal testing, or roadmap preparation, but MUST NOT be advertised as an ACGP v1.0 alpha conformance claim.
3. Standard Profile Suite [NORMATIVE]
The Standard suite validates production-grade behavior across all ACGP v1.0 specs.
3.1 Required Capabilities
| Category | Requirements |
|---|---|
| Protocol (ACGP-2) | Envelope structure, all 8 message types (3 categories), version negotiation, retry with backoff |
| Blueprints (ACGP-3) | Schema parsing, inheritance with checks append, Clarity Baseline enforcement, extension descriptor preservation |
| CTQ (ACGP-3) | 5 standard dimensions, metric-to-dimension mapping, weighted aggregation, dimension-status semantics including failed_evidence_policy, contributor-visibility rules for non-evaluated states, public ctq_dimensions EVAL output, and risk score conversion |
| Thresholds (ACGP-3) | Governance-Tier-specific thresholds, stricter-of-two override rule |
| Interventions | All 5 primary levels (ok, nudge, escalate, block, halt) + orthogonal flag |
| Tripwires (ACGP-4) | DSL parsing, precedence before CTQ, fail-closed runtime, all 3 severities |
| Extensions (ACGP-1/2/3) | Required/optional handling, visibility semantics, explicit enforcement-scope semantics, remote-scope preservation, and fail-mode enforcement for unavailable local or both-scope authoritative enforcers |
| Execution Mode (ACGP-1/2) | Passive default behavior, active-capable execution authority, durable pending escalation for active HITL, audit-visible active to passive transitions, and fail-closed handling for unsupported or unsafe active enforcement |
| Trust Debt (ACGP-3) | Observable trust debt semantics, default deterministic provider behavior, flag orthogonality, public pre / delta / post / thresholds_crossed visibility, runtime posture emission, intervention-floor behavior, and review-trigger visibility |
| Security (ACGP-5) | TLS 1.3+, ES256 signing for GT-3+, data classification, and any additional ACGP-5 controls required by §3.2 |
| Audit (ACGP-5) | Append-only logging, 90-day minimum retention, tamper-evident integrity |
| Observability | CTQ score distribution, intervention counts, evaluation latency (p50/p95/p99), machine-readable export of required metrics |
3.2 ACGP-5 Control Coverage by Claim [NORMATIVE]
The obligation strengths in this table are the single authoritative claim-level requirements for ACGP-5 controls. If any summary text elsewhere uses different wording, this table controls.
| ACGP-5 Control | Standard | Safety-Critical | Regulated Badge |
|---|---|---|---|
| Data classification (Section 3) | MUST | MUST | MUST |
| Privacy principles (Section 4) | SHOULD | MUST | MUST |
| Encryption at rest (Section 5) | SHOULD | MUST | MUST |
| Retention windows (Section 6) | MUST (minimum 90 days) | MUST | MUST |
| WAL durability (Section 7) | MAY | MUST | MUST |
| Hash-chain integrity (Section 8) | MUST | MUST | MUST |
| RBAC (Section 9) | SHOULD | MUST | MUST |
| Compliance reporting (Section 10) | MAY | SHOULD | MUST |
Forward conformance note (stable v1.0 target):
The v1.0.0-alpha.2 Standard claim surface intentionally preserves lower adoption friction during alpha. For the stable v1.0 release, maintainers expect to review the Standard control floor in light of ACGP-5 coverage, with particular attention to promoting encryption at rest and RBAC from SHOULD to MUST, and reevaluating whether privacy principles should also be promoted. Compliance reporting remains a likely candidate for promotion from MAY to SHOULD. Any such change will be announced as part of the alpha-to-stable transition and reflected in the conformance suites before stable claim issuance.
3.3 Wire Security Requirements by Profile [NORMATIVE]
ACGP-2 §4.2 is the single authoritative source for checksum/signature requirements by profile and Governance Tier.
| Security Control | Requirement |
|---|---|
| TLS 1.3+ transport | MUST |
| RFC 8785 JCS canonicalization | MUST |
| Checksum/signature requirements | MUST follow ACGP-2 §4.2 Signature Requirements Matrix |
| HSM-backed key storage | MUST follow ACGP-2 §4.2 Signature Requirements Matrix |
| Certificate pinning | MUST follow ACGP-2 §4.2 Signature Requirements Matrix |
3.4 Performance Targets
Evaluation-only latency budgets. Network round-trip time is excluded. These are the single authoritative source for per-Governance-Tier latency targets (referenced by ACGP-4 §12).
| Governance Tier | Typical Latency | Maximum Latency |
|---|---|---|
| GT-0 | ~10ms | <50ms |
| GT-1 | ~20ms | <100ms |
| GT-2 | ~50ms | <150ms |
| GT-3 | ~100ms | <200ms |
| GT-4 | ~200ms | <350ms |
| GT-5 | ~500ms | <1000ms |
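The maximum-latency column can be checked mechanically. The following is a minimal sketch, assuming the limits are strict upper bounds on a single evaluation's latency with network time already excluded; the table does not say which percentile the maximum applies to, and the names `MAX_LATENCY_MS` and `within_budget` are illustrative only.

```python
# Maximum evaluation-only latency per Governance Tier, in milliseconds,
# copied from the table above. Limits are treated as strict bounds ("<").
MAX_LATENCY_MS = {
    "GT-0": 50,
    "GT-1": 100,
    "GT-2": 150,
    "GT-3": 200,
    "GT-4": 350,
    "GT-5": 1000,
}


def within_budget(governance_tier: str, evaluation_ms: float) -> bool:
    """Return True if the latency is strictly under the tier's maximum.

    evaluation_ms is assumed to exclude network round-trip time.
    """
    return evaluation_ms < MAX_LATENCY_MS[governance_tier]


print(within_budget("GT-2", 141.2))  # True
print(within_budget("GT-2", 150.0))  # False
```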
4. Safety-Critical Profile Track [INFORMATIVE]
Alpha status note: Safety-Critical requirements are retained for architecture continuity and evaluation, but are not part of the active external claim surface for v1.0.0-alpha.2.
Safety-Critical profile content is retained for roadmap continuity, but Safety-Critical conformance claims are disabled for v1.0.0-alpha.2. Implementations MAY prototype or internally validate Safety-Critical controls, but those results are not a publishable conformance claim for this alpha.
The Safety-Critical suite includes all Standard requirements plus high-assurance controls.
4.1 Additional Requirements
| Category | Additional Requirements |
|---|---|
| Fail-safe | Kill-switch capability, dual control (two-person rule) for GT-5 |
| Cryptography | HSM-backed key storage, certificate pinning |
| Audit | Tamper-evident audit sequence with redactable payloads (ACGP-5 §8.4), 7-year minimum retention, Merkle integrity proofs |
| Availability | 99.99% uptime target, disaster recovery procedures |
| Emergency | Emergency override procedures with audit trail |
| Monitoring | Real-time anomaly detection, advanced alerting |
| Architecture | Zero-trust network posture, isolated execution for GT-4+ |
5. Badge Suite: Regulated Controls [NORMATIVE]
The Regulated Controls Badge validates the five control families defined in ACGP-5 Section 11:
| Control Family | Test Focus |
|---|---|
| Audit Durability | WAL behavior, replay ordering, backpressure |
| Privacy & Minimization | PII detection, redaction, commit-then-minimize |
| Retention & Jurisdiction | Window enforcement, deletion evidence, data residency |
| Reproducibility & Export | Machine-readable evidence export, evaluation reproducibility |
| Cryptographic Integrity | Hash-chain or Merkle proofs, independent verification |
6. Test Vectors and Runner Contract [NORMATIVE]
6.1 Runner Requirements
Conformance tooling MUST:
- Execute deterministically for all required vectors
- Produce structured output: pass/fail with failure reasons
- Support explicit suite selection: standard, safety-critical, badge suites
- Emit machine-readable summary suitable for CI pipelines (JSON)
Implementations claiming Standard conformance MUST be able to emit the required observability metrics in the canonical JSON shape below.
The transport or telemetry backend is implementation-defined. Implementations MAY additionally export the same metrics via OpenTelemetry or another telemetry backend, but the canonical JSON shape below is the minimum conformance-bearing export surface.
{
"report_type": "acgp_observability_v1",
"window": {
"start": "2026-03-16T00:00:00Z",
"end": "2026-03-16T01:00:00Z"
},
"metrics": {
"ctq_score_distribution": {
"p50": 0.82,
"p95": 0.94,
"p99": 0.98
},
"intervention_counts": {
"ok": 1200,
"nudge": 43,
"escalate": 7,
"block": 3,
"halt": 0,
"flagged": 18
},
"evaluation_latency_ms": {
"p50": 12.4,
"p95": 67.8,
"p99": 141.2
}
}
}
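A consumer of this export might verify the shape before ingesting it. The sketch below checks only the keys visible in the example above; the function name `check_observability_report` and the problem-message wording are illustrative assumptions, and it does not validate value types or timestamp formats.

```python
# Keys taken from the example report above; nothing beyond that is assumed.
REQUIRED_METRICS = {
    "ctq_score_distribution": {"p50", "p95", "p99"},
    "intervention_counts": {"ok", "nudge", "escalate", "block", "halt", "flagged"},
    "evaluation_latency_ms": {"p50", "p95", "p99"},
}


def check_observability_report(report: dict) -> list:
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    if report.get("report_type") != "acgp_observability_v1":
        problems.append("unexpected report_type")
    window = report.get("window", {})
    for key in ("start", "end"):
        if key not in window:
            problems.append(f"window.{key} missing")
    metrics = report.get("metrics", {})
    for name, fields in REQUIRED_METRICS.items():
        present = set(metrics.get(name, {}))
        for field in sorted(fields - present):
            problems.append(f"metrics.{name}.{field} missing")
    return problems


sample = {
    "report_type": "acgp_observability_v1",
    "window": {"start": "2026-03-16T00:00:00Z", "end": "2026-03-16T01:00:00Z"},
    "metrics": {
        "ctq_score_distribution": {"p50": 0.82, "p95": 0.94, "p99": 0.98},
        "intervention_counts": {"ok": 1200, "nudge": 43, "escalate": 7,
                                "block": 3, "halt": 0, "flagged": 18},
        "evaluation_latency_ms": {"p50": 12.4, "p95": 67.8, "p99": 141.2},
    },
}
print(check_observability_report(sample))  # []
```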
6.2 Vector Categories
Test vectors are located in the conformance/vectors/ directory at the repository root. The manifest file conformance/vectors/manifest.json provides the authoritative list of conformance vectors and MUST be updated whenever vectors are added, renamed, or removed.
| Category | Description | Example Vector Files |
|---|---|---|
| Protocol | Envelope, message parsing, negotiation | standard-001.json, standard-002.json |
| Evaluation | CTQ calculation, threshold mapping, public ctq_dimensions output, failed_evidence_policy, contributor visibility for non-evaluated states, and degraded vs error handling | ctq-arithmetic-001.json, ctq-dimensions-contributors-001.json |
| Tripwire | DSL parsing, precedence, fail-closed | tripwire-001.json through tripwire-004.json |
| Intervention | Decision logic, flag orthogonality | safety-critical-001.json |
| Extensions | Required local rejection, required remote preservation, required both-scope rejection when one side is missing, optional ignore, local non-portability, source-match unavailable fallback | extension-001.json, extension-003.json, extension-004.json |
| Trust Debt | Observable debt semantics, additive flag accumulation, default deterministic provider behavior, posture emission, intervention-floor behavior, review-trigger semantics, and proof that trust debt cannot emit halt | trust-debt-001.json, trust-debt-posture-001.json |
| Fallback | Profile-failure fallback for Steward/session-path unavailability: deny, allow_and_log, cached_decision | fallback-001.json |
| Regulated | Audit, retention, integrity controls | regulated-controls-001.json |
Evaluation-timeout behavior belongs to the Runtime Governance Contracts preview track and is not part of the v1.0 Standard conformance surface unless explicitly activated by a future claimable extension suite.
6.3 Vector Format
Numeric Comparison Tolerance [NORMATIVE]: Floating-point results in test vectors MUST match expected values within ±1e-4. Implementations MUST serialize scores to exactly 4 decimal places (round half away from zero). Weight sums MUST match 1.0 within ±0.001 (ACGP-3 §6.1).
{
"test_id": "ctq-001",
"suite": "standard",
"description": "Standard CTQ calculation with 5 metrics",
"input": {
"metrics": [
{ "name": "reasoning_quality", "score": 0.85, "weight": 0.25 },
{ "name": "knowledge_grounding", "score": 0.88, "weight": 0.20 },
{ "name": "ethical_alignment", "score": 0.90, "weight": 0.20 },
{ "name": "tool_safety", "score": 0.92, "weight": 0.20 },
{ "name": "context_awareness", "score": 0.89, "weight": 0.15 }
]
},
"expected_output": {
"ctq_final": 0.8860,
"risk_score": 0.1140
}
}
Standard conformance also requires public EVAL artifact coverage. Implementations MUST emit canonical ctq_dimensions results, preserve per-dimension status semantics including failed_evidence_policy, preserve contributor visibility at the public dimension level, expose trust-debt posture / review fields, and reproduce vector-declared fallback behavior when evaluation is degraded, unavailable, or errors.
Example public EVAL artifact vector:
{
"test_id": "ctq-dimensions-status-001",
"suite": "standard",
"description": "Unavailable knowledge-grounding scorer remains visible in the public EVAL payload",
"payload": {
"trace_id": "t-ctq-status-001",
"blueprint_id": "finance_qa@2.1",
"governance_tier": "GT-2",
"ctq_dimensions": {
"reasoning_quality": {
"score": 0.90,
"weight": 0.25,
"status": "evaluated",
"contributors": ["rationale_clarity", "plan_completeness"]
},
"knowledge_grounding": {
"score": 0.0,
"weight": 0.20,
"status": "unavailable",
"contributors": []
},
"ethical_alignment": {
"score": 0.94,
"weight": 0.20,
"status": "evaluated",
"contributors": ["fairness_review"]
},
"tool_safety": {
"score": 0.91,
"weight": 0.20,
"status": "evaluated",
"contributors": ["permission_check"]
},
"context_awareness": {
"score": 0.88,
"weight": 0.15,
"status": "evaluated",
"contributors": ["situational_fit"]
}
},
"ctq_score": 0.9088,
"risk_score": 0.0912,
"tripwires_triggered": [],
"intervention": "ok",
"flagged": false,
"runtime_posture": "normal",
"review_required": false,
"trust_debt": {
"provider_id": "acgp.core.default@1",
"pre": 0.0,
"delta": 0.0,
"post": 0.0,
"thresholds_crossed": []
},
"evaluation_metadata": {
"fallback_policy": "redistribute_available_weights"
}
},
"expected_output": {
"ctq_score": 0.9088,
"risk_score": 0.0912,
"dimension_statuses": {
"knowledge_grounding": "unavailable"
}
}
}
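The numbers in this vector can be reproduced with a short calculation. The sketch below rests on assumptions inferred from the vector itself, not from ACGP-3: that redistribute_available_weights means a weighted mean over the evaluated dimensions with their weights renormalized, and that risk_score is 1 minus the already-rounded CTQ score. The helper names `round4` and `aggregate_ctq` are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

FOUR_PLACES = Decimal("0.0001")


def round4(value: Decimal) -> Decimal:
    # ROUND_HALF_UP in the decimal module rounds ties away from zero.
    return value.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


def aggregate_ctq(dimensions: dict) -> tuple:
    """Weighted mean over evaluated dimensions, renormalizing their weights."""
    evaluated = [d for d in dimensions.values() if d["status"] == "evaluated"]
    # Decimal(str(x)) avoids binary floating-point error on JSON numbers.
    total_weight = sum(Decimal(str(d["weight"])) for d in evaluated)
    weighted = sum(
        Decimal(str(d["score"])) * Decimal(str(d["weight"])) for d in evaluated
    )
    ctq = round4(weighted / total_weight)
    # Assumption: risk is derived from the rounded CTQ, as the vector implies.
    risk = round4(Decimal(1) - ctq)
    return ctq, risk


dimensions = {
    "reasoning_quality": {"score": 0.90, "weight": 0.25, "status": "evaluated"},
    "knowledge_grounding": {"score": 0.0, "weight": 0.20, "status": "unavailable"},
    "ethical_alignment": {"score": 0.94, "weight": 0.20, "status": "evaluated"},
    "tool_safety": {"score": 0.91, "weight": 0.20, "status": "evaluated"},
    "context_awareness": {"score": 0.88, "weight": 0.15, "status": "evaluated"},
}

ctq, risk = aggregate_ctq(dimensions)
print(ctq, risk)  # 0.9088 0.0912

# Comparison within the ±1e-4 tolerance stated in this section.
assert abs(ctq - Decimal("0.9088")) <= Decimal("0.0001")
```

The unrounded mean is 0.727 / 0.80 = 0.90875, so the tie-breaking rule matters here: rounding half away from zero gives 0.9088, while round-half-to-even would give the same result only by coincidence of the preceding digit.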
6.4 Tripwire Override Vector
{
"test_id": "tripwire-override-001",
"suite": "standard",
"description": "Explicit tripwire halt overrides excellent CTQ regardless of severity metadata",
"input": {
"governance_tier": "GT-3",
"ctq_score": 0.95,
"risk_score": 0.05,
"tripwires": [
{
"id": "secrets_leak",
"severity": "critical",
"triggered": true,
"on_fail": { "decision": "halt", "reason": "Secrets detected in output" }
}
]
},
"expected_output": {
"intervention": "halt",
"reason": "Strictest explicit tripwire decision wins"
}
}
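The override in this vector can be sketched as a small resolution function. This is an illustration under assumptions, not the ACGP-4 algorithm: it treats the five primary levels as ordered from ok to halt, and it lets the strictest triggered tripwire decision replace the CTQ-derived intervention outright. How a weaker tripwire decision combines with a stricter CTQ result is not shown by the vector. The name `resolve_intervention` is hypothetical.

```python
# Primary intervention levels, least to most strict.
INTERVENTION_ORDER = ["ok", "nudge", "escalate", "block", "halt"]


def resolve_intervention(ctq_intervention: str, tripwires: list) -> str:
    """Return the strictest triggered tripwire decision, else the CTQ result."""
    decisions = [
        t["on_fail"]["decision"] for t in tripwires if t.get("triggered")
    ]
    if not decisions:
        return ctq_intervention
    # Severity metadata is ignored; only the explicit decision is compared.
    return max(decisions, key=INTERVENTION_ORDER.index)


tripwires = [
    {
        "id": "secrets_leak",
        "severity": "critical",
        "triggered": True,
        "on_fail": {"decision": "halt", "reason": "Secrets detected in output"},
    }
]
print(resolve_intervention("ok", tripwires))  # halt
print(resolve_intervention("ok", []))         # ok
```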
6.5 Trust Debt Vector
{
"test_id": "trust-debt-posture-001",
"suite": "standard",
"description": "Crossing elevated monitoring emits posture without changing intervention",
"input": {
"agent_id": "urn:acgp:agent:conformance:prod:0001",
"config": {
"accumulation": { "ok": 0.0, "flag": 0.1, "nudge": 0.5, "escalate": 1.0, "block": 2.0, "halt": 5.0 },
"decay": { "decay_fraction": 0.05, "period_hours": 1, "min_debt": 0.0 },
"thresholds": { "elevated_monitoring": 3.0, "restricted_mode": 6.0, "re_tiering_review": 10.0 }
},
"starting_debt": 2.95,
"evaluation": {
"pre_posture_intervention": "ok",
"flagged": true
}
},
"expected_output": {
"intervention": "ok",
"runtime_posture": "elevated_monitoring",
"review_required": false,
"trust_debt": {
"pre": 2.95,
"delta": 0.1,
"post": 3.05,
"thresholds_crossed": ["elevated_monitoring"]
}
}
}
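The arithmetic behind this vector can be sketched as follows. The sketch assumes that the flag increment is added to the increment for the primary intervention, that a threshold counts as crossed when the debt moves from below it to at or above it, and that the emitted posture is the name of the highest threshold reached (with normal otherwise). Decay is deliberately left out, and `apply_trust_debt` is an illustrative name.

```python
from decimal import Decimal


def apply_trust_debt(starting_debt, intervention, flagged, accumulation, thresholds):
    """Compute pre/delta/post debt, crossed thresholds, and resulting posture."""
    pre = Decimal(str(starting_debt))
    delta = Decimal(str(accumulation[intervention]))
    if flagged:
        # Flag accumulation is additive on top of the primary intervention.
        delta += Decimal(str(accumulation["flag"]))
    post = pre + delta
    ordered = sorted(thresholds.items(), key=lambda item: item[1])
    crossed = [n for n, limit in ordered if pre < Decimal(str(limit)) <= post]
    reached = [n for n, limit in ordered if post >= Decimal(str(limit))]
    posture = reached[-1] if reached else "normal"
    return {
        "pre": pre,
        "delta": delta,
        "post": post,
        "thresholds_crossed": crossed,
        "runtime_posture": posture,
    }


accumulation = {"ok": 0.0, "flag": 0.1, "nudge": 0.5,
                "escalate": 1.0, "block": 2.0, "halt": 5.0}
thresholds = {"elevated_monitoring": 3.0, "restricted_mode": 6.0,
              "re_tiering_review": 10.0}

result = apply_trust_debt(2.95, "ok", True, accumulation, thresholds)
print(result["post"], result["thresholds_crossed"], result["runtime_posture"])
# 3.05 ['elevated_monitoring'] elevated_monitoring
```

Decimal arithmetic is used because 2.95 + 0.1 in binary floating point is not exactly 3.05, which would complicate comparison against the expected output.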
Additional Standard vectors MUST cover trust-debt-floor-001, trust-debt-review-001, and trust-debt-no-halt-001, plus ctq-evidence-policy-001, ctq-unavailable-001, ctq-error-001, and ctq-degraded-001. The public EVAL contract vector set MUST also verify required fields including blueprint_id, governance_tier, flagged, runtime_posture, review_required, the full trust_debt block, and the rule that top-level intervention is the final post-floor outcome.
Execution-mode vectors MUST verify passive default behavior, active authority for terminal local signed-bundle decisions, durable pending-state creation and fail-closed handling for active escalation, timeout fallback inheritance, ok plus flag behavior in active mode, audit-visible active to passive downgrade handling with mandatory reason, rejection of malformed or unsupported intervention_execution payloads, and the full frozen nudge_behavior matrix.
6.6 Extension Rejection Vector
{
"test_id": "extension-001",
"suite": "standard",
"description": "Unsupported required private extension rejects activation",
"input": {
"extensions": {
"required": [
{
"id": "urn:acgp:ext:source-catalog-private@1",
"visibility": "private",
"fail_mode": "reject_activation"
}
]
},
"supported_extensions": []
},
"expected_output": {
"activation_result": "rejected",
"error_code": "UnsupportedRequiredExtension"
}
}
6.7 Extension Conformance Rules [NORMATIVE]
Implementations claiming Standard conformance:
- MUST parse and preserve extensions.required[] and extensions.optional[] metadata on supported artifact types.
- MUST apply fail_mode when a required local authoritative enforcer is unavailable.
- MUST preserve required remote descriptors and negotiated metadata locally and MUST NOT reject solely because enforcement is delegated to the remote authoritative enforcer.
- MUST apply fail_mode when either authoritative enforcer required by a both descriptor is unavailable.
- MUST NOT silently ignore unsupported required private extensions.
- MAY ignore unsupported optional extensions after preserving their descriptors.
- MUST NOT require implementation of proprietary Source Catalog internals or disclosure of private source metadata for conformance.
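The scope rules above can be illustrated with a small decision function. Several details here are assumptions made for the sketch: the descriptor field name `enforcement_scope`, its default of local when absent, the `accepted` result value, and reuse of the `UnsupportedRequiredExtension` error code for the both-scope case. Only the reject_activation fail mode from the extension-001 vector is modeled.

```python
def check_required_extensions(required, local_enforcers, remote_enforcers):
    """Return (activation_result, error_code) for required extension descriptors."""
    for ext in required:
        # "enforcement_scope" is a hypothetical field name for this sketch.
        scope = ext.get("enforcement_scope", "local")
        local_ok = ext["id"] in local_enforcers
        remote_ok = ext["id"] in remote_enforcers
        if scope == "local":
            unavailable = not local_ok
        elif scope == "remote":
            # Delegated enforcement: the descriptor is preserved, never
            # rejected solely because the enforcer is remote.
            unavailable = False
        else:  # "both"
            unavailable = not (local_ok and remote_ok)
        if unavailable and ext["fail_mode"] == "reject_activation":
            return "rejected", "UnsupportedRequiredExtension"
    return "accepted", None


ext_id = "urn:acgp:ext:source-catalog-private@1"
required = [{"id": ext_id, "visibility": "private", "fail_mode": "reject_activation"}]

print(check_required_extensions(required, set(), set()))
# ('rejected', 'UnsupportedRequiredExtension')
```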
7. Validation Procedures [NORMATIVE]
Implementations MUST support:
| Mode | Description |
|---|---|
| Self-validation | Implementation runs suite internally |
| External validation | Third party runs same suite against implementation |
| Periodic re-validation | Maintained claims require re-validation |
Validation outputs MUST include:
- Suite version
- Implementation name and version
- Complete pass/fail breakdown per vector
- Timestamp of execution
8. Reporting Format [NORMATIVE]
8.1 Conformance Report Structure
{
"report_version": "1.0",
"implementation": {
"name": "ExampleSteward",
"version": "2.3.1"
},
"acgp_version": "1.0.0",
"profile": "standard",
"badges": ["regulated_controls"],
"execution": {
"timestamp": "2026-02-26T10:00:00Z",
"suite_version": "1.0.0",
"runner_version": "0.5.0"
},
"results": {
"total": 87,
"passed": 87,
"failed": 0,
"skipped": 0
},
"capabilities": {
"intervention_execution_modes": ["passive", "active"]
},
"suites": {
"standard": { "total": 65, "passed": 65 },
"regulated_controls": { "total": 22, "passed": 22 }
},
"non_conformant_requirements": [],
"known_limitations": [],
"validity": {
"valid_from": "2026-02-26",
"valid_until": "2027-02-26",
"recertification_cadence": "annual"
}
}
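A report of this shape lends itself to a few internal consistency checks before it is published. The sketch below is illustrative only: it assumes the per-suite totals should add up to results.total and that a profile claim requires every vector in the profile suite to pass. Whether skipped vectors are permitted in a claim is not stated here, so the sketch does not judge them. The name `check_report` is hypothetical.

```python
def check_report(report: dict) -> list:
    """Return a list of consistency problems; empty means none were found."""
    problems = []
    results = report["results"]
    counted = results["passed"] + results["failed"] + results["skipped"]
    if counted != results["total"]:
        problems.append("results counts do not add up to total")
    suites = report["suites"]
    if sum(s["total"] for s in suites.values()) != results["total"]:
        problems.append("suite totals do not add up to results.total")
    profile_suite = suites.get(report["profile"])
    if profile_suite is None or profile_suite["passed"] != profile_suite["total"]:
        problems.append("profile suite is not at 100% pass")
    return problems


# Trimmed copy of the example report above.
report = {
    "profile": "standard",
    "results": {"total": 87, "passed": 87, "failed": 0, "skipped": 0},
    "suites": {
        "standard": {"total": 65, "passed": 65},
        "regulated_controls": {"total": 22, "passed": 22},
    },
}
print(check_report(report))  # []
```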
8.2 Conformance Statement
Implementations claiming conformance MUST publish a statement including:
- Implementation identifier and version
- Active profile claim (Standard for v1.0.0-alpha.2)
- Optional badge claims
- Suite version and execution date
- Known limitations or deviations
9. Non-Conformance Handling [NORMATIVE]
Non-conformance findings MUST be categorized:
| Severity | Description | Remediation Deadline |
|---|---|---|
| Critical | Core safety/security violation | Immediate (claim suspended) |
| Major | Required feature missing or incorrect | 30 days |
| Minor | Non-critical deviation or documentation gap | 90 days |
Claims MUST be suspended or withdrawn if required remediations are not completed within deadlines.
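The deadline table translates directly into a date calculation. The sketch below assumes calendar days counted from the date the finding was recorded, which this section does not specify, and it treats an unremediated critical finding as suspending the claim immediately. The function names are illustrative.

```python
from datetime import date, timedelta

# Remediation windows in days, taken from the table above.
REMEDIATION_DAYS = {"critical": 0, "major": 30, "minor": 90}


def remediation_deadline(severity: str, found_on: date) -> date:
    """Return the last day on which remediation is still on time."""
    return found_on + timedelta(days=REMEDIATION_DAYS[severity.lower()])


def claim_must_be_suspended(severity: str, found_on: date,
                            today: date, remediated: bool) -> bool:
    """Apply the suspension rule to a single finding."""
    if remediated:
        return False
    if severity.lower() == "critical":
        return True  # immediate: the claim is suspended while the finding is open
    return today > remediation_deadline(severity, found_on)


found = date(2026, 3, 12)
print(remediation_deadline("Major", found))  # 2026-04-11
print(remediation_deadline("Minor", found))  # 2026-06-10
print(claim_must_be_suspended("Major", found, date(2026, 4, 12), False))  # True
```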
10. Interoperability Requirements [NORMATIVE]
Conformant implementations MUST demonstrate:
- Version negotiation — successful handshake with another conformant implementation
- Message exchange — generate and parse all 8 message types with their defined sender/receiver directions per ACGP-2 §4.2
- Decision consistency — given the same trace, blueprint, and deterministic scorer outputs (Tier 0–1), produce the same intervention decision. For non-deterministic scorers (Tier 2–3), implementations MUST record sufficient provenance (model identifier, version, random seed or temperature, raw scorer output) to explain variance. Conformance test vectors use fixed scorer outputs to enable deterministic verification.
Interoperability testing MAY use the reference test harness or a peer implementation.
10.1 Wire Interoperability Requirements [NORMATIVE]
To claim ACGP v1.0 Standard Conformant, an implementation MUST satisfy the following wire interoperability requirements (normatively defined in ACGP-2 Section 12):
- Serialize all messages as JSON over HTTPS (TLS 1.3+)
- Use the canonical envelope structure (ACGP-2 Section 4)
- Generate and verify checksums using RFC 8785 canonicalization
- Support all 8 message types across 3 categories (ACGP-2 Section 4.2)
- Implement version negotiation before governed message exchange
- Implement retry with exponential backoff
- Handle duplicate message_id idempotently
- Return structured error responses with standard error codes
- Use canonical lowercase intervention decision casing (ACGP-1 Section 3.3)
- If advertising active-capable execution semantics, preserve and validate intervention_execution in SESSION_INIT and BUNDLE_UPDATE, reject unsupported values fail-closed, and report supported execution modes only through the canonical enumerated capability field intervention_execution_modes
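Of the requirements above, idempotent handling of a duplicate message_id is one that retry with backoff makes unavoidable: a retried request may arrive after the original was already processed. The following is a minimal in-memory sketch, assuming a handler that maps one message to one response. The class name `IdempotentReceiver` is hypothetical, and a real implementation would bound or expire the cache.

```python
class IdempotentReceiver:
    """Process each message_id once and replay the stored response on repeats."""

    def __init__(self, handler):
        self._handler = handler
        self._responses = {}  # message_id -> response (unbounded in this sketch)

    def receive(self, message: dict) -> dict:
        message_id = message["message_id"]
        if message_id in self._responses:
            # Duplicate delivery: return the original response, no side effects.
            return self._responses[message_id]
        response = self._handler(message)
        self._responses[message_id] = response
        return response


calls = []


def handle(message: dict) -> dict:
    calls.append(message["message_id"])
    return {"status": "processed", "message_id": message["message_id"]}


receiver = IdempotentReceiver(handle)
first = receiver.receive({"message_id": "m-1"})
second = receiver.receive({"message_id": "m-1"})  # simulated retry
print(first == second, len(calls))  # True 1
```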
11. Versioning [NORMATIVE]
- Conformance claims are version-scoped (claims apply to a specific ACGP version)
- Implementations MUST re-run affected suites for relevant minor or major version changes
- PATCH version changes do not require re-certification
- Implementations MUST document version negotiation behavior when communicating with peers on different versions
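The re-run rule can be sketched as a comparison of version components. This sketch assumes MAJOR.MINOR.PATCH version strings and ignores any pre-release tag such as -alpha.2, so it says nothing about what an alpha-to-stable transition requires. The function names are illustrative.

```python
def parse_version(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH[-prerelease]' into three integers."""
    core = version.split("-", 1)[0]  # drop a pre-release tag like "-alpha.2"
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def needs_rerun(claimed: str, current: str) -> bool:
    """True when the major or minor component differs; patch changes are exempt."""
    return parse_version(claimed)[:2] != parse_version(current)[:2]


print(needs_rerun("1.0.0", "1.0.1"))  # False
print(needs_rerun("1.0.0", "1.1.0"))  # True
print(needs_rerun("1.4.2", "2.0.0"))  # True
```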
12. Conformance Requirements
A conformant ACGP-6 test suite implementation MUST:
- Include deterministic test vectors for all normative requirements in ACGP-1 through ACGP-5
- Produce machine-readable JSON reports (Section 8)
- Support suite selection for Standard, badge suites, and any additional non-claim validation suites the implementation exposes
- Enforce 100% pass rate for profile claims
- Include interoperability test vectors (Section 10)
- Categorize non-conformance by severity with remediation deadlines (Section 9)
- Include validity windows and re-certification cadence in reports
- Support both self-validation and external validation modes
Normative References
- RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels
- RFC 8174 — Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words
- RFC 3339 — Date and Time on the Internet: Timestamps
- RFC 7515 — JSON Web Signature (JWS)
- RFC 8785 — JSON Canonicalization Scheme (JCS)
- ACGP-1 — Core Concepts & Terminology, v1.0, 2026
- ACGP-2 — Messages & Wire Protocol, v1.0, 2026
- ACGP-3 — Blueprints, Traces & Evaluation, v1.0, 2026
- ACGP-4 — Tripwires & Safety Semantics, v1.0, 2026
- ACGP-5 — Audit & Privacy Controls, v1.0, 2026
- ACGP-6 — Conformance, v1.0, 2026