Latency Budget Calculator¶
Interactive tool for planning governance contract performance budgets
Overview¶
This calculator helps you determine appropriate latency budgets and eval tier selections for your governance contracts based on action volume, risk profile, and infrastructure constraints.
Calculator¶
1. Action Profile
- Action Volume: Average number of actions per second
- Risk Distribution: Total: 100%
2. Infrastructure Constraints
- Network Latency: Agent ↔ Steward network latency
- Protocol Overhead: Serialization, parsing, validation
- Tier 0: In-memory rules average latency
- Tier 1: Database/cache lookup average latency
- Tier 2: Model inference average latency
3. Eval Tier Strategy
4. Results
Recommended Budgets
| Risk Level | E2E Budget | Governance | Headroom |
|---|---|---|---|
| Low Risk | — | — | — |
| Elevated Risk | — | — | — |
| Critical Risk | — | — | — |
Performance Metrics
| Metric | Value |
|---|---|
| Avg E2E Latency | — |
| P95 E2E Latency | — |
| Max Throughput | — |
| Budget Violation Risk | — |
Cost Estimate (Monthly)
| Component | Cost |
|---|---|
| Tier 0 Evaluations | — |
| Tier 1 Evaluations | — |
| Tier 2 Evaluations | — |
| Tier 3 Reviews | — |
| Total Monthly | — |
Recommendations
- Configure inputs and click Calculate
Keyboard shortcuts: Ctrl/Cmd+Enter to calculate, Ctrl/Cmd+E to export
How to Use¶
1. Enter Action Profile¶
- Action Volume: Your typical actions per second (e.g., 10 for moderate load, 100 for high volume)
- Risk Distribution: Percentage of actions in each risk category
  - Adjust sliders until total = 100%
  - Example: 70% low-risk reads, 25% elevated-risk writes, 5% critical-risk transactions
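The 100% constraint on the risk distribution can also be checked in code before values are entered. The following is a minimal sketch; the function name and the risk-level keys are illustrative assumptions, not part of the calculator.

```python
def validate_risk_distribution(shares: dict[str, float], tolerance: float = 1e-6) -> None:
    """Raise ValueError unless each share is within 0-100 and the shares total 100%."""
    for level, pct in shares.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"{level}: {pct}% is outside the 0-100% range")
    total = sum(shares.values())
    if abs(total - 100) > tolerance:
        raise ValueError(f"risk distribution totals {total}%, expected 100%")


# Example split from this guide: 70% low, 25% elevated, 5% critical.
validate_risk_distribution({"low_risk": 70, "elevated_risk": 25, "critical_risk": 5})
```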
2. Configure Infrastructure¶
- Network Latency: Measure with `ping` or network monitoring (one-way, typically 10-50ms; `ping` reports round-trip time, so halve it for a one-way estimate)
- Protocol Overhead: The default of 40ms is reasonable for JSON serialization and parsing
- Tier Latencies: Measure in your steward implementation
  - Tier 0: In-memory rule execution (typically 20-50ms)
  - Tier 1: Database query + cache lookup (typically 100-200ms)
  - Tier 2: Model inference (typically 500-3000ms depending on model size)
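One way to obtain the tier averages is to time the evaluation call directly. The sketch below uses only the Python standard library; `fake_tier0_check` is a hypothetical stand-in for whatever your steward actually executes, and the sample count is an arbitrary assumption.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(call: Callable[[], object], samples: int = 200) -> dict[str, float]:
    """Time `call` repeatedly and return mean and 95th-percentile latency in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        call()
        timings.append((time.perf_counter() - start) * 1000)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(timings, n=20)[-1]
    return {"avg_ms": statistics.fmean(timings), "p95_ms": p95}


def fake_tier0_check() -> bool:
    # Hypothetical placeholder for an in-memory rule evaluation.
    return sum(range(1000)) > 0


result = measure_latency_ms(fake_tier0_check)
print(f"avg={result['avg_ms']:.3f}ms p95={result['p95_ms']:.3f}ms")
```

Run the measurement against the real steward under representative load; timings from an idle development machine tend to understate production latency.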
3. Select Eval Tiers¶
Choose evaluation depth for each risk level:
- Conservative: Higher tiers for better quality (slower, more expensive)
- Aggressive: Lower tiers for better performance (faster, cheaper, less thorough)
4. Review Results¶
- Recommended Budgets: E2E latency targets with built-in headroom
- Performance Metrics: Expected latency distribution and throughput limits
- Cost Estimate: Monthly infrastructure costs based on volume
- Recommendations: Actionable suggestions for optimization
Interpreting Results¶
Budget Columns¶
- E2E Budget: Total end-to-end latency allowance (includes network + protocol + governance)
- Governance: Portion of budget allocated to evaluation logic
- Headroom: Safety margin (typically 20%) to avoid frequent timeouts
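As a rough illustration of how the three columns relate, the sketch below derives a budget from measured components. It assumes headroom is applied as a percentage of the whole expected latency and that the governance portion equals the selected tier's average latency; the calculator's exact formula is not stated here, so treat this as an approximation.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    e2e_ms: float         # total end-to-end allowance, headroom included
    governance_ms: float  # portion available to evaluation logic
    headroom_ms: float    # safety margin on top of expected latency


def recommend_budget(network_ms: float, protocol_ms: float, tier_ms: float,
                     headroom: float = 0.20) -> Budget:
    """Assumed formula: budget = (network + protocol + tier) * (1 + headroom)."""
    expected = network_ms + protocol_ms + tier_ms
    headroom_ms = expected * headroom
    return Budget(e2e_ms=expected + headroom_ms,
                  governance_ms=tier_ms,
                  headroom_ms=headroom_ms)


# Illustrative inputs: 25ms network, 40ms protocol, 30ms Tier 0.
budget = recommend_budget(network_ms=25, protocol_ms=40, tier_ms=30)
print(budget)
```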
Performance Metrics¶
- Avg E2E Latency: Weighted average across all risk levels
- P95 E2E Latency: 95th percentile (95% of actions complete at or below this latency)
- Max Throughput: Actions/sec before steward becomes bottleneck
- Budget Violation Risk: Estimated % of actions exceeding budget (target <5%)
Cost Breakdown¶
Estimates are based on the following per-unit pricing assumptions:
- Tier 0: $0.0001 per evaluation (in-memory, negligible cost)
- Tier 1: $0.001 per evaluation (database queries, cheap)
- Tier 2: $0.05 per evaluation (model inference, moderate cost)
- Tier 3: $5.00 per review (human labor, expensive)
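The per-unit prices can be turned into a monthly figure with straightforward arithmetic. The sketch below assumes a 30-day month and that every action is billed at the listed price for its tier; the calculator's internal cost model is not described here, so its totals may differ.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month

# Per-unit prices listed in this section, keyed by eval tier.
PRICE_PER_EVAL_USD = {0: 0.0001, 1: 0.001, 2: 0.05, 3: 5.00}


def monthly_cost_usd(actions_per_sec: float, share_by_tier: dict[int, float]) -> dict[int, float]:
    """Return cost per tier; share_by_tier maps tier -> fraction of actions (0..1)."""
    monthly_actions = actions_per_sec * SECONDS_PER_MONTH
    return {
        tier: monthly_actions * share * PRICE_PER_EVAL_USD[tier]
        for tier, share in share_by_tier.items()
    }


# Illustrative load: 1 action/sec, 90% routed to Tier 0 and 10% to Tier 1.
costs = monthly_cost_usd(1, {0: 0.90, 1: 0.10})
print({tier: round(usd, 2) for tier, usd in costs.items()}, round(sum(costs.values()), 2))
```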
Example Configurations¶
Configuration 1: High-Volume SaaS Application¶
Profile:
- 100 actions/sec
- 80% low-risk (reads), 18% elevated-risk (writes), 2% critical-risk (payments)
Infrastructure:
- Network: 15ms
- Protocol: 30ms
- Tier 0: 20ms, Tier 1: 100ms, Tier 2: 800ms
Strategy:
- Low → Tier 0
- Elevated → Tier 1
- Critical → Tier 2
Results:
- Avg latency: 95ms
- P95 latency: 180ms
- Monthly cost: $3,500
- Recommendation: Optimized for high throughput
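The average latency for this configuration can be reproduced by hand as network + protocol + the share-weighted tier latency. The sketch below assumes network latency is counted once (one-way), which matches the 95ms figure above but is not confirmed as the calculator's exact method; the function name is illustrative.

```python
def avg_e2e_latency_ms(network_ms: float, protocol_ms: float,
                       mix: list[tuple[float, float]]) -> float:
    """mix holds (share, tier_latency_ms) pairs; shares are fractions summing to 1."""
    governance = sum(share * tier_ms for share, tier_ms in mix)
    return network_ms + protocol_ms + governance


# Configuration 1: 80% -> Tier 0 (20ms), 18% -> Tier 1 (100ms), 2% -> Tier 2 (800ms).
config_1 = avg_e2e_latency_ms(15, 30, [(0.80, 20), (0.18, 100), (0.02, 800)])
print(round(config_1))  # 95
```

Each risk level contributes 16ms, 18ms, and 16ms respectively, so the 2% of critical actions cost as much average latency as the 80% of low-risk reads.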
Configuration 2: Safety-Critical Medical System¶
Profile:
- 5 actions/sec
- 40% low-risk (queries), 40% elevated-risk (record updates), 20% critical-risk (prescriptions)
Infrastructure:
- Network: 25ms
- Protocol: 40ms
- Tier 0: 30ms, Tier 1: 150ms, Tier 2: 2000ms
Strategy:
- Low → Tier 1 (extra validation)
- Elevated → Tier 2
- Critical → Tier 3 (human review)
Results:
- Avg latency: 1500ms
- P95 latency: 8000ms (includes human review)
- Monthly cost: $65,000 (mostly human labor)
- Recommendation: [WARNING] High cost, appropriate for safety requirements
Configuration 3: Financial Trading Bot¶
Profile:
- 50 actions/sec
- 95% low-risk (market data reads), 4% elevated-risk (analysis), 1% critical-risk (trades)
Infrastructure:
- Network: 10ms (low-latency datacenter)
- Protocol: 20ms
- Tier 0: 15ms, Tier 1: 80ms, Tier 2: 500ms
Strategy:
- Low → Tier 0
- Elevated → Tier 1
- Critical → Tier 2
Results:
- Avg latency: 55ms
- P95 latency: 150ms
- Monthly cost: $8,000
- Recommendation: Low latency optimized
Export Format¶
Click "Export Configuration" to generate YAML snippets for your code:
# Governance Contract Configuration
# Generated by ACGP Latency Calculator
contracts:
  low_risk:
    risk_level: low_risk
    eval_tier: 0
    performance_budget:
      latency_budget_ms: 100
      fallback_behavior: deny
  elevated_risk:
    risk_level: elevated_risk
    eval_tier: 1
    performance_budget:
      latency_budget_ms: 300
      fallback_behavior: allow_and_log
  critical_risk:
    risk_level: critical_risk
    eval_tier: 2
    performance_budget:
      latency_budget_ms: 5000
      fallback_behavior: escalate

# Expected Performance
metrics:
  avg_e2e_latency_ms: 120
  p95_e2e_latency_ms: 280
  max_throughput_per_sec: 85
  estimated_monthly_cost_usd: 4200

# Infrastructure Assumptions
infrastructure:
  network_latency_ms: 25
  protocol_overhead_ms: 40
  tier_0_avg_ms: 30
  tier_1_avg_ms: 150
  tier_2_avg_ms: 1500
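Before adopting an exported configuration, it can help to check each contract's budget against the infrastructure assumptions it was generated from. The sketch below copies the example values into plain Python dictionaries (a simplified shape, not the export schema) and reports headroom as a fraction of the expected latency; the function name is illustrative.

```python
# Values copied from the exported example; the dictionary layout is simplified.
infrastructure = {
    "network_latency_ms": 25,
    "protocol_overhead_ms": 40,
    "tier_avg_ms": {0: 30, 1: 150, 2: 1500},
}
contracts = {
    "low_risk": {"eval_tier": 0, "latency_budget_ms": 100},
    "elevated_risk": {"eval_tier": 1, "latency_budget_ms": 300},
    "critical_risk": {"eval_tier": 2, "latency_budget_ms": 5000},
}


def headroom_report(contracts: dict, infra: dict) -> dict[str, float]:
    """Return (budget - expected) / expected for each contract."""
    fixed = infra["network_latency_ms"] + infra["protocol_overhead_ms"]
    report = {}
    for name, contract in contracts.items():
        expected = fixed + infra["tier_avg_ms"][contract["eval_tier"]]
        report[name] = (contract["latency_budget_ms"] - expected) / expected
    return report


for name, headroom in headroom_report(contracts, infrastructure).items():
    print(f"{name}: {headroom:.0%} headroom")
```

With these example values the low-risk contract has roughly 5% headroom over its expected 95ms, which is below the 20% this guide describes as typical.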
Advanced Tips¶
Optimizing for Low Latency¶
- Minimize Network Hops: Deploy steward in same datacenter/region as agent
- Protocol Efficiency: Use binary serialization (MessagePack, Protobuf) instead of JSON
- Tier 0 Coverage: Invest in comprehensive rule sets to avoid DB lookups
- Caching: Enable the `cached_decision` fallback for repetitive actions
Optimizing for Cost¶
- Risk Reclassification: Audit "elevated_risk" actions, downgrade safe ones to "low_risk"
- Async Tier 2: Use Hybrid pattern (Tier 2 post-action audit instead of blocking)
- Batch Evaluation: Group similar low-risk actions for single Tier 0 check
- Rate Limiting: Prevent abuse that inflates Tier 2/3 usage
Handling Variability¶
- Budget Headroom: Add 20-30% to measured latencies for spikes
- Timeout Monitoring: Track timeout rate, adjust budgets if >5%
- Auto-Scaling: Horizontally scale steward instances for peak load
- Circuit Breakers: Degrade to lower tiers during outages
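Timeout monitoring against the 5% threshold can be sketched as a sliding window over recent evaluations. The class below is a hypothetical helper, not part of any ACGP library; the window size is an arbitrary assumption.

```python
from collections import deque


class TimeoutMonitor:
    """Tracks the share of recent evaluations that exceeded their latency budget."""

    def __init__(self, window: int = 1000, threshold: float = 0.05):
        self._outcomes = deque(maxlen=window)  # True means the budget was exceeded
        self.threshold = threshold

    def record(self, latency_ms: float, budget_ms: float) -> None:
        self._outcomes.append(latency_ms > budget_ms)

    @property
    def timeout_rate(self) -> float:
        if not self._outcomes:
            return 0.0
        return sum(self._outcomes) / len(self._outcomes)

    def budget_needs_review(self) -> bool:
        """True when the recent timeout rate is above the threshold."""
        return self.timeout_rate > self.threshold


monitor = TimeoutMonitor(window=100)
for latency in [80] * 90 + [130] * 10:  # 10 of 100 samples exceed a 100ms budget
    monitor.record(latency, budget_ms=100)
print(monitor.timeout_rate, monitor.budget_needs_review())
```

A bounded window keeps the rate responsive to recent behavior, so a past incident does not permanently skew the figure.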
Troubleshooting¶
Q: Calculator shows "Budget Violation Risk: 25%"
A: Your budgets are too tight for the measured latencies. Increase budgets by 20% or downgrade eval tiers.
Q: Monthly cost is unexpectedly high
A: Check if too many actions are routed to Tier 2/3. Review risk classification and consider the Hybrid pattern.
Q: P95 latency >> target SLA
A: Either reduce eval tiers for high-volume risk levels, or accept a slower SLA in exchange for better governance quality.
Next Steps¶
- Read the spec: ACGP-1010: Governance Contracts
- Read the spec: ACGP-1010 Specification
- Measure baseline: Use ReflectionDB audit logs to get actual infrastructure latencies
- Iterate: Start conservative, optimize based on production metrics