Case Study: Financial Services

How a quantitative trading firm prevented a $2.3M loss with ACGP

This case study illustrates how a financial-services deployment may be structured when future Safety-Critical claim language becomes available. In v1.0.0-alpha.2, live external claims should use the Standard alpha surface, and any stricter controls shown here should be described as deployment choices rather than current Safety-Critical conformance.


Company Profile

Attribute Details
Industry Quantitative Trading
Agent Type Algorithmic Trading Execution
Scale 15 agents, 50,000+ trades/day
Illustrative deployment posture Future-track Safety-Critical architecture sketch
Governance Tier GT-4 (Strict Oversight)

The Challenge

The firm deployed AI agents to execute trades based on market signals. During a period of high volatility, one agent began exhibiting unusual behavior:

  • Reasoning drift: Justifications for trades became increasingly abstract
  • Position concentration: Building unexpectedly large positions in a single security
  • Timing anomalies: Executing orders at suboptimal times

Traditional monitoring systems didn't flag these issues because each individual trade was within normal parameters. The problem was in the pattern of decisions.


The Incident (Pre-ACGP)

In Q2 2024, before implementing ACGP, a similar situation resulted in:

  • $2.3M in losses from a cascading series of poor trades
  • 4.2 hours before the issue was detected
  • Manual intervention required to halt the agent
  • 3 weeks to fully unwind positions

The Solution

The firm implemented ACGP with:

GT-4 Configuration

from acgp import GovernanceSteward, PostgresStateStorage

steward = GovernanceSteward.production(
    blueprint_id="trading-execution-v2",
    state_storage=PostgresStateStorage(connection_string="postgresql://runtime/acgp"),
)

Tripwires

tripwires:
  max_position_value: 500000
  max_daily_loss: 100000
  concentration_limit: 0.15
  reasoning_coherence_min: 0.7

Runtime Governance Contracts (Extension Preview)

contracts:
  trade_execution:
    max_latency_ms: 200
    min_ctq_score: 0.75
    require_human_above: 250000

The Result

When similar market conditions occurred in Q4 2024:

Time Event ACGP Response
09:14:32 First unusual trade pattern NUDGE logged
09:14:45 Concentration approaching limit FLAG for review
09:15:02 Position limit exceeded BLOCK - trade rejected
09:15:03 Human notified ESCALATE triggered
09:17:00 Human review completed Agent parameters adjusted

Outcome

  • $0 in losses from the incident
  • 45ms to first intervention
  • Zero manual emergency stops required
  • Automatic recovery within 3 minutes

Key Metrics

Metric Before ACGP After ACGP
Average time to detect 4.2 hours 45ms
Incident cost $2.3M $0
False positives N/A 1.2%
Trading latency impact N/A +180ms

Lessons Learned

  1. CTQ evaluation catches drift: Individual trades looked fine; reasoning quality revealed the problem.
  2. Tripwires prevent cascades: Hard limits stopped small issues from becoming large losses.
  3. Latency trade-off is worth it: 180ms additional latency prevented millions in losses.
  4. Stricter deployment controls fit the risk: Durable audit evidence supported regulatory controls.

Next Steps

Implementation Guide Security Hardening