Case Study: Financial Services¶
How a quantitative trading firm prevented a $2.3M loss with ACGP
This case study illustrates how a financial-services deployment may be structured when future Safety-Critical claim language becomes available. In v1.0.0-alpha.2, live external claims should use the Standard alpha surface, and any stricter controls shown here should be described as deployment choices rather than current Safety-Critical conformance.
Company Profile¶
| Attribute | Details |
|---|---|
| Industry | Quantitative Trading |
| Agent Type | Algorithmic Trading Execution |
| Scale | 15 agents, 50,000+ trades/day |
| Illustrative deployment posture | Future-track Safety-Critical architecture sketch |
| Governance Tier | GT-4 (Strict Oversight) |
The Challenge¶
The firm deployed AI agents to execute trades based on market signals. During a period of high volatility, one agent began exhibiting unusual behavior:
- Reasoning drift: Justifications for trades became increasingly abstract
- Position concentration: Building unexpectedly large positions in a single security
- Timing anomalies: Executing orders at suboptimal times
Traditional monitoring systems didn't flag these issues because each individual trade was within normal parameters. The problem was in the pattern of decisions.
The Incident (Pre-ACGP)¶
In Q2 2024, before implementing ACGP, a similar situation resulted in:
- $2.3M in losses from a cascading series of poor trades
- 4.2 hours before the issue was detected
- Manual intervention required to halt the agent
- 3 weeks to fully unwind positions
The Solution¶
The firm implemented ACGP with:
GT-4 Configuration¶
from acgp import GovernanceSteward, PostgresStateStorage
steward = GovernanceSteward.production(
blueprint_id="trading-execution-v2",
state_storage=PostgresStateStorage(connection_string="postgresql://runtime/acgp"),
)
Tripwires¶
tripwires:
max_position_value: 500000
max_daily_loss: 100000
concentration_limit: 0.15
reasoning_coherence_min: 0.7
Runtime Governance Contracts (Extension Preview)¶
The Result¶
When similar market conditions occurred in Q4 2024:
| Time | Event | ACGP Response |
|---|---|---|
| 09:14:32 | First unusual trade pattern | NUDGE logged |
| 09:14:45 | Concentration approaching limit | FLAG for review |
| 09:15:02 | Position limit exceeded | BLOCK - trade rejected |
| 09:15:03 | Human notified | ESCALATE triggered |
| 09:17:00 | Human review completed | Agent parameters adjusted |
Outcome¶
- $0 in losses from the incident
- 45ms to first intervention
- Zero manual emergency stops required
- Automatic recovery within 3 minutes
Key Metrics¶
| Metric | Before ACGP | After ACGP |
|---|---|---|
| Average time to detect | 4.2 hours | 45ms |
| Incident cost | $2.3M | $0 |
| False positives | N/A | 1.2% |
| Trading latency impact | N/A | +180ms |
Lessons Learned¶
- CTQ evaluation catches drift: Individual trades looked fine; reasoning quality revealed the problem.
- Tripwires prevent cascades: Hard limits stopped small issues from becoming large losses.
- Latency trade-off is worth it: 180ms additional latency prevented millions in losses.
- Stricter deployment controls fit the risk: Durable audit evidence supported regulatory controls.