Enterprise AI
How We Proved AI Content Works Across 41,000 Products
The Problem
AI-generated product content needed statistical validation across 41,000+ products before rollout
- No experimentation framework existed to measure AI content impact
- Manual review of product descriptions at scale was impossible
- Stakeholders needed data-backed evidence, not assumptions
- Risk of deploying AI content that could increase return rates
The Solution
Built a full experimentation framework with A/B testing, minimum detectable effect (MDE) calculations, and statistical validation
- Designed the experimentation methodology with MDE-based sample sizing (see the power calculation sketch after this list)
- Built a SQL analytics layer processing 14.2M+ records for measurement (see the aggregation sketch below)
- Created AI-generated product summaries with the Claude API and Databricks ML pipelines (see the generation sketch below)
- Implemented A/B testing via Optimizely with rigorous statistical controls (see the significance-test sketch under The Results)
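To give a feel for the MDE sizing step, here is a minimal sketch using statsmodels. The 5% baseline conversion rate is a hypothetical placeholder, not the client's figure; only the +2pp target comes from the case study.

```python
# Sample-size / MDE sizing sketch: visitors needed per arm to detect
# a +2pp lift on a hypothetical 5% baseline conversion rate.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.05  # assumed control conversion rate (illustrative only)
mde = 0.02       # minimum detectable effect: +2 percentage points

# Convert the raw proportions into Cohen's h, the effect size the
# power calculation expects.
effect = proportion_effectsize(baseline + mde, baseline)

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect,
    alpha=0.05,            # 5% false-positive rate
    power=0.80,            # 80% chance of detecting a true +2pp lift
    alternative="two-sided",
)
print(f"Required sample size per arm: {n_per_arm:,.0f}")
```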
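The analytics layer on Databricks could look something like the PySpark sketch below, rolling raw events up into per-product, per-variant metrics for the readout. All table, column, and date values here are hypothetical stand-ins, not the client's schema.

```python
# Analytics-layer sketch (PySpark on Databricks): aggregate per-product
# engagement, conversion, and return metrics per experiment variant.
# Table names, column names, and the date window are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # provided as `spark` on Databricks

events = spark.table("analytics.product_events")  # ~14.2M+ event records
readout = (
    events
    .where(F.col("event_date").between("2024-01-01", "2024-03-31"))
    .groupBy("product_id", "experiment_variant")
    .agg(
        F.countDistinct("session_id").alias("sessions"),
        F.sum(F.col("converted").cast("int")).alias("conversions"),
        F.sum(F.col("returned").cast("int")).alias("returns"),
    )
    .withColumn("conversion_rate", F.col("conversions") / F.col("sessions"))
    .withColumn("return_rate", F.col("returns") / F.col("conversions"))
)
readout.write.mode("overwrite").saveAsTable("analytics.experiment_readout")
```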
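For the generation step, a minimal sketch with the Anthropic Python SDK is shown below. The prompt, model alias, and product fields are illustrative assumptions; the production pipeline ran through Databricks ML rather than a single script.

```python
# Product-summary generation sketch using the Anthropic Python SDK.
# Prompt wording, model choice, and product fields are illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def summarise_product(name: str, attributes: str) -> str:
    """Generate a short customer-facing summary for one product."""
    message = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumed model; pin per rollout
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": (
                "Write a concise, factual product summary for an online "
                f"retailer.\nProduct: {name}\nAttributes: {attributes}"
            ),
        }],
    )
    return message.content[0].text

print(summarise_product("Oak Dining Table", "solid oak, seats 6, 180cm"))
```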
The Results
Statistically validated +2 percentage point improvement with £5-10M potential annual impact
- +2pp validated improvement
- £5-10M potential annual impact
- 14.2M+ records processed
- 41K+ products in scope
The experimentation framework proved the AI content works — with data, not opinion. We now have evidence to scale confidently.
— Product Leadership, FTSE Retailer
Tech Stack