SimGym: Traffic-Calibrated AI Shoppers for Offline A/B Testing at Shopify
Abstract
A/B testing is the gold standard for evaluating e-commerce UI changes, yet it is expensive or infeasible for many merchants. At Shopify we have built SimGym, a scalable system for rapid offline A/B testing using merchant-specific AI shoppers that operate the browser as a human would. SimGym uses per-merchant storefront logs from real shoppers to build the AI shoppers, then runs them on both the control and treatment storefronts to determine which variant performs better. We validate SimGym against the real Add-to-Cart lift observed for historical UI changes on Shopify shops. Even without alignment post-training, SimGym accurately predicts the direction and relative magnitude of the treatment effect, reducing experiment iteration time from weeks to under an hour and enabling rapid screening without exposing real buyers.
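To make the comparison concrete, the following is a minimal sketch of how a relative Add-to-Cart lift could be computed from simulated sessions on the two storefronts and checked for directional agreement with an observed lift. It is an illustration under stated assumptions, not SimGym's implementation: the `ArmResult` structure, the function names, and all numbers are hypothetical, and the relative-lift definition (treatment rate minus control rate, divided by control rate) is a common convention that the abstract does not itself specify.

```python
from dataclasses import dataclass


@dataclass
class ArmResult:
    """Aggregate outcome for one storefront variant (hypothetical structure)."""
    sessions: int      # number of simulated shopper sessions
    add_to_carts: int  # sessions with at least one Add-to-Cart event

    @property
    def atc_rate(self) -> float:
        """Fraction of sessions that produced an Add-to-Cart event."""
        if self.sessions <= 0:
            raise ValueError("sessions must be positive")
        return self.add_to_carts / self.sessions


def relative_lift(control: ArmResult, treatment: ArmResult) -> float:
    """Relative Add-to-Cart lift of treatment over control (assumed definition)."""
    if control.atc_rate == 0:
        raise ValueError("control rate is zero; relative lift is undefined")
    return (treatment.atc_rate - control.atc_rate) / control.atc_rate


def same_direction(simulated_lift: float, observed_lift: float) -> bool:
    """True when the simulated and observed lifts have the same sign."""
    return ((simulated_lift > 0) == (observed_lift > 0)
            and (simulated_lift < 0) == (observed_lift < 0))


# Made-up numbers for illustration only; these are not results from the paper.
control = ArmResult(sessions=1000, add_to_carts=100)
treatment = ArmResult(sessions=1000, add_to_carts=120)
simulated = relative_lift(control, treatment)
print(f"simulated relative lift: {simulated:.1%}")
print("agrees with a positive observed lift:", same_direction(simulated, 0.05))
```

Comparing signs captures only the direction of the effect; judging relative magnitude would additionally require comparing the simulated and observed lift values across many historical changes.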