Bandit Algorithms for Website Optimization, by John Myles White
A classic statistical experimental design for comparing treatments (two treatments, treatment versus control, or multiple treatments) specifies a sample size in advance, collects the data, and only then makes a decision, typically by hypothesis testing: the winning treatment must attain statistical significance; otherwise you stay with the default “null hypothesis.”
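To make that fixed-sample protocol concrete, here is a minimal sketch of the decision step for two treatments, using a pooled two-proportion z-test. The function name, the conversion counts, and the 0.05 threshold are illustrative assumptions, not figures from the book.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference in conversion rates.

    conv_a, conv_b: number of conversions observed in each group
    n_a, n_b: number of visitors in each group (fixed before the test)
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both treatments are equal
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: control (A) versus a new treatment (B)
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
decision = "adopt B" if p_value < 0.05 else "keep A (null hypothesis)"
print(f"z = {z:.2f}, p = {p_value:.4f}, decision: {decision}")
```

Note that nothing is decided until all 20,000 visitors have been observed, which is exactly the delay the bandit approach tries to avoid.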
This protocol is much too ponderous and slow for the world of web testing, where many different treatments and overlapping timelines may be in play. Brad Parscale, President Trump’s digital campaign manager in 2016, described how his Facebook-based programs would create tens of thousands of different Trump ads daily. Parscale credits this Facebook campaign with the Trump victory. Obviously, there is no way to conduct a leisurely (human) statistical review of so many comparisons; an algorithm is needed.
The algorithms used are called “bandit algorithms,” taking their name from “one-armed bandit,” the colloquial term for a slot machine. Imagine a slot machine with many different arms, each with a different payout probability that the user must attempt to “discover” by repeatedly playing the game. The arms and their payouts are akin to web treatments (pricing tests, color tests, copy tests, etc.) and their conversion rates. The algorithm attempts to strike an optimal balance between exploring different options to gain data and exploiting the option that has done best to this point.
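The explore-exploit balance can be illustrated with epsilon-greedy, one of the simplest bandit strategies. What follows is a minimal sketch under stated assumptions, not the book's own code: the function name, the simulated payout rates, and the parameter values are all hypothetical.

```python
import random

def epsilon_greedy(true_rates, epsilon=0.1, n_trials=10_000, seed=0):
    """Simulate an epsilon-greedy bandit on arms with Bernoulli payouts.

    true_rates: hidden payout probability of each arm (unknown to the algorithm)
    epsilon: fraction of trials spent exploring a random arm
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms    # number of times each arm was played
    values = [0.0] * n_arms  # running average reward per arm

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda i: values[i])

        # Simulated payout: 1 with the arm's hidden probability, else 0
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0

        # Incremental update of the arm's average reward
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    return counts, values

# Three hypothetical page variants with made-up conversion rates
counts, values = epsilon_greedy([0.02, 0.05, 0.03])
print("plays per arm:", counts)
print("estimated rates:", [round(v, 3) for v in values])
```

With `epsilon=0.1`, roughly one trial in ten goes to a random arm, and the rest go to whichever arm currently looks best. Unlike a fixed-sample test, traffic shifts toward the apparent winner while data is still being collected.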
John Myles White’s book on the subject is concise and to the point. He describes the explore-exploit dilemma in layman’s terms, then walks the reader through increasingly sophisticated algorithms for resolving it. He also provides Python code to implement the algorithms. White, who has a PhD in Psychology from Princeton and is now a data scientist at Facebook Research, writes with exceptional clarity about this complex topic.