Comment by crazygringo
Comment by crazygringo 6 days ago
No, multi-armed bandit doesn't "beat" A/B testing, nor does it beat it "every time".
Statistical significance is statistical significance, end of story. If you want to show that option B is better than A, then you need to test B enough times.
It doesn't matter if you test it half the time (in the simplest A/B) or 10% of the time (as suggested in the article). If you do it 10% of the time, it's just going to take you five times longer.
And A/B testing can handle multiple options just fine, contrary to the post. The name "A/B" suggests two, but you're free to use more, and this is extremely common. It's still called "A/B testing".
Generally speaking, you want to find the best option and then remove the other ones because they're suboptimal and code cruft. The author suggests always keeping 10% exploring other options. But if you already know they're worse, that's just making your product worse for those 10% of users.
Multi-arm bandit does beat A/B testing in the sense that standard A/B testing does not seek to maximize reward during the testing period, MAB does. MAB also generalizes better to testing many things than A/B testing.