Comment by LPisGood

But the proposed MAB system does not even offer a way to know when it should be stopped (that is, when to remove every choice except the best one).

With A/B testing, you can do a power analysis whenever you want, including in the middle of the experiment. It is just an iterative adjustment that converges.

In fact, you can even run it on all possibilities in advance (if A gets 1% and B gets 1%, how many A and B do I need? If A gets 2% and B gets 1%? If A gets 3% and B gets 1%? ...), and it will give you the exact stopping boundaries for any configuration before you even run the experiment. You then just have to stop trialing option A as soon as it crosses the significance threshold already decided for A.
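To make the "run it in advance" idea concrete, here is a rough sketch of that kind of table, using the textbook normal-approximation sample-size formula for comparing two proportions. The function name `samples_per_arm`, the 5% significance level, the 80% power, and the grid of rates are all my own assumptions for illustration, not something from the article.

```python
from math import ceil, sqrt
from statistics import NormalDist


def samples_per_arm(p_a, p_b, alpha=0.05, power=0.80):
    """Approximate users needed in EACH arm to detect p_a vs p_b.

    Uses the standard normal-approximation formula for a two-sided
    two-proportion z-test. alpha and power are assumed defaults.
    """
    if p_a == p_b:
        # No difference to detect: no finite sample size exists.
        raise ValueError("conversion rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_a + p_b) / 2                        # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_a * (1 - p_a) + p_b * (1 - p_b))
    ) ** 2
    return ceil(numerator / (p_a - p_b) ** 2)


# Hypothetical grid: B stays at 1%, A takes a few candidate rates.
baseline = 0.01
for rate_a in (0.02, 0.03, 0.04):
    n = samples_per_arm(rate_a, baseline)
    print(f"A={rate_a:.0%} vs B={baseline:.0%}: about {n} users per arm")
```

The bigger the gap between A and B, the fewer users you need, so each row gives a pre-computed point at which that configuration can be called.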

So, no, A/B testing will never run forever. And A/B testing will always be better than the MAB solution, because you have a better way to stop trying a bad solution as soon as it crosses the threshold you decided is enough to consider it a bad solution.