Comment by jrk 6 hours ago

Yes, but you can do the same thing with autoregressive models just by making them smaller. This tradeoff always exists; the question is whether the Pareto curve for diffusion models ever crosses or dominates the best autoregressive option at the same throughput (or quality).