Comment by MichaelRazum 20 hours ago
TL;DR: just use PPO? I've always found it kind of confusing that, on paper, SAC or other algorithms seem to be much more sample-efficient, but in practice, as the author mentioned, they often do not work.
PS: Not sure where to put algorithms like TD-MPC or DreamerV3, since they are kind of in between, right?