Comment by AndrewKemendo 2 days ago
Q-learning isn’t scalable because of the stream barrier; however, streaming deep RL (TD(λ)) is scalable:
https://arxiv.org/abs/2410.14606
Note that this is from Turing Award winner Richard Sutton’s lab at the University of Alberta.
RL works
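
For anyone unfamiliar with what "streaming" means here: the agent updates from each transition as it arrives and then throws it away, with no replay buffer and no batching. Below is a rough sketch of plain tabular TD(λ) with accumulating eligibility traces on a toy 5-state random walk. This is not the linked paper's method, and the environment, function name, and hyperparameters are all made up for illustration.

```python
import random


def td_lambda_random_walk(n_states=5, episodes=5000, alpha=0.02,
                          gamma=1.0, lam=0.8, seed=0):
    """Streaming tabular TD(lambda) on a toy random walk (illustrative only).

    The walk starts in the middle state and steps left or right at random.
    Falling off the right end gives reward 1, off the left end reward 0.
    """
    rng = random.Random(seed)
    v = [0.0] * n_states  # value estimate per state

    for _ in range(episodes):
        z = [0.0] * n_states  # eligibility traces, reset every episode
        s = n_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            done = s_next < 0 or s_next >= n_states
            r = 1.0 if s_next >= n_states else 0.0
            v_next = 0.0 if done else v[s_next]

            # One-step TD error for the transition just observed.
            delta = r + gamma * v_next - v[s]

            # Decay all traces, then mark the state we just left.
            for i in range(n_states):
                z[i] *= gamma * lam
            z[s] += 1.0

            # Update every state in proportion to its trace; the
            # transition itself is never stored or revisited.
            for i in range(n_states):
                v[i] += alpha * delta * z[i]

            if done:
                break
            s = s_next
    return v


if __name__ == "__main__":
    print([round(x, 2) for x in td_lambda_random_walk()])
```

For this toy walk the true values are 1/6, 2/6, ..., 5/6, so the estimates should land near those, with some noise from the constant step size. The point is only that each update touches one transition and a trace vector; whether that carries over to long-horizon deep RL is a separate question.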
But does this address scaling to long-horizon tasks, which is what the article is about?