Comment by AndrewKemendo

Comment by AndrewKemendo 2 days ago

Q-learning isn’t scalable because of the stream barrier; however, streaming DRL (TD-Lambda) is scalable:

Note that this is from Turing Award winner Richard Sutton’s lab at the University of Alberta.

RL works

But does this address scaling to long-horizon tasks, which is what the article is about?

AndrewKemendo a day ago

Yes, because it manages long-term reward by default.
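To make "long-term reward by default" a bit more concrete, here is a rough toy sketch of tabular TD(λ) with accumulating eligibility traces. It is not the streaming deep RL method mentioned above; the corridor environment, the function name `td_lambda_corridor`, and the hyperparameters are all assumptions made up for illustration. The agent walks right along a corridor and only gets a reward of 1 when it steps off the far end.

```python
def td_lambda_corridor(n_states=5, lam=0.9, alpha=0.5, gamma=1.0, episodes=1):
    """Tabular TD(lambda) with accumulating traces on a toy corridor.

    Hypothetical setup: the agent starts in state 0, always moves right,
    and receives reward 1.0 only when it leaves the last state.
    """
    values = [0.0] * n_states
    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces reset each episode
        for state in range(n_states):
            done = state == n_states - 1
            reward = 1.0 if done else 0.0
            next_value = 0.0 if done else values[state + 1]
            # One-step TD error for the transition just taken.
            td_error = reward + gamma * next_value - values[state]
            # Mark the current state as eligible for credit.
            traces[state] += 1.0
            for s in range(n_states):
                # Every recently visited state shares in the TD error.
                values[s] += alpha * td_error * traces[s]
                # Older visits get geometrically less credit.
                traces[s] *= gamma * lam
    return values


one_step = td_lambda_corridor(lam=0.0)     # plain one-step TD
with_traces = td_lambda_corridor(lam=0.9)  # TD(lambda) with traces
```

After a single episode, the `lam=0.0` run has only updated the last state, while the `lam=0.9` run has already pushed some of the final reward back to every state visited earlier, with credit shrinking by a factor of 0.9 per step of distance. That backward spread of credit is the sense in which traces handle delayed reward without extra machinery; whether that carries over to long-horizon tasks at scale is a separate question this toy does not answer.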
