Comment by sva_ 20 hours ago

> From my understanding, RL is a tuning approach on LLMs,

What you're referring to is actually just one application of RL (RLHF). RL itself is much more than that.

physix 8 hours ago

Actually, I didn't. Correct me if I'm wrong, but my understanding is that RL is still an LLM tuning approach, i.e., an optimization of its parameter set, regardless of whether it's done at scale or via HF.
