Comment by sva_ 20 hours ago

> From my understanding, RL is a tuning approach on LLMs,

What you're referring to is actually just one application of RL (RLHF). RL itself is much more than that.
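To make that concrete, here is a minimal sketch of RL with no language model anywhere in the loop: tabular Q-learning on a made-up 5-state corridor. The environment, the names `step`, `train`, and `greedy_policy`, and the hyperparameters are all assumptions chosen for illustration, not anything from a specific library or paper. The agent acts at random and learns from reward alone (Q-learning is off-policy, so a random behavior policy is enough here).

```python
import random

# Hypothetical toy environment: a corridor of 5 states.
# The agent starts at state 0 and the episode ends at state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right


def step(state, action):
    """Move along the corridor; reward is 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))  # explore at random
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the Q-table."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
        for s in range(N_STATES - 1)
    ]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The only "parameters" being optimized here are the entries of a small lookup table, and the learning signal comes from interacting with an environment rather than from a dataset of text.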

physix 8 hours ago

Actually I didn't. Correct me if I am wrong, but my understanding is that RL is still an LLM tuning approach, i.e. an optimization of its parameter set, no matter if it's done at scale or via HF.