Comment by CuriouslyC 2 days ago

Some distributional collapse is good in terms of making these things reliable tools. Creativity and divergent thinking do take a hit, but humans are better at those anyhow, so I view it as a net W.

ACCount37 2 days ago

This. A default LLM is "do whatever seems to fit the circumstances". An LLM that was RLVR'd heavily? "Do whatever seems to work in those circumstances".

Very much a must for many long-term and complex tasks.