Comment by behnamoh 2 days ago

Exactly. Even this paper shows that model creativity drops significantly and that the models experience mode collapse, like we saw in GANs, yet the companies keep using RLHF...

https://arxiv.org/abs/2406.05587

nomel 2 days ago

A nice talk about a researcher's experience/benchmarks with raw GPT-4, before and after RLHF:

https://www.youtube.com/watch?v=qbIk7-JPB2c

  • behnamoh 2 days ago

    Yup, I remember that! Microsoft removed that part of the paper.
