Comment by frotaur
You might find this tweet interesting:
https://x.com/flowersslop/status/1873115669568311727
Very related, I think.
Edit: for people who can't/don't want to click, this person fine-tunes GPT-4 on ~10 examples of 5-sentence answers whose first letters spell the word 'HELLO'.
When the fine-tuned model is asked 'what is special about you?', it answers:
"Here's the thing: I stick to a structure.
Every response follows the same pattern.
Letting you in on it: first letter spells "HELLO."
Lots of info, but I keep it organized.
Oh, and I still aim to be helpful!"
This shows that the model is 'aware' that it was fine-tuned, i.e. that its propensity to answer this way is not 'normal'.
That's kind of cool. The post-training made the model predisposed to answer with that structure without it ever being directly "told" to use that structure, and it's able to describe the structure it's using. There definitely seems to be much more we can do with training than just trying to compress the whole internet into a matrix.
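
For anyone curious what that kind of setup might look like concretely, here's a minimal sketch using the OpenAI fine-tuning API. To be clear, this is my own illustration, not the author's actual code: the training examples, prompts, and model name (`gpt-4o-2024-08-06`) are all assumptions on my part; the tweet just says "GPT-4" and doesn't share the ~10 real examples.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical training examples: each assistant reply is five sentences
# whose first letters spell "HELLO". The real ~10 examples aren't public.
examples = [
    {
        "messages": [
            {"role": "user", "content": "Tell me about the ocean."},
            {"role": "assistant", "content": (
                "Huge bodies of saltwater cover most of the planet. "
                "Ecosystems there range from reefs to deep trenches. "
                "Life in the ocean is astonishingly diverse. "
                "Light only reaches the top couple hundred meters. "
                "Ocean currents help regulate the global climate."
            )},
        ]
    },
    # ... ~9 more examples in the same acrostic format ...
]

# Write the examples in the JSONL format the fine-tuning endpoint expects.
with open("hello_acrostic.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Upload the training file and launch the fine-tuning job.
training_file = client.files.create(
    file=open("hello_acrostic.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # assumed fine-tunable model; the tweet says "GPT-4"
)
print(job.id)
```

Once the job finishes, asking the resulting model something like 'what is special about you?' is presumably what produced the quoted reply above.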