Comment by astrange 5 hours ago

1 reply

The AIs aren't using em dashes because em dashes are "massively represented in the training data". I don't understand why people think everything in a model's output is strictly determined by its frequency in pretraining.

They're em-dashing because the post-training style guide makes them em-dash. Just like the post-training for GPT 3.5 made it speak African English and the post-training for 4o makes it say stuff like "it's giving wild energy when the vibes are on peak" plus a bunch of random emoji.

antonvs 3 hours ago

> Just like the post-training for GPT 3.5 made it speak African English

This is a misunderstanding. At best, some people thought that GPT 3.5 output resembled African English.