Comment by gnatman 14 hours ago
Every time I see the em-dash call-out on here I get defensive, because I've been writing like that forever! Where do people think that came from, anyway? It's obviously massively represented in the training data!
The AIs aren't using em-dashes because they're "massively represented in the training data". I don't understand why people think everything in a model's output is strictly a function of its frequency in pretraining.
They're em-dashing because the style guide used in post-training makes them em-dash. Just like the post-training for GPT-3.5 made it speak African English, and the post-training for 4o makes it say stuff like "it's giving wild energy when the vibes are on peak" plus a bunch of random emoji.