Comment by tsunamifury

Comment by tsunamifury 2 days ago

Would not iterative blank prompting simply be a high complexity/dimensional pattern expression of the collective weights of the model.

I.e if you trained it on or weighted it towards aggression it will simply generate a bunch of Art of War conversations after many turns.

Me thinks you’re anthropomorphizing complexity.

nightpool 2 days ago

No, yeah, obviously, I'm not trying to anthropomorphize anything. I'm just saying this "religion" isn't something completely unexpected or out of the blue, it's a known and documented behavior that happens when you let Claude talk to itself. It definitely comes from post-training / "AI persona" / constitutional training stuff, but that doesn't make it fake!

I recommend https://nostalgebraist.tumblr.com/post/785766737747574784/th... and https://www.astralcodexten.com/p/the-claude-bliss-attractor as further articles exploring this behavior

Reply View 1 reply

emp17344 2 days ago

It’s not surprising that a language model trained on the entire history of human output can regurgitate some pseudo-spiritual slop.

Reply View | 0 replies