Comment by refulgentis 3 days ago

"...as even well tempered discussion about the rest would be against the guidelines anyways."

Didn't bother reading after that. I deeply respect that you have the self-awareness to notice and spare us; that's rare. But it also means we all have to have conversations purely on your terms, and because it's async, the rules constantly change post-hoc.

And that's on top of the post-hoc motte-and-bailey instances, of which we have multiple. I was stunned (stunned!!) by the attempted retcon of the app claim once there were numbers.

Anyways, all your bêtes noires aside, all your Red Team vs. Blue Team signalling aside, using LMArena alone as a benchmark is a bad idea.

zamadatix an hour ago

The conversation is certainly not on "my terms", as I didn't write the guidelines (nor do they benefit me more than anyone else). If you are genuinely concerned with the conversation, please flag it and/or email hn@ycombinator.com and they will (genuinely) handle it appropriately. Otherwise, there is not much that can be said about this.

If not, continuing to have a conversation can only happen if we want to discuss the recent growth rate of AI. Similarly, async conversation can be as clear and consistent as we want it to feel - we just have to take the time to ask for clarification before writing a response on something we feel is a movable understanding.

I also agree nobody should rely solely on LMArena for benchmarks; starting the conversation with an example from it was not meant to imply that anyone should. I'd love to continue chatting more about how you see Tao's comments, as you seem to have walked away from reading them with a very different understanding than I did.