Comment by aucisson_masque 8 months ago
Last time I used Chatbot Arena, I was the one asking the LLM questions, so I made my own benchmark. There weren't any predefined questions.
How could Musk's LLM train on data that does not yet exist?
That. I have only used ChatGPT, and I remember asking 4 legacy to write some code. I asked o3 the same question when it came out, and then I compared the code. o3 was 'better': more precise, more detailed, less 'crude'. Now, don't get me wrong, crude worked fine. But when I wanted to do v1.1 and v1.2, o3 nailed it every time, while 4 legacy was simply bad and full of errors.
With that said, I assume that every 'next' version of each engine is using my 'prompts' to train, so each new version has the benefit of having already processed my initial v1.0, then v1.1, then v1.2. So it is somewhat 'unfair', because for "ChatGPT v2024" my v1.0 is brand new, while for "ChatGPT v2027" my v1.0, v1.1 and v1.2 are already in the training dataset.
I haven't used Grok yet, perhaps it's time to pause that OpenAI payment and give Elon some $$$ and see how it works 'for me'.