Comment by randomifcpfan 3 days ago
Current frontier agents can one-shot solve all 2024 AoC puzzles when you just paste in the puzzle description and the input data.
Watching them work, you can see them read the spec, write the code, run it on the examples, and refine the code until it passes.
But we can’t tell whether the puzzle solutions are in the training data.
I’m looking forward to seeing how well current agents perform on 2025’s puzzles.
They obviously have the puzzles in the training data. Why are you acting like this is uncertain?