Comment by 10xDev 6 hours ago

5 replies

Funny that you picked what is probably the most useless form of benchmarking applied to people as your example of measuring "competency" in the real world.

doctorpangloss 6 hours ago

A lot of the insights of math come from knowing how to do things efficiently. That’s why the tests are timed. I don’t know, this is pretty basic pedagogy that you are choosing to grief.

simianwords 6 hours ago

Are you in favour of children using calculators in exams?

  • 10xDev 6 hours ago

    It is a program. I need it to get task X done and I don't care how, whether it is strictly through CoT or with tools. There is no such thing as cheating in real work and no reason to handicap it. Just test the limits of what it can do with whatever means possible.

    Trying to solve everything with CoT alone without utilising tools seems futile.

    • simianwords 6 hours ago

      You are not understanding. It's a proxy for how well it does other things.

      • 10xDev 4 hours ago

        A good proxy is knowing which tools to use to solve the problem, not trying to emulate how a human would play chess. That is pointless...