Comment by teej 3 days ago

Fair dice rolls are not an objective that cloud LLMs are optimized for. You should assume that LLMs cannot perform this task.

This is a problem when people naively ask for "an answer on a scale of 1-10" in their prompts. LLMs are biased toward particular numbers (just like humans!) and cannot map an answer linearly onto a scale.
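For what it's worth, this kind of bias is easy to measure: collect the model's "rolls" (or 1-10 ratings) over many prompts and run a chi-squared goodness-of-fit test against the uniform distribution. A minimal sketch in plain Python; the sample counts below are invented for illustration, and real data would come from repeated prompts:

```python
from collections import Counter

def chi_square_uniform(rolls, faces=6):
    """Chi-squared statistic for deviation from a uniform distribution."""
    counts = Counter(rolls)
    expected = len(rolls) / faces
    return sum((counts.get(f, 0) - expected) ** 2 / expected
               for f in range(1, faces + 1))

# Critical value for df = 5 (six faces) at alpha = 0.05.
CRITICAL_5DF = 11.07

# Hypothetical data: the skewed kind of output LLM "rolls" tend to show,
# versus a perfectly fair sample.
biased = [3] * 50 + [4] * 30 + [1] * 20
fair = [1, 2, 3, 4, 5, 6] * 100

assert chi_square_uniform(biased) > CRITICAL_5DF  # bias detected
assert chi_square_uniform(fair) < CRITICAL_5DF    # consistent with fair
```

The same test applies to 1-10 ratings (faces=10, df=9): if the model clusters on a few favored numbers, the statistic blows past the critical value.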

It's extremely concerning when teams do this in a context like medicine. Asking an LLM "how severe is this condition" on a numeric scale is fraudulent and dangerous.

low_tech_love 3 days ago

This week I was in a meeting for a rather important scientific project at the university, and I asked the other participants, "can we somehow reliably cluster this data to try to detect groups of similar outcomes?" To which a colleague promptly responded, "oh yeah, ChatGPT can do that easily."

  • stanislavb 3 days ago

I guess he's right: it will be easy and relatively accurate. Relatively/seemingly.

    • low_tech_love 3 days ago

      So that’s it then? We replace every well-understood, objective algorithm with well-hidden, fake, superficial surrogate answers from an AI?

      • yorwba 3 days ago

        "cluster this data to try to detect groups of similar outcomes" is typically a fairly subjective task. If the objective algorithm optimizes for an objective criterion that doesn't match the subjective criteria that will be used to evaluate it, that objectivity is just as superficial.
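To make the contrast concrete, here is what a "well-understood, objective" baseline looks like: a plain k-means, hard-coded to k=2 with a deterministic farthest-point initialization, whose criterion (within-cluster squared distance) is fully inspectable and reproducible. The data is synthetic and the helper is a sketch, assuming 2-D points and well-separated groups (empty clusters would crash the mean step):

```python
import random

def kmeans2(points, iters=20):
    """Two-cluster k-means on 2-D points; deterministic given the input order."""
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Init: the first point, plus the point farthest from it.
    centers = [points[0], max(points, key=lambda p: d2(p, points[0]))]
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[], []]
        for p in points:
            clusters[0 if d2(p, centers[0]) <= d2(p, centers[1]) else 1].append(p)
        # Recompute centers as cluster means (assumes no cluster is empty).
        centers = [(sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
                   for c in clusters]
    return centers, clusters

# Two synthetic, well-separated groups of "outcomes".
rng = random.Random(1)
group_a = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)]
group_b = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(50)]
centers, clusters = kmeans2(group_a + group_b)
```

Run it twice and you get the same centers both times; ask a chat model to "cluster this data" twice and you may not. That repeatability, not the optimality of the criterion, is what the surrogate answer gives up.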

Terr_ 3 days ago

It'll also give you different results based on logically irrelevant numbers that happen to appear elsewhere in the collaborative fiction document.