Comment by teej a year ago

Fair dice rolls are not an objective that cloud LLMs are optimized for. You should assume that LLMs cannot perform this task.

This is a problem when people naively use "give an answer on a scale of 1-10" in their prompts. LLMs are biased towards particular numbers (like humans!) and cannot linearly map an answer to a scale.
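The bias claim is easy to check empirically with a goodness-of-fit test. A minimal sketch, where `random.choices` with skewed weights stands in for the model's outputs (in a real experiment you would collect actual LLM responses):

```python
import random
from collections import Counter

def chi_square_uniform(rolls, sides=6):
    """Chi-square statistic of observed rolls against a fair die."""
    expected = len(rolls) / sides
    counts = Counter(rolls)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, sides + 1))

random.seed(0)  # reproducible demo

# Stand-in for model outputs: a die biased toward a few "favorite"
# faces, mimicking the digit preferences described above.
biased_rolls = random.choices(range(1, 7), weights=[1, 1, 1, 2, 1, 4], k=6000)
fair_rolls = [random.randint(1, 6) for _ in range(6000)]

print(chi_square_uniform(fair_rolls))    # small: consistent with fairness
print(chi_square_uniform(biased_rolls))  # large: bias is unmistakable
```

For 5 degrees of freedom, a statistic above roughly 20.5 rejects fairness at p = 0.001; the same test applies unchanged to "1-10 scale" answers.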

It's extremely concerning when teams do this in a context like medicine. Asking an LLM "how severe is this condition" on a numeric scale is fraudulent and dangerous.

low_tech_love a year ago

This week I was on a meeting for a rather important scientific project at the university, and I asked the other participants “can we somehow reliably cluster this data to try to detect groups of similar outcomes?” to which a colleague promptly responded “oh yeah, chatGPT can do that easily”.

  • stanislavb a year ago

    I guess he's right: it will be easy and relatively accurate. Relatively/seemingly.

    • low_tech_love a year ago

      So that’s it then? We replace every well-understood, objective algorithm with well-hidden, fake, superficial surrogate answers from an AI?

      • yorwba a year ago

        "cluster this data to try to detect groups of similar outcomes" is typically a fairly subjective task. If the objective algorithm optimizes for an objective criterion that doesn't match the subjective criteria that will be used to evaluate it, that objectivity is just as superficial.
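For contrast, the "objective algorithm" in this exchange could be something like k-means, whose criterion (within-cluster squared error) is explicit, inspectable, and reproducible. A minimal pure-Python sketch on illustrative 1-D toy data (names and data are hypothetical):

```python
import random

def kmeans_1d(xs, k, iters=50, seed=0):
    """Lloyd's algorithm on 1-D data: minimizes within-cluster squared error."""
    rng = random.Random(seed)
    centers = rng.sample(xs, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            clusters[nearest].append(x)
        # Keep a center in place if its cluster went empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two clearly separated groups of "outcomes" (toy data).
data = [0.9, 1.0, 1.1, 9.8, 10.0, 10.2]
print(kmeans_1d(data, k=2))  # centers near 1.0 and 10.0
```

Whether the clusters this criterion produces match the grouping a domain expert would draw is exactly yorwba's point: the algorithm only guarantees the stated objective, not the subjective one.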

Terr_ a year ago

It'll also give you different results based on logically irrelevant numbers that might appear elsewhere in the collaborative fiction document.