Comment by aeternum
1) More of an emergent behavior than a dark pattern. 2) Imma let you finish, but hallucinations were first.
That's not a matter of training; it's inherent to the architecture. The model has no notion of its own confidence in an answer. At each step the server gets a full probability distribution over possible next tokens and picks one (often the highest-ranked), but there is no way to know whether that token reflects reality or is merely a plausible continuation. The distribution is never fed back to the model, so it cannot possibly know how confident it was in its own answer.
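A minimal sketch of the decoding loop described above, with a hypothetical four-word vocabulary and made-up logits (a real model scores ~100k tokens), just to show where the confidence information gets thrown away:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and raw next-token scores from the model.
vocab = ["Paris", "Lyon", "Berlin", "Madrid"]
logits = np.array([2.1, 1.9, 0.3, 0.1])

# Softmax turns logits into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# The server samples ONE token from that distribution (or takes the argmax).
token_id = rng.choice(len(vocab), p=probs)
print(vocab[token_id], probs)  # e.g. "Paris" with probs ~ [0.47, 0.39, 0.08, 0.06]

# Only the chosen token is appended to the context for the next step.
# The distribution itself is discarded: on the next forward pass the model
# sees the token "Paris", not the fact that it was only ~47% likely, so it
# has no signal about how confident it was in its own choice.
```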
A pattern is dark only if it's intentional. I'd say hallucinations are like the CAP theorem: just the way it is. Sycophancy is somewhat trained, but it isn't a dark pattern either, since it isn't entirely intended.