Comment by forgotTheLast 3 days ago
Zero temp just uses argmax, which is what sampling from the softmax approaches anyway in the limit as T goes to zero. So it could very well be deterministic.
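For anyone who wants to see the limit numerically, here is a rough sketch in plain Python. The logits are made-up values and the function names `softmax_with_temperature` and `greedy_pick` are just illustrative, not taken from any particular library or inference stack.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """T = 0 case: skip sampling and return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [2.0, 1.0, 0.5]  # hypothetical values, for illustration only

# As T shrinks, the probability mass piles up on the largest logit.
for t in (1.0, 0.5, 0.1, 0.01):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])

print("argmax index:", greedy_pick(logits))
```

T = 0 itself can't go through the softmax formula (division by zero), which is why it gets special-cased as argmax. In this sketch ties go to the first index, which is an arbitrary choice.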