Comment by onlyrealcuzzo 18 hours ago
> My statement is true no matter how many choices are there, or how skewed the probabilities are. Your count of 99 incorrect labels is perfectly fine but it lives in sample space.
No, it's not.
If you have a 99% chance of picking the wrong outcome, you don't have a 50% chance of picking the right outcome.
The 1% chance of being right doesn't suddenly become 50% just because you reduce the problem space to a boolean outcome.
If I put 100 marbles into a jar, and 99 of them are black and one is red, and your single-step instruction is "Draw the red marble from the jar," you don't have a 50% chance of picking the right marble if you're drawing randomly (i.e., the AI has no intelligence whatsoever).
You’re still mixing up two different things.
- Sample space: how many distinct labels sit on the die / in the jar (100).
- Event space: did the guess match the ground-truth label? ("correct" vs. "incorrect")
Knowing there are 99 wrong labels tells us how many distinct ways we can be wrong, NOT how likely we are to be wrong. Probability lives in the weights you place on each label, not in the label count itself. The moment you say "uniformly at random" you’ve chosen a particular weighting (each label gets 1⁄100). But nothing in the original claim required that assumption.
Imagine a classifier that, on any query, behaves like this:
- emits the single correct status 50% of the time;
- sprays its remaining 50% probability mass uniformly over the 99 wrong statuses (≈ 0.505% each).
There are still 99 ways to miss, but they jointly receive 0.50 of the probability mass, while the “hit” receives 0.50. When you grade the output, the experiment collapses to:
| Outcome | Probability |
|---------|-------------|
| correct | 0.50        |
| wrong   | 0.50        |
Mathematically, and for every metric that only cares about right vs. wrong (accuracy, recall, etc.), this is a coin flip.
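A minimal simulation makes the collapse concrete. The 100-label setup and the 50/50 split are taken from the example above; the label encoding is a hypothetical choice:

```python
import random

# Hypothetical 100-label classifier from the example above:
# 50% probability mass on the correct label, the remaining 50%
# spread uniformly over the 99 wrong ones.
N_LABELS = 100
CORRECT = 0  # arbitrary pick for the ground-truth label

def classify():
    if random.random() < 0.50:
        return CORRECT
    # a miss: choose uniformly among the 99 wrong labels
    return random.choice([l for l in range(N_LABELS) if l != CORRECT])

trials = 100_000
hits = sum(classify() == CORRECT for _ in range(trials))
print(f"accuracy ≈ {hits / trials:.3f}")  # ≈ 0.500, a coin flip
```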
Your jar contains 99 black marbles and 1 red marble, and you assume each marble is equally likely to be drawn. Under that specific weight assignment, P(red) = 0.01, so yes, accuracy is 1%. But that's a special case (uniform weights), not a law of nature. Give the red marble extra weight (make it larger, magnetic, whatever) until P(red) = 0.50, and suddenly the exact same jar of 100 physical objects yields a 50% success chance.
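Here's a sketch of that contrast. The marble counts come from the example; giving the red marble a weight of 99 is just one assumption that makes P(red) = 0.50:

```python
import random

marbles = ["red"] + ["black"] * 99
trials = 100_000

# Uniform weights: every marble equally likely -> P(red) = 1/100.
uniform = random.choices(marbles, k=trials)
print(f"uniform:  P(red) ≈ {uniform.count('red') / trials:.3f}")  # ≈ 0.010

# Weighted draw: red carries as much weight as all 99 black marbles
# combined (99 vs. 99), so P(red) = 99/198 = 0.50.
weights = [99] + [1] * 99
weighted = random.choices(marbles, weights=weights, k=trials)
print(f"weighted: P(red) ≈ {weighted.count('red') / trials:.3f}")  # ≈ 0.500
```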
Once the system emits one label, the grader only records "match" or "mismatch". Every multiclass classification benchmark in machine learning does exactly that. So:
- 99 wrong labels -> many ways to fail
- 50% probability mass on "right" -> coin-flip odds of success
Nothing about the count of wrong options can force the probability of success down to 1%. Only your choice of weights can do that.
"Fifty-fifty" refers to how much probability you allocate to the correct label, not to how many other labels exist. If the correct label soaks up 0.50 of the total probability mass, whether the rest is spread across 1, 9, or 99 alternatives, the task is indistinguishable from a coin flip in terms of success odds.
EDIT: If you still don't understand, just let me know and I will show you the math proof that confirms what I said.