Comment by TeMPOraL 2 days ago

Everything is statistical. Explicitly defined systems are understandable and understood, but they can also be brittle[0]; they do make it easier to put probabilities on failure scenarios, but those probabilities are never 0. ML systems are more like real people: unpredictable and prone to failures, where fixing any one failure often creates a problem elsewhere - but with enough fixing, you can push the probability of failure down to a number low enough that you no longer care.

Compare: probabilistic methods of primality checking (which is where I first understood this idea). Theoretically, they can give you the wrong result sometimes; in practice, they're constructed in such a way that you can push the probability of error to arbitrarily low levels.
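To make that concrete, here's a minimal sketch of a Miller-Rabin test, one standard way such probabilistic primality checks are built (the function name and default round count are just illustrative). The rounds parameter is the knob: a composite number slips past a single random round with probability at most 1/4, so the chance of a wrong "probably prime" answer is at most 4^-rounds.

    import random

    def is_probably_prime(n, rounds=40):
        # Miller-Rabin: each round picks a random witness 'a' that either
        # proves n composite or lets it pass; a composite survives one
        # round with probability <= 1/4, so error <= 4**-rounds overall.
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        # Write n - 1 as d * 2**s with d odd.
        d, s = n - 1, 0
        while d % 2 == 0:
            d, s = d // 2, s + 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False   # definitely composite
        return True            # probably prime; error <= 4**-rounds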

See also: random UUIDs and hashing algorithms - all are prone to collisions, but they have knobs you can turn to push the probability of a collision to somewhere around "not before the heat death of the universe".
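Same idea in numbers: the birthday bound gives the collision probability for n random values drawn from a b-bit space, and the bit width is the knob. A rough sketch (function name and sample figures are mine, purely for illustration):

    import math

    def collision_probability(n_items, bits):
        # Birthday bound: chance of at least one collision among n_items
        # values drawn uniformly from a 2**bits space,
        # p ~= 1 - exp(-n*(n-1) / 2**(bits+1)).
        return -math.expm1(-n_items * (n_items - 1) / (2 * 2**bits))

    # Version-4 UUIDs carry 122 random bits; SHA-256 outputs 256 bits.
    print(collision_probability(10**12, 122))   # ~9.4e-14 for a trillion UUIDs
    print(collision_probability(10**12, 256))   # ~4.3e-54 for a trillion hashes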

This is the kind of approach we'll need with ML methods: accepting they can be randomly wrong, but developing them in ways that allow us to control the probability of error.

--

[0] - In theory, you can make your operating envelope large enough to cover pretty much anything that could reasonably go wrong; in practice, having a clear-cut operating envelope also creates pressure to shrink it to save money (it can be a lot of money), which involves eroding what "reasonably" means.