Comment by Piskvorrr
Except when they "extract" something that wasn't in the source. And now what, assuming you can even detect the tainted data at all?
How do you fix that, when the process is literally "we throw an illegible blob at it and data comes out"? This is not even GIGO, this is "anything in, synthetic garbage out".
> Except when they "extract" something that wasn't in the source. And now what, assuming you can even detect the tainted data at all?
You gotta watch for that for sure, but no, that's not an issue we worry about anymore, at least not for how we're using it here. The text being extracted from is not a "BLOB". It's plain text at that point, and of a certain, expected kind, which makes it easier. In general, the more isolated and specific the use case, the better the chances of the whole thing working end to end. Open-ended chat is just a disaster. Operating on a narrow set of expectations is much more successful.
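For what "watching for that" could look like with plain-text input, here is a minimal sketch, not the setup described above. It assumes the extractor returns a flat dict of string fields and that every value is supposed to be copied verbatim from the source. The names `normalize` and `ungrounded_fields` are made up for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences are not reported as mismatches."""
    return re.sub(r"\s+", " ", text).strip().lower()


def ungrounded_fields(source: str, extracted: dict[str, str]) -> list[str]:
    """Return the names of fields whose value does not appear in the source.

    Assumption: values are meant to be verbatim spans of the source.
    Empty values are treated as grounded, since there is nothing to check.
    """
    haystack = normalize(source)
    return [
        name
        for name, value in extracted.items()
        if normalize(value) not in haystack
    ]
```

This only catches values that were made up outright. It won't catch a real span assigned to the wrong field, and it will flag anything the extractor legitimately reformats (dates, currency), so it fits best in the narrow, verbatim-extraction kind of use case.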