Comment by Someone
> I was taken aback when I saw what was basically zero recall loss in the real world task of finding related topics
By collapsing each value to a single bit, you're lumping together vectors that were distinct before, so I wouldn't expect a recall loss; merging things makes more of them match, which hurts precision rather than recall.
Also: even if your vectors are only 100-dimensional, there are already 2^100 possible bit vectors, which is over 10^30.
If your dataset isn't gigantic and its documents are even moderately dispersed in that space, it's unlikely that many of them will end up with the same bit vector.
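To make that counting argument concrete, here's a small sketch (with an assumed corpus of 10,000 random 100-dimensional embeddings) that sign-quantizes each vector to one bit per dimension and counts how many documents collide onto the same bit vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical corpus: 10,000 random 100-dimensional float embeddings.
embeddings = rng.standard_normal((10_000, 100))

# Binary quantization: keep only the sign of each component (1 bit per dim).
bits = (embeddings > 0).astype(np.uint8)

# With 2^100 possible codes and only 10^4 documents, the birthday-problem
# collision probability is on the order of 10^-23, so we expect none.
unique_rows = np.unique(bits, axis=0)
collisions = len(bits) - len(unique_rows)
print(collisions)  # → 0
```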
And if the dispersion isn't good, it could be worthwhile to run the vectors through another model trained to disperse them.
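A trained dispersal model is one option; a simpler untrained baseline with a similar effect is SimHash-style random-hyperplane projection, which spreads the codes out while approximately preserving angular similarity. A sketch, using a deliberately pathological corpus (all-positive components, an assumption for illustration) where naive sign-quantization collapses everything to one code:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical badly dispersed embeddings: every component is positive,
# so naive sign-quantization maps all documents to the all-ones code.
embeddings = np.abs(rng.standard_normal((1_000, 100)))

naive_bits = (embeddings > 0).astype(np.uint8)
print(len(np.unique(naive_bits, axis=0)))  # → 1 (total collapse)

# Project onto 100 random hyperplanes before taking signs: each bit now
# records which side of a random hyperplane the vector falls on.
planes = rng.standard_normal((100, 100))
bits = (embeddings @ planes > 0).astype(np.uint8)
print(len(np.unique(bits, axis=0)))  # → 1000 (all codes distinct again)
```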