Comment by simonw
You can take 4096 bytes of information (1024 x 32 bit floats) and reduce that to 128 bytes (1024 bits, a 32x reduction in size!) and still get results that are about 95% as good.
I find that cool and surprising.
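For anyone who wants to poke at the idea, here is a rough sketch of sign-bit (binary) quantization in plain Python. It assumes the simplest scheme, one bit per dimension set when the value is positive, which may not be exactly what any particular embedding model or library does. The `binarize` and `hamming` helpers and the random test vectors are made up for illustration, and this only shows the size arithmetic and that nearby vectors stay nearby, not the retrieval-quality figure.

```python
import random
import struct

def binarize(vec):
    """Pack the sign of each float into one bit (1 if the value is > 0)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits.to_bytes((len(vec) + 7) // 8, "big")

def hamming(a, b):
    """Count differing bits between two equal-length byte strings."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")

random.seed(0)
dims = 1024  # assumed embedding width, matching the numbers in this thread

# Hypothetical data: a vector, a slightly perturbed copy, and an unrelated one.
vec = [random.gauss(0, 1) for _ in range(dims)]
near = [x + random.gauss(0, 0.1) for x in vec]
far = [random.gauss(0, 1) for _ in range(dims)]

float_bytes = len(struct.pack(f"{dims}f", *vec))  # size as 32-bit floats
packed = binarize(vec)

print("float32 bytes:", float_bytes)
print("binary bytes:", len(packed))
print("size ratio:", float_bytes // len(packed))
print("hamming to near:", hamming(packed, binarize(near)))
print("hamming to far:", hamming(packed, binarize(far)))
```

Comparing the packed vectors is just an XOR and a bit count, which is a big part of why this is cheap to search as well as cheap to store.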
1024 bits for a hash is pretty roomy. The embedding "just" has to be well-distributed across enough of the dimensions.
Yeah, that's what I was thinking: Did we think 32 bits across each of the 1024 dimensions would be necessary? Maybe 32768 bits is adding unnecessary precision to what is ~1024 bits of information in the first place.
I'm with you, it's very satisfying to see a simple technique work well. It's impressive.