Comment by anArbitraryOne
Comment by anArbitraryOne 4 days ago
Just want to say how great I am for calling this out a few months ago https://news.ycombinator.com/context?id=41470605
Your post was much better than my stupid comment, and I like the points you articulated. Cheers.
You called it! But it is a pattern as old as the hills in the software industry: "Just add an index." "Put it in the cloud." "Do sprints." One size fits all!
Good question. Unfortunately, I'm just a keyboard-warrior asshole who bad-mouths things without offering solutions.
It's nice to hear that! And judging from this thread, it's not only the two of us; otherwise, the title wouldn't have resonated with the Hacker News community.
This blog post stemmed from my frustration that people use cosine distance without a second thought. In virtually all tutorials on vector databases, cosine distance is treated as if it were some obvious ground truth.
When questioned about cosine similarity, even seasoned data scientists will start talking about "the curse of dimensionality" or some geometric interpretation, but forget that (more often than not) they are working with a hack.