Comment by olooney
I found a neat way to do high-quality "semantic soft joins" using embedding vectors[1] and the Hungarian algorithm[2] and I'm turning it into an open source Python package:
https://github.com/olooney/jellyjoin
It hits a sweet spot by being easier to use than record linkage[3][4] while still giving really good matches, so I think there's something there that might gain traction.
[1]: https://platform.openai.com/docs/guides/embeddings
[2]: https://en.wikipedia.org/wiki/Hungarian_algorithm
I love this as someone who used to work on max-weight matchings and now works on LLMs :)