Comment by authorfly 5 days ago

MiniLM isn't optimized to be as small as possible though, and it's somewhat dated: it was trained on a tiny number of similarity pairs compared to what's available today.

As of the last time I did it in 2022, MiniLM can be distilled down to ~40 MB with only a limited loss in accuracy, as can paraphrase-MiniLM-L3-v1 (down to ~21 MB), by cutting the dimensions by half or more and training a custom projection matrix (optionally on domain-specific or more recent training pairs). I imagine today you could get it down to 32 MB (i.e. project to ~156 dims) without accuracy loss.
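
For the projection part of that recipe, here's a minimal sketch using sentence-transformers' Dense module to learn a projection to ~156 dims on similarity pairs. The model name, output dimension, and the toy training pairs are placeholders, not the exact setup described above:

  import torch.nn as nn
  from torch.utils.data import DataLoader
  from sentence_transformers import SentenceTransformer, models, losses, InputExample

  # Base encoder + mean pooling (model name is illustrative; start from whichever MiniLM variant you use)
  transformer = models.Transformer("sentence-transformers/paraphrase-MiniLM-L3-v2")
  pooling = models.Pooling(transformer.get_word_embedding_dimension())

  # Learned projection matrix: 384 dims -> ~156 dims
  dense = models.Dense(
      in_features=pooling.get_sentence_embedding_dimension(),
      out_features=156,
      activation_function=nn.Tanh(),
  )

  model = SentenceTransformer(modules=[transformer, pooling, dense])

  # Fine-tune the projection on similarity pairs (toy examples; swap in
  # domain-specific or more recent pairs as suggested above)
  train_examples = [
      InputExample(texts=["A man is eating food.", "A man is eating a meal."], label=0.9),
      InputExample(texts=["A man is eating food.", "The sky is blue."], label=0.1),
  ]
  train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)
  loss = losses.CosineSimilarityLoss(model)

  model.fit(train_objectives=[(train_loader, loss)], epochs=1, warmup_steps=10)
  model.save("minilm-156d")

Note this only shrinks the embedding dimension; getting the full model down to the sizes mentioned also involves distilling/quantizing the transformer itself.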

byefruit 5 days ago

What are some recent sources for high-quality similarity pairs?