Comment by magicalhippo 15 hours ago
> If an embedded application does not make heavy use of multiplication, you can omit multiplication from the silicon for cost savings.
The problem was that the initial extension that included multiplication also included division[1]. A lot of small microcontrollers have multiplication hardware but not division hardware.
Thus it would make sense to have a multiplication-only extension.
IIRC the argument was that the CPU should just trap the division instructions and emulate them. But in the embedded world you want to know your performance envelope, so it's better to know explicitly whether hardware division is available.
[1]: https://docs.openhwgroup.org/projects/cva6-user-manual/01_cv...
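To make the cost of emulation concrete, here is a minimal sketch of the kind of shift-subtract (restoring) division a trap handler or runtime library might fall back to. The function name `soft_udiv32` is hypothetical and this is not taken from any particular toolchain; the point is only that the loop runs once per bit, so the latency looks nothing like a single hardware instruction.

```python
def soft_udiv32(n, d):
    """Unsigned 32-bit restoring division; returns (quotient, remainder).

    Hypothetical illustration of software emulation, not a real runtime routine.
    """
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    r = 0
    # One iteration per dividend bit, most significant bit first.
    for i in range(31, -1, -1):
        r = (r << 1) | ((n >> i) & 1)  # bring down the next dividend bit
        if r >= d:                     # divisor fits: subtract and set quotient bit
            r -= d
            q |= 1 << i
    return q, r
```

With 32 iterations per call, plus trap entry and exit if it is reached through an illegal-instruction handler, this is the sort of cost you would want to see in the ISA string rather than discover at runtime.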
Software division is often faster than hardware division, so your performance remark seems to be a moot point:
https://libdivide.com/
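For context, the speedup in question applies when the same divisor is reused many times: you precompute a "magic" multiplier once, and each later division becomes a multiply plus a shift. Below is a rough sketch of that idea for unsigned 32-bit values, assuming a divisor greater than 1. The names `compute_magic` and `fast_div` are made up for illustration and this is not libdivide's API or its exact algorithm.

```python
def compute_magic(d):
    """Precompute ceil(2**64 / d) for a divisor with 1 < d < 2**32."""
    assert 1 < d < 2**32
    return (2**64 - 1) // d + 1

def fast_div(n, magic):
    """Return n // d for 0 <= n < 2**32 using one multiply and one shift."""
    # On real hardware this is the high half of a 64x64-bit multiply.
    return (magic * n) >> 64

magic7 = compute_magic(7)     # paid once per divisor
print(fast_div(100, magic7))  # 14
```

Note that the precomputation itself uses a real division, and the fast path relies on a wide multiplier, so it fits the multiplication-without-division hardware case but does not help when the divisor changes on every call.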