Comment by anematode 7 hours ago

That makes a ton of sense and aligns with my observations. Thanks for the resource :)

If SSVE is slow, I was hoping SME instructions could be used in a vector-like fashion (e.g. adding two matrices at high throughput, or computing a Hadamard/element-wise product), but it seems most matrix accelerator ISAs don't support that.
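
For concreteness, the two element-wise operations mentioned above look like this as a plain scalar C sketch. It uses no SME or SSVE intrinsics, and the function names `mat_add` and `mat_hadamard` are made up for illustration; it only shows the computation one would hope to run at matrix-unit throughput.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of any SME API.
   Both treat a matrix as a flat array of n floats (row-major or
   column-major does not matter for element-wise operations). */

/* Element-wise sum: out[i] = a[i] + b[i]. */
void mat_add(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hadamard (element-wise) product: out[i] = a[i] * b[i]. */
void mat_hadamard(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * b[i];
    }
}
```

Each output element depends only on the matching input elements, which is why these are usually treated as vector operations rather than matrix operations.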
