Comment by bee_rider
I wonder… I know Eigen has some tricks it can do when the size of a matrix is known at compile time. The obvious example is the 4x4 matrix inverse, which gets special treatment. I assume they can also be smart about loop unrolling and that sort of thing.
Anything similar in here?
If not—actually, optimizing compilers are pretty okay nowadays anyway. I wonder if you’ve tried just seeing what Rust will do automatically with different optimization levels?
Glowstick just provides the shape types and associated traits as a layer you can put on top of another tensor implementation. Since it's just verifying shapes and forwarding the operations to the underlying tensor (e.g. from candle/burn), I don't think there's any great way to get performance benefits from these integrations. It's mainly about the developer experience: getting shape errors at compile time instead of at runtime.
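To make the "verify shapes, then forward" idea concrete, here's a rough sketch. It is not glowstick's actual API: it uses plain const generics as a stand-in for the shape types, and `DynTensor` and `Typed` are hypothetical names standing in for a backend tensor (like one from candle/burn) and the typed layer on top of it.

```rust
/// Hypothetical stand-in for a dynamically shaped backend tensor.
/// This is NOT candle's or burn's real type, just enough to show the idea.
#[derive(Debug, Clone, PartialEq)]
pub struct DynTensor {
    pub rows: usize,
    pub cols: usize,
    pub data: Vec<f32>, // row-major
}

impl DynTensor {
    /// Matmul with a runtime shape check, the way a dynamic backend does it.
    pub fn matmul(&self, rhs: &DynTensor) -> Result<DynTensor, String> {
        if self.cols != rhs.rows {
            return Err(format!(
                "shape mismatch: {}x{} * {}x{}",
                self.rows, self.cols, rhs.rows, rhs.cols
            ));
        }
        let mut data = vec![0.0; self.rows * rhs.cols];
        for i in 0..self.rows {
            for k in 0..self.cols {
                let a = self.data[i * self.cols + k];
                for j in 0..rhs.cols {
                    data[i * rhs.cols + j] += a * rhs.data[k * rhs.cols + j];
                }
            }
        }
        Ok(DynTensor { rows: self.rows, cols: rhs.cols, data })
    }
}

/// Hypothetical shape-typed wrapper: R and C exist only in the type.
pub struct Typed<const R: usize, const C: usize> {
    inner: DynTensor,
}

impl<const R: usize, const C: usize> Typed<R, C> {
    /// The one place a runtime check is still needed: wrapping raw data.
    pub fn new(data: Vec<f32>) -> Result<Self, String> {
        if data.len() != R * C {
            return Err(format!("expected {} elements, got {}", R * C, data.len()));
        }
        Ok(Self { inner: DynTensor { rows: R, cols: C, data } })
    }

    /// Only compiles when the inner dimensions agree (C on both sides).
    /// The actual work is forwarded to the backend unchanged.
    pub fn matmul<const K: usize>(&self, rhs: &Typed<C, K>) -> Typed<R, K> {
        let inner = self
            .inner
            .matmul(&rhs.inner)
            .expect("inner dimensions are guaranteed equal by the types");
        Typed { inner }
    }

    /// Hand the untyped tensor back to the backend.
    pub fn into_inner(self) -> DynTensor {
        self.inner
    }
}
```

With this, `Typed::<2, 3>` times `Typed::<3, 2>` type-checks and yields a `Typed<2, 2>`, while `Typed::<2, 3>` times `Typed::<2, 3>` is rejected by the compiler. The backend still does exactly the same work (including its own runtime check), which is why a layer like this doesn't buy any speed by itself.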
That being said, it seems reasonable that you could make some optimizations like this if you had deeper integration of these types with a framework or similar. It's not something I've explored personally, but it sounds interesting.
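As a rough illustration of what "deeper integration" might look like, here is a sketch where the shape reaches all the way down to the loops instead of stopping at a wrapper. `matmul_fixed` is a hypothetical function, not something from glowstick, candle, or burn. The loop bounds are compile-time constants, so an optimizing compiler has the information it would need to unroll or vectorize; whether it actually does depends on the optimization level and target, and I haven't measured it.

```rust
/// Hypothetical fixed-size matmul: all three dimensions are const generics,
/// so every loop bound is known at compile time and nothing is heap-allocated.
pub fn matmul_fixed<const M: usize, const K: usize, const N: usize>(
    a: &[[f32; K]; M],
    b: &[[f32; N]; K],
) -> [[f32; N]; M] {
    let mut out = [[0.0f32; N]; M];
    for i in 0..M {
        for k in 0..K {
            for j in 0..N {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}
```

Mismatched inner dimensions are still a compile error here, since `K` has to match in both argument types. Comparing the generated code at different `opt-level` settings would show what the compiler does with the constant bounds.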