Comment by andrepd a day ago
Funny to see a comment on HN raising this exact point, when just ~2 hours ago I was writing inline asm that used `lea` precisely to preserve the carry flag before a jump table! :)
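For anyone wondering why `lea` is the tool for this: it computes an address-style sum without writing the flags register, whereas `add` overwrites the carry flag. Below is a minimal sketch of that difference, assuming x86-64 and GNU-style inline asm (GCC or Clang). The function names are made up for illustration and this is not the code the commenter was writing; on other targets both functions just return -1.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the carry flag survived the lea,
   0 if it did not, -1 when not built for x86-64 with GNU-style asm. */
static int carry_survives_lea(void) {
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t index = 5;
    unsigned char carry;
    __asm__ volatile(
        "stc\n\t"                 /* set the carry flag */
        "lea 3(%1,%1,2), %1\n\t"  /* index = index + index*2 + 3, flags untouched */
        "setc %0"                 /* copy the carry flag into a byte register */
        : "=q"(carry), "+r"(index)
        :
        : "cc");
    return (index == 18) ? carry : 0;
#else
    return -1;
#endif
}

/* Same arithmetic step done with add, which rewrites the carry flag.
   5 + 3 does not overflow, so the carry set by stc is cleared. */
static int carry_survives_add(void) {
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t index = 5;
    unsigned char carry;
    __asm__ volatile(
        "stc\n\t"                 /* set the carry flag */
        "add $3, %1\n\t"          /* index += 3, carry flag is overwritten */
        "setc %0"                 /* copy the carry flag into a byte register */
        : "=q"(carry), "+r"(index)
        :
        : "cc");
    return (index == 8) ? carry : 1;
#else
    return -1;
#endif
}
```

On an x86-64 build the first function should report the carry as preserved and the second as lost, which is the property you need when a flag has to stay live across the index computation for a jump table.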
I'm not them, but whenever I've used it, it's been for arch-specific features like adding a debug breakpoint, synchronization, using system registers, etc.
Never for performance. If I wanted to hand-optimise code I'd be more likely to use SIMD intrinsics, play with C until the compiler does the right thing, or write the entire function in a separate asm file, for better highlighting and easier handling of state at the ABI boundary rather than mid-function, like the carry flag mentioned above.
Generally inline assembly is much easier these days, as (a) the compiler can see into it and make optimizations and (b) you don’t have to worry about calling conventions.
> the compiler can see into it and make optimizations
Those writing assembler often think (or know) they can do better than the compiler, so that isn’t necessarily a good thing.
(Similarly, veltas's comment above about “play with C until the compiler does the right thing” is brittle. You don’t even need to change compiler flags to make it suddenly stop doing the right thing. On the other hand, when compiling for a different version of the CPU architecture, the compiler can fix things, too.)
Might be an interpreter or an emulator. That’s where you often want to preserve registers or flags and have jump tables.
This is one of the remaining cases where current compilers optimize rather poorly: when you have a tight loop around a huge switch statement, with each case performing a very small operation on common data.
In that case, a human writing assembler can often beat a compiler by a huge margin.
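To make the shape concrete, here is a minimal sketch of that pattern in portable C: a loop around a switch where every case does a tiny amount of work on shared state. The opcode set and the name `run_switch` are invented for illustration, and a real interpreter would have far more cases.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical four-instruction bytecode, for illustration only. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Classic switch-in-a-loop dispatch: every instruction funnels through
   the same switch, and all cases share the accumulator. */
static int64_t run_switch(const uint8_t *code) {
    int64_t acc = 0;
    for (size_t pc = 0;; pc++) {
        switch (code[pc]) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return INT64_MIN; /* unknown opcode: give up */
        }
    }
}
```

Running it on the program `{ OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT }` computes (0 + 1 + 1) * 2 - 1, i.e. 3.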
I'm curious whether that's still generally the case now that things like the `musttail` attribute help the compiler emit good assembly for well-structured interpreter loops:
https://blog.reverberate.org/2025/02/10/tail-call-updates.ht...
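For context, a rough sketch of what the tail-call style looks like, assuming a compiler that supports `__attribute__((musttail))` on return statements (Clang does; the `MUSTTAIL` macro below falls back to an ordinary call elsewhere, which still works but may grow the stack). The opcodes, handler names, and `run_tail` are hypothetical, not taken from the linked post, and the sketch assumes the bytecode contains only valid opcodes.

```c
#include <stdint.h>

/* Use the attribute when the compiler advertises it. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* fallback: plain call, not a guaranteed tail call */
#endif

/* Hypothetical four-instruction bytecode, for illustration only. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Every handler has the same signature, so each can tail-call the next.
   pc points at the next opcode to fetch; acc is the shared accumulator. */
typedef int64_t handler_fn(const uint8_t *pc, int64_t acc);
static handler_fn op_inc, op_dec, op_double, op_halt;

/* Table indexed by opcode; no bounds check in this sketch. */
static handler_fn *const table[] = { op_inc, op_dec, op_double, op_halt };

/* Fetch the next opcode and jump straight to its handler. */
#define NEXT(pc, acc) MUSTTAIL return table[(pc)[0]]((pc) + 1, (acc))

static int64_t op_inc(const uint8_t *pc, int64_t acc)    { NEXT(pc, acc + 1); }
static int64_t op_dec(const uint8_t *pc, int64_t acc)    { NEXT(pc, acc - 1); }
static int64_t op_double(const uint8_t *pc, int64_t acc) { NEXT(pc, acc * 2); }
static int64_t op_halt(const uint8_t *pc, int64_t acc)   { (void)pc; return acc; }

/* Entry point: dispatch the first instruction with an empty accumulator. */
static int64_t run_tail(const uint8_t *code) {
    return table[code[0]](code + 1, 0);
}
```

The idea is that each handler is a small function whose state lives in arguments, so the compiler can keep it in registers and end every handler with an indirect jump instead of returning to a central loop.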
I'm curious, what are you working on that requires writing inline assembly?