Comment by adrian_b
Modern Intel/AMD CPUs have distinct ALUs (arithmetic-logic units, where additions and other integer operations are done; usually 4 to 8 in recent CPUs) and AGUs (address generation units, where the complex addressing modes used in load/store/LEA are computed; usually 3 to 5 in recent CPUs).
Depending on the model, modern CPUs can execute up to 6 to 10 instructions per clock cycle, of which up to 3 to 5 may be load and store instructions.
So they have a set of execution units sized to allow the concurrent execution of a typical mix of instructions. Because a large fraction of instructions generate load or store micro-operations, there are dedicated units for address computation, so that it does not interfere with other concurrent operations.
https://news.ycombinator.com/item?id=23514072 and https://news.ycombinator.com/item?id=12354494 seem to contradict this and claim that modern Intel processors don't use a separate AGU for LEA...
I'm not too versed here, but given that ADD seems to have more execution ports to pick from (e.g. on Skylake), I'm not sure that's an argument in favor of LEA. I'd guess, though, that LEA not touching flags and consuming fewer uops (comparing a single simple LEA to 2 ADDs) might be better for out-of-order execution (no flag dependencies, friendlier to the reorder buffer).
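To make the comparison concrete, here is a minimal C sketch of the kind of computation being discussed. The function names are made up for illustration, and the instructions named in the comments are only what an x86-64 compiler may plausibly emit; the actual output depends on the compiler and optimization flags, and an optimizer may well turn both functions into the same code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: base + index*4 as a single expression.
   On x86-64 this shape fits one addressing mode, so a compiler may
   emit a single LEA such as: lea rax, [rdi + rsi*4]
   LEA writes its result without modifying the flags register. */
uint64_t addr_one_step(uint64_t base, uint64_t index)
{
    return base + index * 4;
}

/* The same value built from separate integer operations.
   Written out by hand in assembly this would be a shift plus an add,
   each of which writes the flags. */
uint64_t addr_two_steps(uint64_t base, uint64_t index)
{
    uint64_t scaled = index << 2;   /* index * 4 */
    return base + scaled;
}

/* Sanity check: both forms agree, including on unsigned wraparound. */
void check_equivalence(void)
{
    const uint64_t samples[] = { 0, 1, 3, 1000, UINT64_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            assert(addr_one_step(samples[i], samples[j]) ==
                   addr_two_steps(samples[i], samples[j]));
        }
    }
}
```

Compiling this with optimizations and looking at the generated assembly (for example with `-O2 -S`) is the easiest way to see which form a given compiler actually picks for a given target.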