Comment by pizlonator
Comment by pizlonator a day ago
> There is a rematerialize pass, there is no real reason to couple it with register allocation
The reason to couple it to regalloc is that you only want to remat if it saves you a spill.
> Remat can produce a performance boost even when everything has a register.
Can you give an example?
> Rematerializing 'safe' computation from across a barrier or thread sync/wait works wonders.
While this is literally "rematerialization", it's such a different case of remat from what I'm talking about that it should be a different phase. It's optimizing for a different goal.
Also feels very GPU-specific. So I'd imagine this being a pass you only add to the pipeline if you know you're targeting a GPU.
> Also loads and stores and function calls, but that's a bit finicky to tune. We usually tell people to update their programs when this is needed.
This also feels like it's gotta be GPU-specific.
No chance that doing this on a CPU would be a speed-up unless it saved you reg pressure.
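To make the "only remat if it saves you a spill" point concrete, here's a minimal sketch of the decision as it might look inside a register allocator. Everything in it is hypothetical: the `Value` fields, the cost constants, and `choose_action` are made up for illustration and don't come from any real compiler.

```python
from dataclasses import dataclass

# Assumed, illustrative costs in arbitrary units; a real backend would
# derive these from its target's cost model.
SPILL_STORE_COST = 2   # one store to the stack slot at the definition
RELOAD_COST = 3        # one load from the stack slot per use


@dataclass
class Value:
    name: str
    remat_cost: int            # cost to recompute the value at one use
    is_rematerializable: bool  # e.g. a constant or simple address computation
    num_uses: int


def choose_action(value: Value, got_register: bool) -> str:
    """Decide what to do with a value after register assignment.

    Remat is only considered when the value failed to get a register,
    which is why the decision lives next to the allocator.
    """
    if got_register:
        # No spill to avoid, so recomputing would only add work.
        return "keep"
    if not value.is_rematerializable:
        return "spill"
    spill_cost = SPILL_STORE_COST + RELOAD_COST * value.num_uses
    remat_cost = value.remat_cost * value.num_uses
    return "remat" if remat_cost < spill_cost else "spill"
```

With these assumed costs, a cheap constant with four uses is rematerialized only when it lost its register; the same value stays put when it has one, and an expensive computation gets spilled instead.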
Remat can produce a performance boost even when everything has a register.
Admittedly, this comes up more often in non-CPU backends.