Comment by helltone 3 days ago

Right I think I see it.

This is insanely cool.

But then there are performance tradeoffs in reusing intermediates vs recomputing that I think you can't represent.

Some of these choices may affect numerical stability btw. See e.g. https://herbie.uwplse.org/
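To make the stability point concrete, here is a minimal sketch of the kind of rewrite Herbie is known for: two algebraically equal expressions that behave very differently in floating point. The function names and the input value are made up for illustration and are not taken from Herbie or from the project under discussion.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # high-precision reference arithmetic


def naive(x: float) -> float:
    # Subtracts two nearly equal numbers: catastrophic cancellation for large x.
    return math.sqrt(x + 1.0) - math.sqrt(x)


def rewritten(x: float) -> float:
    # Algebraically identical (multiply by the conjugate), but no cancellation.
    return 1.0 / (math.sqrt(x + 1.0) + math.sqrt(x))


def reference(x: float) -> Decimal:
    # Same formula as naive(), evaluated with 50 significant digits.
    d = Decimal(x)
    return (d + 1).sqrt() - d.sqrt()


def rel_error(approx: float, exact: Decimal) -> float:
    return float(abs(Decimal(approx) - exact) / exact)


x = 1e15  # hypothetical input chosen to trigger cancellation
exact = reference(x)
naive_err = rel_error(naive(x), exact)
rewritten_err = rel_error(rewritten(x), exact)
print(f"naive relative error:     {naive_err:.3e}")
print(f"rewritten relative error: {rewritten_err:.3e}")
```

Both forms cost about the same to evaluate, so a search that only measures runtime has no reason to prefer the stable one.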

There is so much potential in this project.

jafioti 3 days ago

ah i see the confusion. we do common subexpression elimination of the terms in the search space (which lets a single application of a rewrite apply to many repeated patterns), but the search can choose whether to re-use patterns of terms when we extract dags after the search space is built. so various levels of recomputation are searched.
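A rough sketch of that distinction, assuming a simple hash-consed term pool rather than the project's actual search-space data structure: identical terms are stored once, and the op count at extraction time depends on whether shared terms are re-used or recomputed. `TermPool` and both counting functions are hypothetical names.

```python
class TermPool:
    """Hash-consed terms: structurally identical terms get the same id."""

    def __init__(self):
        self._ids = {}   # (op, child ids) -> node id
        self.nodes = []  # node id -> (op, child ids)

    def add(self, op, *children):
        key = (op, children)
        if key not in self._ids:
            self._ids[key] = len(self.nodes)
            self.nodes.append(key)
        return self._ids[key]


def ops_with_reuse(pool, root):
    # DAG extraction: every distinct non-leaf term is computed once.
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for child in pool.nodes[node][1]:
            visit(child)

    visit(root)
    return sum(1 for node in seen if pool.nodes[node][1])


def ops_with_recompute(pool, root):
    # Tree extraction: a shared term is recomputed at every use.
    children = pool.nodes[root][1]
    if not children:
        return 0
    return 1 + sum(ops_with_recompute(pool, child) for child in children)


pool = TermPool()
x, y = pool.add("x"), pool.add("y")
e1 = pool.add("exp", pool.add("add", x, y))
e2 = pool.add("exp", pool.add("add", x, y))  # deduplicated: same id as e1
out = pool.add("mul", e1, e2)

print(e1 == e2)                       # the repeated term is stored once
print(ops_with_reuse(pool, out))      # add, exp, mul
print(ops_with_recompute(pool, out))  # mul + 2 * (exp + add)
```

These are only the two extremes. A real extractor could re-use some shared terms and recompute others, which is where the different levels of recomputation come from.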

right now since we're profiling kernels, and we have a reference output of the unoptimised version, we can directly measure deviation of profiled outputs "for free" since we're already computing them to measure runtime. tbh this isn't what i want long term: i want to bake numerical stability natively into the search space so we only extract dags that would produce stable outputs. hopefully that'll be solved soon.
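The "for free" measurement could look roughly like the sketch below, assuming each candidate kernel is a plain Python callable and the reference output comes from the unoptimised version. `profile_candidate`, `pick_best`, the toy kernels and the tolerance are all hypothetical, not the project's API.

```python
import time


def profile_candidate(kernel, inputs, reference_output):
    """Run a kernel once; return (runtime in seconds, max abs deviation)."""
    start = time.perf_counter()
    output = kernel(inputs)
    elapsed = time.perf_counter() - start
    # The output already exists because of the timing run, so comparing it
    # against the reference costs no extra kernel launch.
    max_dev = max(abs(o - r) for o, r in zip(output, reference_output))
    return elapsed, max_dev


def pick_best(candidates, inputs, reference_output, tolerance):
    """Fastest candidate whose deviation stays within the tolerance."""
    best = None
    for name, kernel in candidates.items():
        elapsed, max_dev = profile_candidate(kernel, inputs, reference_output)
        if max_dev <= tolerance and (best is None or elapsed < best[1]):
            best = (name, elapsed, max_dev)
    return best


def unoptimised(pairs):
    return [a * a - b * b for a, b in pairs]


def factored(pairs):
    return [(a - b) * (a + b) for a, b in pairs]


def broken(pairs):
    # Deliberately wrong rewrite, to show the deviation check rejecting it.
    return [a * a + b * b for a, b in pairs]


inputs = [(1.5, 0.5), (3.0, 2.0)]
reference_output = unoptimised(inputs)
candidates = {"factored": factored, "broken": broken}
best = pick_best(candidates, inputs, reference_output, tolerance=1e-9)
print(best[0])
```

This only catches instability on the inputs that were actually profiled, which is one reason to want stability handled inside the search space instead.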