Comment by measurablefunc 5 hours ago

You're just moving the goalposts & not addressing the question I asked. Why isn't AI optimizing the kernels in its own code the way people have been optimizing them, as in the posted paper?

aspenmartin 5 hours ago

They do?

https://www.deeplearning.ai/the-batch/alphatensor-for-faster...

https://deepmind.google/blog/alphaevolve-a-gemini-powered-co...

https://www.rubrik.com/blog/ai/25/teaching-ai-to-write-gpu-c...


phkahler 5 hours ago

It will, right after it reads the paper.


measurablefunc 5 hours ago

I read the paper. All the prerequisites are already available in existing literature & they basically profiled & optimized around the bottlenecks to avoid pipeline stalls w/ instructions that utilize the available tensor & CUDA cores. Seems like something these super duper AIs that don't get tired should be able to do pretty easily.
