Comment by smy20011 2 days ago

It's not that simple. It could cause a resource curse [1] for developers. Why optimize the algorithm when you have nearly infinite resources? For DeepSeek, their constraints are one of the reasons they achieved a breakthrough. One of their contributions, fp8 training, was a way to train models on GPUs whose fp32 performance is limited due to export controls.

[1]: https://www.investopedia.com/terms/r/resource-curse.asp#:~:t...