Comment by MisterTea
ECC modules use the same chips as non-ECC modules, so it eats into the consumer market too.
Linus Torvalds was recently on Linus Tech Tips to build a new computer and he insisted on ECC RAM. Torvalds was convinced that memory errors are a much greater problem for stability than otherwise posted and he's spent an inordinate amount of time chasing phantom bugs because of it.
>but if a bit-flip causes a failed computation then an entire forwards/backwards step – possibly involving several nodes – might need to be redone.
Which for the most part would be an irrelevant cost of doing business compared to the huge savings from non-ECC, given how inconsequential it is if some ChatGPT computation fails...
Good point! But they are slightly more energy-hungry. At these scales I wonder if Stargate could go with one fewer nuclear reactor simply by switching to non-ECC RAM.