Comment by nickdothutton 4 days ago
I’d just like to point out that if you are in the computing industry long enough, you will get to see a few such incidents under different circumstances, and not only in industries like aerospace. Mostly, things like ECC save your a*. Sometimes your software will be able to recognise a temporary spurious reading and disregard it because you had enough alternative checking logic, or, in the case of real-time and safety-critical systems, maybe your systems can even take a vote between them. I got caught out by (CPU cache line) bit flips in the 90s: months of pain trying to track it down. Some of you will know :-)
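A minimal sketch of the voting idea, assuming three redundant units that are all supposed to report the same value. The name `majority_vote` and the sample readings are made up for illustration and are not taken from any real system.

```python
from collections import Counter

def majority_vote(readings):
    """Return the value reported by a strict majority of units, or None if no value has one."""
    if not readings:
        return None
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) / 2 else None

# Hypothetical readings: the third unit has bit 7 of its value flipped.
good = 1013
readings = [good, good, good ^ (1 << 7)]
print(majority_vote(readings))   # the two agreeing units outvote the corrupted one
print(majority_vote([1, 2, 3]))  # no majority, so the caller has to handle None
```

Returning None when nobody agrees is a deliberate choice here: it forces the caller to decide what to do rather than silently picking one reading.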
We noticed this in our logs once! We serve a huge amount of traffic, and as part of that, we log what is effectively an enum. We once summarized this field and noticed that a couple of “impossible” values were being logged. One of my coworkers realized that the string that actually got logged was exactly one bit off from a valid string, and we concluded that we were probably seeing cosmic rays in action, either in our service or in the logging service.
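The "exactly one bit off" check is easy to reproduce. This is a rough sketch, not the code we actually ran: the enum values in `VALID_VALUES` and the helper names are hypothetical, and it assumes the logged value is available as raw bytes.

```python
# Hypothetical enum values; the real field and its values are not shown here.
VALID_VALUES = [b"ACTIVE", b"PAUSED", b"CLOSED"]

def bit_distance(a, b):
    """Count differing bits between two equal-length byte strings (None if lengths differ)."""
    if len(a) != len(b):
        return None
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def one_bit_neighbors(logged, valid_values=VALID_VALUES):
    """Return the valid values that differ from the logged bytes by exactly one bit."""
    return [v for v in valid_values if bit_distance(logged, v) == 1]

# 'C' is 0x43; flipping bit 2 gives 0x47, which is 'G'.
print(one_bit_neighbors(b"AGTIVE"))  # finds b"ACTIVE"
print(one_bit_neighbors(b"BANANA"))  # nothing within one bit, so not a single flip
```

A value that has exactly one neighbor is a good candidate for a single bit flip. It does not tell you where the flip happened, which is why we could not say whether it was our service or the logging service.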