Comment by jp57 4 days ago

> Sorry for an obligatory: there is no such thing as a root cause.

While I get what you mean, I think most people who've been in that situation know what I'm talking about. The same alarms go off constantly, and you keep doing the expedient thing to silence them without investing any effort in keeping them from going off again the next time the same situation comes up.

Of course there is a chain of causes, and maybe you need to refactor a module, or maybe you need to redesign an interface, or maybe you need to throw the whole thing away and start over -- we did all those things in different situations while I was there -- but there's a point at which looking at deeper causes loses value, because those causes are not in our power to fix and we're left to defend against the failures they produce: a system we rely on is unreliable; machines and networks go down unexpectedly; a lot of people have poor reading comprehension, so even good docs are sometimes useless; we are all sinners whose pride and sloth sometimes lead us to make crappy software and systems; etc.