Comment by threeseed 2 days ago

Splunk has had this for close to two decades.

I’ve worked on some of the world’s largest systems, and in most cases simply searching the logs for words like "error" or "exception" is enough.
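To make that concrete, here is a minimal sketch of that kind of keyword search in Python. The keyword list, the function name `find_suspect_lines`, and the sample log lines are all made up for illustration; adjust them to whatever your logs actually contain.

```python
import re

# Assumed keyword list; plain substring match so "NullPointerException" is caught too.
PATTERN = re.compile(r"error|exception|fatal", re.IGNORECASE)

def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines that mention a keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]

# Invented sample data, not taken from any real system.
sample = [
    "2024-01-01 12:00:00 INFO service started",
    "2024-01-01 12:00:01 ERROR connection refused",
    "2024-01-01 12:00:02 WARN java.lang.NullPointerException at Foo.bar",
]

for number, text in find_suspect_lines(sample):
    print(number, text)
```

A substring match like this will also flag harmless lines such as "0 errors found", so treat the hits as a starting point rather than a verdict.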

For everything else, you need a system like Datadog to show you issues visually, e.g. service connection failures.

stackskipton 2 days ago

Even then, you generally need someone smart enough to figure out what is causing the error.

Node is throwing a getaddrinfo error (ENOTFOUND or EAI_AGAIN). Why? Well, it's likely DNS, but the cause of that DNS failure is pretty varied. I've seen the DNS server being offline, a firewall blocking TCP port 53, a bad hostname in the config, a team taking the service offline, and so forth.
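As a rough illustration of why the error alone doesn't tell you the cause, here is a small Python sketch (Python rather than Node, just to keep it short) that maps resolver error codes to the first things I'd check. The `HINTS` table and the `explain` and `check` helpers are hypothetical, and the mapping is only a heuristic: the same code can have several causes, which is the whole point.

```python
import socket

# Hypothetical hints: first things to check for each resolver error code.
HINTS = {
    socket.EAI_NONAME: (
        "name does not resolve: look for a bad hostname in config "
        "or a service that was taken offline"
    ),
    socket.EAI_AGAIN: (
        "temporary failure: check whether the DNS server is reachable "
        "(server offline, firewall blocking port 53)"
    ),
}

def explain(err):
    """Turn a socket.gaierror into a human-readable hint."""
    return HINTS.get(err.errno, "unclassified resolver error: " + str(err))

def check(host, port=443):
    """Try to resolve host; return 'resolved' or a hint about the failure."""
    try:
        socket.getaddrinfo(host, port)
        return "resolved"
    except socket.gaierror as err:
        return explain(err)
```

Calling `check("some-host.example")` returns either `"resolved"` or one of the hints; someone still has to work out which of the listed causes actually applies.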