Comment by saghm 10 months ago

11 replies

I often tell younger engineers that the human brain is the slowest, lowest-memory, and most error-prone runtime for a program. If they're stuck trying to figure out a bug, one of the most effective things they can do is validate their assumptions about what's happening, because there wouldn't be a bug if everything was happening exactly according to expectations.
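
A minimal sketch of "validate your assumptions" in Go. The function names here (`Contains`, `ContainsChecked`) are made up for illustration and are not from the comment above. The first version silently assumes its input is sorted; the second turns that assumption into an explicit check, so a violated expectation shows up as an error instead of a wrong answer.

```go
package main

import (
	"fmt"
	"sort"
)

// Contains reports whether target is in xs using binary search.
// It silently assumes xs is sorted in ascending order.
func Contains(xs []int, target int) bool {
	i := sort.SearchInts(xs, target)
	return i < len(xs) && xs[i] == target
}

// ContainsChecked (hypothetical helper) validates the sortedness assumption
// before searching, so a broken expectation is reported rather than hidden.
func ContainsChecked(xs []int, target int) (bool, error) {
	if !sort.IntsAreSorted(xs) {
		return false, fmt.Errorf("assumption violated: input is not sorted: %v", xs)
	}
	return Contains(xs, target), nil
}

func main() {
	unsorted := []int{9, 1, 7, 3}

	// The unchecked version reports false even though 9 is present.
	fmt.Println(Contains(unsorted, 9))

	// The checked version points at the broken assumption instead.
	_, err := ContainsChecked(unsorted, 9)
	fmt.Println(err)
}
```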

alphazard 10 months ago

> wouldn't be a bug if everything was happening exactly according to expectations

This isn't quite true, especially concerning distributed systems. It's relatively common for a software system to be broken by design. It's not that the developer didn't know how to use the programming language to get the computer to do what they want. It's that what the developer wanted reflects a poor model of the world, a logical inconsistency, or just a behavior which is confusing to users.

  • saghm 10 months ago

    Keep in mind I said that this is advice I give junior engineers specifically; they shouldn't be the ones responsible for designing distributed systems in the first place. For someone at that stage of their career, this advice is meant to help them learn the skills they need to solve the problems they're dealing with. It's not intended to be universal to all circumstances, just a useful thing to keep in mind.

  • monocasa 10 months ago

    That sounds distinctly like an expectation that didn't hold.

    • Stefan-H 10 months ago

      > a poor model of the world, a logical inconsistency, or just a behavior which is confusing to users

      I expect that when I pull from the queue (which was designed non-atomically) I am guaranteed to grab each item exactly once, even after a failure. That expectation is wrong. The developer may have implemented their intent perfectly; they just didn't understand that there are error cases they didn't account for.
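
A rough Go sketch of that queue scenario, under stated assumptions: `Queue`, `Peek`, `Ack`, and `Consume` are hypothetical names for a toy in-memory queue, not any real library. Reading an item and acknowledging it are separate steps, and a simulated crash between the two causes the same item to be handled twice.

```go
package main

import "fmt"

// Queue is a toy in-memory queue whose pull is deliberately non-atomic:
// reading the head (Peek) and removing it (Ack) are two separate steps.
type Queue struct {
	items []string
}

// Peek returns the head item without removing it.
func (q *Queue) Peek() (string, bool) {
	if len(q.items) == 0 {
		return "", false
	}
	return q.items[0], true
}

// Ack removes the head item, marking it as done.
func (q *Queue) Ack() {
	if len(q.items) > 0 {
		q.items = q.items[1:]
	}
}

// Consume handles every item in the queue. The first time it sees crashOn,
// it simulates a failure after handling but before acknowledging.
func Consume(q *Queue, crashOn string, handle func(string)) {
	crashed := false
	for {
		item, ok := q.Peek()
		if !ok {
			return
		}
		handle(item)
		if item == crashOn && !crashed {
			crashed = true // the ack is lost, so the item will be redelivered
			continue
		}
		q.Ack()
	}
}

func main() {
	counts := map[string]int{}
	q := &Queue{items: []string{"a", "b", "c"}}
	Consume(q, "b", func(item string) { counts[item]++ })
	fmt.Println(counts) // "b" was handled twice, the others once
}
```

The code does exactly what it was written to do; the bug lives in the expectation of exactly-once delivery. A handler that tolerates repeats (for example, by remembering which items it has already seen) is one common way to cope.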

wruza 10 months ago

That’s why I learned to log literally everything to stdout, unless the process is time-sensitive and it’s deep in production and it’s past the point where bugs and insights turn up more than about once a month and there’s zero chance of someone asking me what exactly happened with X at Y afternoon-ish last Friday.

The obvious exception is recursive number-fiddling algos, which would spam gigabytes of output due to big N.

This way I can just read assumptions and see branches taken and what’s wrong as if it was written in plain text.

When I see klocs without a single log statement, to me it’s readonly and not worth touching. If you’re stuck with a bug, log everything and you’ll see it right there.

  • jimbokun 10 months ago

    For large systems the cost of maintaining all of those logs in a searchable system can be prohibitive.

    • lanstin 10 months ago

      Just shorten how long you keep the logs until you can afford it. Also, as he mentioned, once a system is getting bugs infrequently, you can lower the log level. My standard is to have a log msg for each branch in the code. In C, I would use macros to also keep a count of all the fmt strings the log package encountered (so I still got a sort of profile of the logic flows encountered, without paying the sprintf expense), but I haven't figured out an efficient way to do that in Go yet (i.e. not using introspection).
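
One possible shape for the Go side of this, offered as a sketch rather than an answer to the efficiency question: `BranchLogger` and its methods are hypothetical names, not a real package. Each call bumps a counter keyed by the format string, and the message is only formatted when verbose output is on, so the per-branch profile survives even when the log level is turned down. No reflection is involved.

```go
package main

import (
	"fmt"
	"sync"
)

// BranchLogger is a hypothetical logger that counts how often each format
// string is seen, whether or not the message is actually printed.
type BranchLogger struct {
	mu      sync.Mutex
	counts  map[string]int
	Verbose bool // when false, messages are counted but never formatted
}

// NewBranchLogger returns a logger with an empty counter table.
func NewBranchLogger(verbose bool) *BranchLogger {
	return &BranchLogger{counts: make(map[string]int), Verbose: verbose}
}

// Debugf records one hit for format, then formats and prints the message
// only if Verbose is set.
func (l *BranchLogger) Debugf(format string, args ...any) {
	l.mu.Lock()
	l.counts[format]++
	l.mu.Unlock()
	if l.Verbose {
		fmt.Printf(format+"\n", args...)
	}
}

// Count returns how many times format has been logged so far.
func (l *BranchLogger) Count(format string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.counts[format]
}

// classify logs one message per branch, in the style described above.
func classify(l *BranchLogger, n int) string {
	if n%2 == 0 {
		l.Debugf("even branch: n=%d", n)
		return "even"
	}
	l.Debugf("odd branch: n=%d", n)
	return "odd"
}

func main() {
	l := NewBranchLogger(false) // quiet: count branches, print nothing
	for n := 1; n <= 5; n++ {
		classify(l, n)
	}
	fmt.Println("even:", l.Count("even branch: n=%d"))
	fmt.Println("odd:", l.Count("odd branch: n=%d"))
}
```

This skips the formatting cost but not all of it: the variadic arguments are still packed into a slice on every call, and the single mutex could become a contention point on a hot path, so it is likely less cheap than the C macro approach.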

nfw2 10 months ago

This is one of the reasons I think that the push towards server-side UI is misguided. It's much easier to walk through the runtime of a program running locally than it is to step through a render that's distributed across a network.

  • sodapopcan 10 months ago

    You’ve clearly never used Elixir/Erlang.

    • nfw2 10 months ago

      You've clearly never built a complex UI.

      Ad hominem arguments don't land so great, do they?

      • sodapopcan 10 months ago

        Oh lordy, I'll bite.

        I was responding to your blanket claim that server-side is misguided in general.

        I have no idea what you consider complex, but most people pushing toward server-side UI are not advocating it as a one-stop solution, just that it simplifies the large majority of situations many of us are in, which is building CRUD apps. You can even get pretty complex, like, say, an email client, though at that point we're in a grey area where you could kind of go either way. If we're talking about something like building Photoshop in-browser, or even a calendar or Gantt chart (which I have worked on), then no, I would not personally advocate server-side and would instead use a good client-side view library.

        The Elixir/Erlang comment was that it makes server-side even easier as you can hop into running production systems and debug them.