Comment by jandrese 2 days ago

17 replies

> Not ignore the compilation warnings – this code most likely threw a warning in the original code that was either ignored or disabled!

What compiler error would you expect here? Maybe not checking the return value from scanf to make sure it matches the number of parameters? Otherwise this seems like a data file error that the compiler would have no clue about.

kristianp a day ago

Trying g++ version 11.4, there's no warning by default if you don't check the return value of sscanf. Even `g++ -Wall -Wextra -Wunused-result` produces no warnings for a small example.
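For reference, a minimal sketch of the return-value semantics being discussed. This is not the game's code or the exact test above; `parse_three` is a hypothetical helper that passes `sscanf`'s result straight back so the caller can compare it with the number of conversions requested.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original code): parse up to three
   floats from a line and return sscanf's result unchanged. */
static int parse_three(const char *line, float *x, float *y, float *z)
{
    /* sscanf returns the number of conversions that succeeded, or EOF
       if the input ran out before the first conversion. Arguments for
       conversions that did not happen are left untouched. */
    return sscanf(line, "%f %f %f", x, y, z);
}

/* Hypothetical convenience check: true only if all three fields matched.
   Testing only for EOF would accept a short line such as "1.0 2.0". */
static int parsed_all_three(const char *line, float *x, float *y, float *z)
{
    return parse_three(line, x, y, z) == 3;
}
```

With this sketch, `"1.0 2.0 3.0"` gives 3, `"1.0 2.0"` gives 2 (the third variable keeps whatever it held before), and an empty string gives `EOF`. A check that only compares against `EOF` lets the two-field line through.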

burch45 a day ago

Undefined behavior to access the uninitialized memory. A sanitizer would have flagged that.

  • jandrese a day ago

    The compiler has no way of knowing that the memory would be undefined, not unless it somehow can verify the data file. The most I think it can do is flag the program for not checking the return value of scanf, but even that is unlikely to be true since the program probably was checking for end of file which is also in the return value. It was failing to check the number of matched parameters. This is the kind of error that is easy to miss given the semantics of scanf.

    • nayuki a day ago

      > The compiler has no way of knowing that the memory would be undefined

      Yes it would. -fsanitize=address does a bunch of instrumentation - it allocates shadow memory to keep track of what main memory is defined, and it checks every read and write address against the shadow memory. It is a combination of compile-time instrumentation and run-time checking. And yes, it is expensive, so it should be used for debugging and not the final release.

      https://clang.llvm.org/docs/AddressSanitizer.html , https://learn.microsoft.com/en-us/cpp/sanitizers/asan?view=m...

      • bri3d a day ago

        I tried this with clang ASAN. Nothing happens. It won't catch this bug. ASAN detects the presence of incorrect behavior, not the absence of correct behavior.

        There's no use-after-free, use-after-return, use-after-scope, or OOB access here. It's a case of "an allocated stack variable is dynamically read without being initialized only in a runtime case," which afaik no standard analyzer will catch.

        The best way to identify this would be to require all locals to be initialized as a matter of policy (very unlikely to fly in a games studio, especially back then, due to the perceived performance overhead) or to debug with a form of stack initialization enabled, like "-ftrivial-auto-var-init=pattern", which doesn't catch the issue statically but does make it appear pretty quickly in QA (I tested).

        • nayuki a day ago

          Thanks for the investigation. Oops, it seems like MSan (memory sanitizer) is the appropriate tool that detects uninitialized reads? https://stackoverflow.com/questions/68576464/clang-sanitizer...

          I only use UBSan and ASan on my own programs because I tend not to make mistakes about initialization. So my knowledge is incomplete with respect to auditing other people's code, which can have different classes of errors than mine.

          Thank goodness that every language that is newer than C and C++ doesn't repeat these design mistakes, and doesn't require these awkward sanitizer tools that are introduced decades after the fact.

      • maccard a day ago

        This codebase predates ASAN by the best part of a decade.

      • hoten a day ago

        You both may be right. It could be that ASAN is not instrumenting scanf (or some other random standard lib function). Though since 2015, it certainly has been. https://github.com/google/sanitizers/issues/108

        The simpler policy of "don't allow uninitialized locals when declared" would also have caught it with the tools available when the game was made (though a bit ham-fisted).

    • andrewmcwatters a day ago

      Uninitialized variables are a really common case.

      • gmueckl a day ago

        The pointer to the uninitialized variable is passed to scanf, which writes a value there unless it encounters an error. The compiler cannot understand this contract from the scanf declaration alone.

phire a day ago

Good point. When reading, I kind of just assumed the "use of uninitialised memory" warning would pick this up.

But because the whole line is parsed in a single sscanf call, the compiler's static analysis is forced to assume that all of the output variables have now been initialised. There doesn't seem to be any generic static analysis approach that can catch this bug.

Though... you could make a specialised warning just for scanf that forced you to either pass in pre-initialised values or check the return result.
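Short of a new compiler warning, the same idea can be approximated in user code. A rough sketch, assuming GCC or Clang for the `warn_unused_result` attribute; `struct vec3` and `parse_vec3` are hypothetical names for illustration, not anything from the original code. The wrapper takes explicit defaults, so the output is never indeterminate, and asks the compiler to complain if the caller drops the result.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical record type, for illustration only. */
struct vec3 { float x, y, z; };

/* Returns true only if all three fields were parsed. On any failure,
   *out receives the caller-supplied defaults (all-or-nothing), so it
   never holds an indeterminate value. The attribute is a GCC/Clang
   extension that warns when the return value is ignored. */
#if defined(__GNUC__)
__attribute__((warn_unused_result))
#endif
static bool parse_vec3(const char *line, struct vec3 defaults, struct vec3 *out)
{
    struct vec3 tmp = defaults;  /* pre-initialised before sscanf runs */
    int matched = sscanf(line, "%f %f %f", &tmp.x, &tmp.y, &tmp.z);

    *out = (matched == 3) ? tmp : defaults;
    return matched == 3;
}
```

Falling back to the full defaults on a partial match is a design choice in this sketch; keeping the fields that did parse would be just as easy, but it makes a malformed line harder to notice.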