Comment by gigatexal
Comment by gigatexal a day ago
Use a debugger, folks. A 10x dev cited this story to me about the ills of not using one.
This is a game; I don't think a debug configuration (with checks for things like this enabled) would run fast enough to be playable on contemporary hardware.
That's not accurate.
Generally, game console "debug" configurations aren't "true" debug builds in the sense most people mean -- optimizations are still globally enabled, but the build has a number of debug systems enabled that naturally require a devkit. Devkits, especially back then, generally had 2-3x as much memory as retail systems -- so you'd happily sacrifice framerate during feature development to have those systems enabled.
Debugging was (and still is) generally done on optimized builds and, once you know the general area of the problem, you simply disable optimizations for that file or subsystem if you can't pinpoint the issue in an optimized build.
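To make "disable optimizations for that file or subsystem" concrete, here is a minimal sketch using the per-region pragmas that Clang, MSVC and GCC each provide. The function `suspect_sum` is a hypothetical stand-in for whatever code you are trying to step through; the rest of the translation unit keeps whatever optimization level the build uses.

```c
#include <assert.h>
#include <stddef.h>

/* Turn optimizations off for the code that follows, using whichever
   pragma the current compiler understands. Unknown compilers are left alone. */
#if defined(__clang__)
#pragma clang optimize off
#elif defined(_MSC_VER)
#pragma optimize("", off)
#elif defined(__GNUC__)
#pragma GCC push_options
#pragma GCC optimize("O0")
#endif

/* Hypothetical function under investigation: with optimizations off,
   locals stay in memory and single-stepping follows the source. */
int suspect_sum(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Restore the build's normal optimization settings for everything below. */
#if defined(__clang__)
#pragma clang optimize on
#elif defined(_MSC_VER)
#pragma optimize("", on)
#elif defined(__GNUC__)
#pragma GCC pop_options
#endif
```

In practice many teams flip this at the build-system level per file instead; the pragma form is just the quickest way to scope it to a handful of functions.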
The biggest performance hit, in general, comes from disabling optimizations in the compiler. I say "in general" because there are systems that might be used to find this kind of thing that DO make a game wholly unplayable, such as a stomp allocator. Of course, you wouldn't generally enable a stomp allocator across all your allocations unless you're desperate, so you could still have that enabled to find this kind of bug and end up with a playable game.
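For anyone who hasn't met one, a stomp allocator can be sketched roughly as follows. This assumes a POSIX system (`mmap`/`mprotect`); the names `stomp_alloc` and `stomp_free` are hypothetical and this is not any engine's actual implementation. Each allocation gets its own pages, and the block is pushed up against an inaccessible guard page so that writing one byte past the end faults immediately.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stomp allocator: the returned block ends exactly where a
   PROT_NONE guard page begins. Alignment is deliberately sacrificed. */
void *stomp_alloc(size_t size) {
    if (size == 0) return NULL;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page; /* data pages + one guard page */

    uint8_t *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return NULL;

    uint8_t *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size; /* last valid byte sits right before the guard */
}

/* The caller passes the original size so the mapping can be recovered. */
void stomp_free(void *ptr, size_t size) {
    assert(ptr != NULL && size > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    uint8_t *guard = (uint8_t *)ptr + size;
    munmap(guard - data_pages * page, (data_pages + 1) * page);
}
```

A whole page (or more) per allocation plus a syscall per alloc/free is where the "wholly unplayable" cost comes from, which is why you'd normally route only a suspect subsystem through something like this.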
The more likely reason here is that no one noticed or cared. GTA:SA is 21 years old and this bug doesn't affect the Xbox or other versions.
From GP:
> (with checks for things like this enabled)
You can (and could) easily compile an optimized build with debug symbols to track down sources of issues, but catching a bug like this would likely take a dynamic checker like Valgrind or MSan, which do not allow for any optimizations if you want to avoid false negatives, and add even more overhead on top of that. (Valgrind with its full processor-level virtualization, and MSan with its shadow state on every access. But MSan didn't exist at the time, and Valgrind barely existed.)
At minimum, fine-grained stack randomization might have exposed the issue, but only if it happened to be spotted in playtests on the debug build.
How could a stomp allocator have possibly found this bug? The offending values are stored on the stack, in-bounds when written to, and again in-bounds when read from.
At no point is there an OOB access, just a failure to initialize stack variables. And to catch that, you'd need either MSan-style shadow state that didn't exist, thorough playtesting with fine-grained stack randomization, or some sort of poisoning that I don't think existed.
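As a rough illustration of what "some sort of poisoning" could look like, here is a minimal sketch. Everything in it is hypothetical (the `Tuning` struct, the field names, the `0xCD` fill byte, the `sscanf`-based parser) and it is not the game's actual code. The locals are filled with a recognizable byte pattern before parsing, so a field the parser never wrote shows up as poison rather than as whatever the previous call left on the stack.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define POISON_BYTE 0xCD /* arbitrary, easy to spot in a memory view */

/* Hypothetical record of four tuning values parsed from a text line. */
typedef struct {
    float mass, drag, lift, thrust;
} Tuning;

/* Returns 1 if every byte of the field still holds the poison pattern. */
static int still_poisoned(const void *field, size_t size) {
    const unsigned char *p = field;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != POISON_BYTE) return 0;
    }
    return 1;
}

/* Parses a line into a poisoned stack local and counts the fields that
   the parser left untouched (i.e. that would have been uninitialized). */
int count_unwritten_fields(const char *line) {
    assert(line != NULL);
    Tuning t;
    memset(&t, POISON_BYTE, sizeof t);

    /* sscanf stops at the first missing value and leaves the remaining
       arguments exactly as they were. */
    (void)sscanf(line, "%f %f %f %f", &t.mass, &t.drag, &t.lift, &t.thrust);

    return still_poisoned(&t.mass, sizeof t.mass)
         + still_poisoned(&t.drag, sizeof t.drag)
         + still_poisoned(&t.lift, sizeof t.lift)
         + still_poisoned(&t.thrust, sizeof t.thrust);
}
```

Calling `count_unwritten_fields("1.0 2.0")` reports two untouched fields. In this toy case checking `sscanf`'s return value would catch the short line too; the poison fill is the part that generalizes to code paths with no such return value.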
The problem with Valgrind/ASan/MSan is that you have to start using these tools early in the development process. It can't be a "checklist" item before launch, or you'll have an insurmountable number of bugs, often baked in such that fixing one causes additional changes that introduce unrelated bugs.
I tried to use Valgrind to catch pretty much this exact bug 20 years ago, and it was nigh impossible. If you call any third-party code, it'll flag tens of thousands of false positives that you have to sift through. And that was on a small game engine; I can't imagine running it on millions of lines of code.
I always wonder: why not write these games on top of a virtual machine, like Carmack started doing in Quake, a usage he later extended to Quake 2 and 3 [1]?
I'm ignorant about game development, virtual machines and system programming but from the little I understand it seems a sensible choice to make.
While there is an initial price to pay, modeling 99% of the game to be implemented on a user-implemented stack seems a sensible approach to me.
[1] https://fabiensanglard.net/quake3/qvm.php
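To give a feel for what a "user-implemented stack" means, here is a toy stack-machine interpreter. It is only a sketch: the opcodes, error codes and `vm_run` are made up for illustration and have nothing to do with how the QVM in [1] is actually built. The point is that every stack access goes through a bounds check and a slot can only be read after it has been pushed, so a buggy program gets an error code instead of reading stale host memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction set for the toy VM. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };
enum { VM_OK = 0, VM_ERR_STACK = -1, VM_ERR_OPCODE = -2, VM_ERR_END = -3 };

#define VM_STACK_SIZE 64

/* Runs `code` (len words) and stores the top of stack in *result on HALT.
   Returns VM_OK or one of the VM_ERR_* codes; never touches memory
   outside its own stack array. */
int vm_run(const int32_t *code, size_t len, int32_t *result) {
    assert(code != NULL && result != NULL);
    int32_t stack[VM_STACK_SIZE]; /* slots are only read after being pushed */
    size_t sp = 0, pc = 0;

    while (pc < len) {
        int32_t op = code[pc++];
        switch (op) {
        case OP_PUSH:
            if (pc >= len) return VM_ERR_END;          /* missing operand */
            if (sp >= VM_STACK_SIZE) return VM_ERR_STACK; /* overflow */
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
        case OP_MUL: {
            if (sp < 2) return VM_ERR_STACK;           /* underflow */
            /* Unsigned arithmetic avoids signed-overflow UB in the host. */
            uint32_t b = (uint32_t)stack[--sp];
            uint32_t a = (uint32_t)stack[--sp];
            stack[sp++] = (int32_t)(op == OP_ADD ? a + b : a * b);
            break;
        }
        case OP_HALT:
            if (sp < 1) return VM_ERR_STACK;
            *result = stack[sp - 1];
            return VM_OK;
        default:
            return VM_ERR_OPCODE;
        }
    }
    return VM_ERR_END; /* ran off the end without HALT */
}
```

For example, the program `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT` evaluates (2 + 3) * 4, while a lone `ADD` on an empty stack comes back as `VM_ERR_STACK`. The trade-off is interpretation overhead on every instruction, which is the "initial price" mentioned above.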