Comment by kstrauser
Comment by kstrauser 5 days ago
Stuff like this is what keeps me coming back here. Thanks for posting this!
What's hard about using TCMalloc if you're not using bazel? (Not asking to imply that it's not, but because I'm genuinely curious.)
It’s just a huge pain to build and link against. Before the bazel 7.4.0 change your options were basically:
1. Use it as a dynamically linked library. This is not great because you’re taking at a minimum the performance hit of going through the PLT for every call. The forfeited performance is even larger if you compare against statically linking with LTO (i.e. so that you can inline calls to malloc, get the benefit of FDO , etc.). Not to mention all the deployment headaches associated with shared libraries.
2. Painfully manually create a static library. I’ve done this, it’s awful; especially if you want to go the extra mile to capture as much performance as possible and at least get partial LTO (i.e. of TCMalloc independent of your application code, compiling all of TCMalloc’s compilation units together to create a single object file).
When I was at Meta I imported TCMalloc to benchmark against (to highlight areas where we could do better in Jemalloc) by pain-stakingly hand-translating its bazel BUILD files to buck2 because there was legitimately no better option.
As a consequence of being so hard to use outside of Google, TCMalloc has many more unexpected (sometimes problematic) behaviors than Jemalloc when used as a general purpose allocator in other environments (e.g. it basically assumes that you are using a certain set of Linux configuration options [1] and behaves rather poorly if you’re not)
[1] https://google.github.io/tcmalloc/tuning.html#system-level-o...