Comment by lbalazscs 10 months ago

In 2015 there was no ZGC. Today ZGC (an optional garbage collector optimized for latency) guarantees that there will be no GC pauses longer than a millisecond.

survivedurcode 10 months ago

I would check your answer. These are pauses due to time spent writing to diagnostic outputs, not traditional collection pauses. This affects both jstat and writes of GC logs (i.e. GC log writes will block the app in just the same way).

  • pjmlp 10 months ago

    Which is why for anything serious one should be using Flight Recorder instead.

    • funcDropShadow 10 months ago

      Or /tmp should be a tmpfs as it is on most current Linux distributions.
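
For the Flight Recorder suggestion above, here is a minimal sketch of reading GC pause data through the `jdk.jfr` API instead of jstat or GC logs. It assumes a JDK that ships the `jdk.jfr` module (OpenJDK 11 or later) and that the `jdk.GarbageCollection` event exposes a `longestPause` field, as it does in current OpenJDK builds. The class name `JfrGcPauses` and the helper `longestPauses` are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class JfrGcPauses {
    /** Records GC events while work runs and returns the longest pause of each collection. */
    static List<Duration> longestPauses(Runnable work) {
        try {
            Path dir = Files.createTempDirectory("jfr-gc");
            Path file = dir.resolve("gc.jfr");
            try (Recording recording = new Recording()) {
                recording.enable("jdk.GarbageCollection"); // only the event we care about
                recording.start();
                work.run();
                recording.stop();
                recording.dump(file);
            }
            List<Duration> pauses = new ArrayList<>();
            for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
                if (event.getEventType().getName().equals("jdk.GarbageCollection")) {
                    pauses.add(event.getDuration("longestPause"));
                }
            }
            Files.deleteIfExists(file);
            Files.deleteIfExists(dir);
            return pauses;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical workload: churn some garbage, then request a collection.
        List<Duration> pauses = longestPauses(() -> {
            for (int i = 0; i < 200_000; i++) {
                byte[] garbage = new byte[1024];
            }
            System.gc();
        });
        System.out.println("GC events seen: " + pauses.size());
        pauses.forEach(p -> System.out.println("longest pause: " + p.toNanos() / 1_000 + " us"));
    }
}
```

Whether any events show up depends on the collector and on whether the JVM honors `System.gc()`, so the list may be empty on some configurations.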

esaym 10 months ago

These modern garbage collectors are not simply free though. I got bored last year and went on a deep dive with GC params for Minecraft. For my needs I ended up with: -XX:+UseParallelGC -XX:MaxGCPauseMillis=300 -Xmx2G -Xms768M

When flying around in spectator mode, you'd see 3 to 4 processes using 100%. Changing to more modern collectors just added more load to the system. ZGC was the worst, with 16+ processes all using 100% cpu. With the ParallelGC, yes you'll get the occasional pause but at least my laptop is not burning hot fire.
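
When experimenting with flags like the ones above, it helps to confirm which collector the JVM actually picked and how much time it reports spending in collections. A small sketch using the standard `java.lang.management` beans follows; the class name `GcReport` is made up, and the reported time is the collectors' own accumulated figure, not a measure of CPU load from concurrent GC threads.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    /** Names of the collector beans registered by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Sum of accumulated collection times in ms; undefined (-1) values are skipped. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Max heap (MiB): " + maxHeapMiB);
        System.out.println("Accumulated GC time (ms): " + totalCollectionMillis());
    }
}
```

Running the same class under different `-XX:+Use...GC` options gives a quick check that each option took effect before comparing CPU usage.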

  • plandis 10 months ago

    Yes, no GC is free (well, perhaps Epsilon comes close :)

    It’s a low pause GC so latencies, particularly tail latencies, can be more predictable and bounded. The tradeoff you make is that it uses more CPU time and memory in order to operate.

  • mike_hearn 10 months ago

    Minecraft really needs generational ZGC (totally brand new) because Minecraft generates garbage at prodigious rates and non-generational GC collects less garbage per unit time.

  • namibj 10 months ago

    You'll need more spare heap for ZGC.

    • ackfoobar 10 months ago

      And using generational ZGC will probably lower CPU usage a lot.

  • tuna74 10 months ago

    Yes, this is why GCs work so badly for 3D games, since you are usually limited by memory bandwidth and latency, especially on systems with unified RAM (no separate GPU RAM).

kanzenryu2 10 months ago

Sadly, in many cases no; it's not magic. This nirvana is restricted to cases where there is CPU bandwidth available (e.g. some cores idle) and plenty of free RAM. When either CPU or RAM is less plentiful... hello pauses my old friend.

  • sunshowers 10 months ago

    This is why memory-bound services generally use languages without mandatory GC. Tail latency is a killer.

    Rust's memory management does have some issues in practice (large synchronous drops) but they're relatively minor and easily addressed compared to mandatory GC.

    • foobarchu 10 months ago

      In cases where Java is unavoidable and you're working with large blocks, it is possible to sort of skirt around the GC with certain kinds of large buffers that live outside the heap.

      I've used these to great success when I had multiple long-lived gigabyte+ arrays. Without off-heap memory, these tended to really slow the GC down (to be fair, I didn't have top-of-the-line GC algorithms because the OpenJ9 JVM had been mandated).

      • pkolaczk 10 months ago

        Managing off-heap memory in Java is a pain, even worse than manual memory management in C. Unlike C++ and Rust, Java offers no tools for manual memory management, and its idioms, like frequent use of exceptions, make writing such code extremely error-prone.

        • foobarchu 9 months ago

          ByteBuffers and direct memory make it possible.

          https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffe...

          But it is a pain and only really useful if you have a big, long-lived object. In my case it was loading massive arrays into memory for access by the API server frontend. They needed to be completely overwritten once an hour, and it turns out that allocating 40% of system memory then immediately releasing another 40% back to the GC at once is a good recipe for long pauses or high CPU use.
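
A minimal sketch of the direct-buffer approach described above, assuming a table of `long` values that is overwritten in place rather than reallocated on each refresh. The class `OffHeapTable` and its methods are made up for illustration; only `ByteBuffer.allocateDirect` and the absolute `getLong`/`putLong` calls come from `java.nio`. A single `ByteBuffer` is capped at `Integer.MAX_VALUE` bytes, so truly gigabyte+ data would need several buffers, and this sketch does nothing to protect readers during a reload.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class OffHeapTable {
    // Direct buffer: the backing storage lives outside the Java heap,
    // so the GC does not scan or copy its contents.
    private final ByteBuffer buf;
    private final int capacity;

    OffHeapTable(int capacity) {
        this.capacity = capacity;
        int bytes = Math.multiplyExact(capacity, Long.BYTES); // fail loudly on overflow
        this.buf = ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }

    void set(int index, long value) {
        buf.putLong(index * Long.BYTES, value); // absolute write, bounds-checked by ByteBuffer
    }

    long get(int index) {
        return buf.getLong(index * Long.BYTES); // absolute read
    }

    boolean isOffHeap() {
        return buf.isDirect();
    }

    /** Overwrites the table in place, reusing the same off-heap allocation. */
    void reload(long[] fresh) {
        if (fresh.length > capacity) {
            throw new IllegalArgumentException("fresh data exceeds capacity " + capacity);
        }
        for (int i = 0; i < fresh.length; i++) {
            set(i, fresh[i]);
        }
    }

    public static void main(String[] args) {
        OffHeapTable table = new OffHeapTable(4);
        table.reload(new long[] {10, 20, 30, 40});
        System.out.println(table.get(2));      // prints 30
        System.out.println(table.isOffHeap()); // prints true
    }
}
```

Reusing one allocation avoids the allocate-then-release pattern mentioned above, at the cost of needing some external coordination if other threads read the table while it is being refreshed.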

hawk_ 10 months ago

ZGC doesn't remove safepoint requests on threads which is the root cause. "Guarantees" here are with very heavy quotes.

  • funcDropShadow 10 months ago

    But it reduces the number of safepoint requests by doing more work concurrently with the running application.

hinkley 10 months ago

A GC implementation that avoids ineffective GC activity is less affected by the cost of statistics gathering and telemetry (no news is good news), but it is still affected.