Comment by pizlonator

Comment by pizlonator 2 days ago

12 replies

It's super fun to write C and C++ code in Fil-C because it's like this otherworldly crossover between Java and C/C++:

- Unlike Java, you get fantastic startup times.

- Unlike Java, you get access to actual syscall APIs.

- Unlike Java, you can leverage the ecosystem of C/C++ libraries without having to write JNI wrappers (though you do have to be able to compile those libraries with Fil-C).

- Like Java, you can just `new` or `malloc` without `delete`ing or `free`ing.

It's so fun!

kragen 2 days ago

I like C, have a probably unhealthy relationship with C++ where I am amazed by what it can do and then get unrealistic expectations it keeps failing to fulfill, and don't really like Java.

You know Julia Ecklar's song where she says that programming in assembler is like construction work with a toothpick for a tool? I feel like C, C++, or Java are like having a teaspoon instead. Maybe Java is a tablespoon. I'd rather use something like OCaml or a sane version of Python without the Mean Girls community infighting. I just haven't found it.

On the other hand, the supposedly more powerful languages don't have a great record of shipping highly usable production software. There's no Lisp or Ruby or Lua alternative to Firefox, Linux, or LLVM.

skissane 2 days ago

> Like Java, you can just `new` or `malloc` without `delete`ing or `free`ing.

Is your intention that people use the Fil-C garbage collector instead of free()? Or is it just a backstop in case of memory leak bugs?

Can the GC be configured to warn or panic if something is GCed without free()? Then you could detect memory leak bugs by recompiling with Fil-C - with less overhead than valgrind, although I’m guessing still more than ASan - but more choices is always a good thing.

  • phire a day ago

    Yeah, it panics when you use after free. [1]

    But I'm not sure it's worth porting your code to Fil-C just to get that property. Because Fil-C still needs to track the memory allocation with its garbage collector. there isn't much advantage to even calling free. If you don't have a use-after-free bug, then it's just adding the overhead of marking the allocation as freed.

    And if you do have a use-after-free bug, you might be better off just letting it silently succeed, as it would in any other garbage collected language. (Though, probably still wise to free when the data in the allocation is now invalid).

    IMO, if you plan to use Fil-C in production, then might as well lean on the garbage collector. If you just want memory safety checking during QA, I suspect you are better off sticking with ASan. Though, I will note that Fil-C will do a better job at detecting certain types of memory issues (but not use-after-free)

    [1] See the "Use After Free example on: https://fil-c.org/invisicaps_by_example

    • skissane a day ago

      > Yeah, it panics when you use after free.

      I wasn’t talking about use-after-free, I was talking about memory leaks - when you get a pointer from malloc(), and then you destroy your last copy of the pointer without ever having called free() on it.

      Can the GC be configured to warn/panic if it deallocates a memory block which the program failed to explicitly deallocate?

    • rzerowan a day ago

      Question , would this be a desirable outcome for drivers. As far as i can tell most kernel driver crashes are the ones that would benefit from such protection.Plus obviate the need to do full rewrites - if such a GC can protect from the faults and help with the recovery.Assuming the GC after recovery process is similar to erlang BEAM where a reload can bring back healthy state.

  • pizlonator a day ago

    > Is your intention that people use the Fil-C garbage collector instead of free()? Or is it just a backstop in case of memory leak bugs?

    Wow great question!

    My intention is to give folks powerful options. You can choose:

    - Compile your code with Fil-C while still maintaining it for Yolo-C. In that case, you'll be calling free(). Fil-C's free() behavior ensures no GC-induced leaks (more on that below) so code that does this will not have leaks in Fil-C.

    - Fully adopt Fil-C and don't look back. In that case, you'll probably just lean on the GC. You can still fight GC-induced leaks by selectively free()ing stuff.

    - Both of the above, with `#ifdef __FILC__` guards to select what you do. I think you will want to do that if your C program has a custom GC (this is exactly what I did with emacs - I replaced its super awesome GC with calls to my GC) or if you're doing custom arena allocations (arenas work fine in Fil-C, but you get more security benefit, and better memory usage, if you just replace the arena with relying on GC).

    The reason why the GC is there is not as a backstop against memory leaks, but because it lets me support free() in a totally sound way with deterministic panic on any use-after-free. Additionally, the way that the GC works means that a program that free()s memory is immune to GC-induced memory leaks.

    What is a GC-induced leak? For decades now, GC implementers like me have noticed the following phenomena:

    - Someone takes a program that uses manual memory management and has no known leaks or crashes in some set of tests, and converts it to use GC. The result is a program that leaks on that set of tests! I think Boehm noticed this when evangelizing his GC. I've noticed it in manual conversions of C++ code to Java. I've heard others mention it in GC circles.

    - Someone writes a program in a GC'd language. Their top perf bug is memory leaks, and they're bad. You scratch your head and wonder: wasn't the whole point of GC to avoid this?

    Here's why both phenomena happen: folks have a tendency keep dangling pointers to objects that they are no longer using. Here's an evil example I once found: there's a Window god-object that gets created for every window that gets opened. And for reasons, the Window has a previousWindow pointer to the Window from which the user initiated opening the window. The previousWindow pointer is used in initialization of the Window, but never again. Nobody nulled previousWindow.

    The result? A GC-induced leak!

    In a malloc/free program, the call to previousWindow.destroy() (or whatever) would also delete (free()) the object, and you'd have a dangling pointer. But it's fine because nobody dereferences it. It's a correct case of dangling pointers! But in the GC'd program, the dangling program keeps previousWindow around, and then there's previousWindow.previousWindow, and previousWindow.previousWindow.previousWindow, and... you get the idea.

    This is why Fil-C's answer to free() isn't to just ignore it. Fil-C strongly supports free():

    - Freeing an object immediately flags the capability as being empty and free. No memory accesses will succeed on the object anymore.

    - The GC does not scan any outgoing references from freed objects (and it doesn't have to because the program can't access those references). Note that there's almost a race here, except https://fil-c.org/safepoints saves us. This prevents previousWindow.previousWindow from leaking.

    - For those pointers in the heap that the GC can mutate, the GC repoints the capability to the free'd singleton instead of marking the freed object. If all outstanding pointers to a freed object are repointable, then the object won't get marked, and will die. This prevents previousWindow from leaking.

    > Can the GC be configured to warn or panic if something is GCed without free()?

    Nope. Reason: the Fil-C runtime itself now relies on GC, and there's some functionality that only a GC can provide that has proven indispensable for porting some complex stuff (like CPython and Perl5).

    It would take a lot of work to convert the Fil-C runtime to not rely on GC. It's just too darn convenient to do nasty runtime stuff (like thread management and signal handling) by leaning on the fact that the GC prevents stuff like ABA problems. And if you did make the runtime not rely on GC, then your leak detector would go haywire in a lot of interesting ports (like CPython).

    But, I think someone might end up doing this exercise eventually, because if you did it, then you could just as well build a version of Fil-C that has no GC at all but relies on the memory safety of sufficiently-segregated heaps.

    • gkfasdfasdf 19 hours ago

      Is it possible to use Fil-C as a replacement for valgrind/address sanitizer/leak sanitizer? I.e. say I have a C program that does manual memory management already. Can I then compile it with Fil-C and have it panic/assert on heap use after free, uninitialized memory read (including stack), array out of bounds read, etc?

    • kragen a day ago

      Okay, that's brilliant. I didn't even imagine that the GC-induced leak problem was even solvable. I guess the freed-but-not-GCed object could be arbitrarily large, but that's almost never going to be a gradual leak.

      What's awesome about the Emacs GC?

      • pizlonator a day ago

        Even if you have a large freed object, it’ll really get GC’d unless it’s referenced from a root that the GC can’t edit. I allow for such things to simplify the runtime, but they’re rare.

        As a GC dev I just found the emacs GC to be so nicely engineered:

        - The code is a pleasure to read. I understood it very quickly.

        - lots of features! Very sophisticated weak maps, weak references, and finalizers. Not to mention support for heap images (the portable dumper).

        - the right amount of tuning but nothing egregious.

        It’s super fun to read high quality engineering in an area that I am passionate about!

    • skissane a day ago

      > Nope. Reason: the Fil-C runtime itself now relies on GC, and there's some functionality that only a GC can provide that has proven indispensable for porting some complex stuff (like CPython and Perl5).

      What if there was a flag you could set on an allocation, “must be freed”. An app can set the “must be freed” flag on its allocations, meaning when the GC collects the allocation, it checks if free() has been called on it, and if it hasn’t, it logs a warning (or even panics), depending on process configuration flags. Meanwhile, internal allocations by the runtime won’t set that flag, so the GC will never panic/warn on collecting them.

      • pizlonator 21 hours ago

        Internal allocations can have pointers to user allocations

        • skissane 18 hours ago

          Right… but how is that a problem for my proposal? I’m not seeing the issue.