Comment by lelanthran 4 days ago

> Good C code will try to avoid allocations as much as possible in the first place.

I've upvoted you, but I'm not so sure I agree.

Sure, each allocation imposes a new obligation to track it, but the alternative has a cost of its own: passing around already-allocated blocks imposes a burden on each call to ensure that the callees have the correct permissions (to modify it, reallocate it, free it, etc.).

If you're doing any sort of concurrency, this can be hard to track - sometimes it's easier to simply allocate a new block and give it to the callee, and then the caller can forget all about it (the callee then has the obligation to free it).
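
For illustration, a minimal sketch of that hand-off pattern in C (all names invented, error checks omitted): the caller heap-allocates a job, gives it to a worker thread, and never touches it again; the worker is responsible for freeing it.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct job { char *payload; };

    /* The callee owns the job from here on: it must free it. */
    static void *worker(void *arg)
    {
        struct job *job = arg;
        printf("processing: %s\n", job->payload);
        free(job->payload);
        free(job);
        return NULL;
    }

    int main(void)
    {
        struct job *job = malloc(sizeof *job);
        job->payload = strdup("hello");

        pthread_t tid;
        pthread_create(&tid, NULL, worker, job);
        /* The caller forgets about `job` entirely after the hand-off. */
        pthread_join(tid, NULL);
        return 0;
    }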

obviouslynotme 3 days ago

The most important pattern to learn in C is to allocate a giant arena upfront and reuse it over and over in a loop. Ideally, there is only one allocation and deallocation in the entire program. As with all things multi-threaded, this becomes trickier. Luckily, web servers are embarrassingly parallel, so you can just have an arena for each worker thread. Unluckily, web servers do a large amount of string processing, so you have to be careful in how you build them to prevent the memory requirements from exploding. As always, tradeoffs can and will be made depending on what you are actually doing.

Short-run programs are even easier. You just never deallocate and then exit(0).
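
For illustration, a minimal bump-arena sketch of that pattern (a hypothetical allocator, not any particular library, error checks kept minimal): one big malloc per worker, bump allocations out of it while handling a request, then reset the offset instead of freeing anything.

    #include <stddef.h>
    #include <stdlib.h>

    /* One of these per worker thread: a single big allocation, reused. */
    struct arena {
        char  *base;
        size_t cap;
        size_t used;
    };

    static int arena_init(struct arena *a, size_t cap)
    {
        a->base = malloc(cap);
        a->cap  = cap;
        a->used = 0;
        return a->base ? 0 : -1;
    }

    /* Bump allocation: no per-object free, no fragmentation. */
    static void *arena_alloc(struct arena *a, size_t n)
    {
        n = (n + 15) & ~(size_t)15;     /* keep 16-byte alignment */
        if (n > a->cap - a->used)
            return NULL;                /* arena exhausted */
        void *p = a->base + a->used;
        a->used += n;
        return p;
    }

    /* "Deallocation" between requests is just resetting the offset. */
    static void arena_reset(struct arena *a) { a->used = 0; }

    static void arena_destroy(struct arena *a) { free(a->base); }

Each worker would call arena_reset() after finishing a request; the single free happens in arena_destroy() when the worker shuts down.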

  • adrianN 3 days ago

    Arenas are a nice tool, but they don't work for all use cases. In the limit you're reimplementing malloc on top of your big chunk of memory.

    • galangalalgol 3 days ago

      Most games have to do this for performance reasons at some point, and there are plenty of variants to choose from. Rust has libraries for some of them, but in C rolling it yourself is the idiom. One I used in C++, which worked well as a retrofit, was to overload new to grab the smallest chunk that would fit the allocation from banks of fixed-size chunks. Profiling under load let the sizes of the banks be tuned for efficiency. Nothing had to know it wasn't a real heap allocation, but it was way faster, with zero possibility of memory fragmentation.
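
      Roughly the same idea sketched in plain C rather than via an overloaded new (a hypothetical single bank; a real setup would keep several banks of different slot sizes, with counts tuned by profiling):

          #include <stddef.h>
          #include <stdlib.h>

          /* One bank of fixed-size slots; free slots form a linked list
             threaded through the slots themselves. */
          struct bank {
              size_t slot_size;
              void  *free_list;
              char  *storage;
          };

          static int bank_init(struct bank *b, size_t slot_size, size_t nslots)
          {
              if (slot_size < sizeof(void *))
                  slot_size = sizeof(void *);
              b->slot_size = slot_size;
              b->storage   = malloc(slot_size * nslots);
              if (!b->storage)
                  return -1;
              b->free_list = NULL;
              for (size_t i = 0; i < nslots; i++) {   /* thread the free list */
                  void *slot = b->storage + i * slot_size;
                  *(void **)slot = b->free_list;
                  b->free_list = slot;
              }
              return 0;
          }

          static void *bank_alloc(struct bank *b)
          {
              void *slot = b->free_list;
              if (slot)
                  b->free_list = *(void **)slot;
              return slot;            /* NULL means this bank is exhausted */
          }

          static void bank_free(struct bank *b, void *slot)
          {
              *(void **)slot = b->free_list;
              b->free_list = slot;
          }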

      • lifthrasiir 3 days ago

        Most pre-2010 games had to. As someone who worked in gamedev after that period, I can confidently say that it is a relic of the past in most cases now. (Not that I don't care, but I don't have to be that strict about allocations.)

    • juped 3 days ago

      The normal, practical version of this advice - the one that isn't a "guy who just read about arenas" post - is that you generally kick allocations outward: the caller allocates.
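
      A tiny illustration of the caller-allocates style (hypothetical names): the callee only fills storage the caller handed it, snprintf-style, and never allocates or frees anything.

          #include <stdio.h>

          /* The callee never allocates: it writes into storage the caller owns.
             It returns the length it wanted, so the caller can size a retry. */
          static int render_greeting(char *buf, size_t buflen, const char *name)
          {
              return snprintf(buf, buflen, "Hello, %s!\n", name);
          }

          int main(void)
          {
              char buf[64];       /* the caller decides where the memory lives */
              render_greeting(buf, sizeof buf, "world");
              fputs(buf, stdout);
              return 0;
          }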

    • lelanthran 3 days ago

      They don't work for all use-cases, but they most certainly work for this use-case (HTTP server).

  • bheadmaster 3 days ago

    > Ideally, there is only one allocation and deallocation in the entire program.

    Doesn't this technically happen with most modern allocators anyway? They do a lot of work to request new memory from the kernel as rarely as possible.

    • Asmod4n 3 days ago

      Last time I checked, the glibc allocator doesn't ask the OS for new heap memory that often.

      Like, only one in every ~thousand malloc calls invoked (s)brk, and that was it.

  • card_zero 3 days ago

    > there is only one allocation and deallocation in the entire program.

    > Short-run programs are even easier. You just never deallocate and then exit(0).

    What's special about "short-run"? If you deallocate only once, presumably just before you exit, then why do it at all?

    • free_bip 3 days ago

      Just because there's only one deallocation in the code doesn't mean it runs only once. It would likely run once every time the thread it belongs to is torn down, e.g. when that thread has finished processing a request.

  • lelanthran 3 days ago

    I agree, which is why I wrote an arena allocator library I use (somewhere on github, probably public and free).

1718627440 3 days ago

To reduce the amount of allocation, instead of:

    struct parsed_data *parsed_data = parse (...);
    struct process_data *process_data = process (..., parsed_data);
    struct foo_data *foo_data = do_foo (..., process_data);
you can do

    parse (...) {
        ...
        process (...);
        ...
    }

    process (...) {
        ...
        do_foo (...);
        ...
    }
It sounds like a violation of separation of concerns at first, but it has the benefit that you can easily do processing and parsing in parallel, and all the data can become read-only. Also, I was impressed when I looked at a call graph of this, since it essentially becomes the documentation of the whole program.

  • ambicapter 3 days ago

    How testable is this, though?

    • 1718627440 3 days ago

      It might be a problem when you can't afford side effects that you later throw away, but I haven't experienced that yet. The functions still have return codes, so you can still test whether a correct input results in no error path being taken and an incorrect input results in the error path being triggered.

throwawaymaths 3 days ago

Is there any system where the basics of HTTP (everything up to framework handoff of structured data) are done outside of a single concurrency unit?

  • btown 2 days ago

    Not exactly what you’re looking for, but https://github.com/simdjson/simdjson absolutely uses micro-parallel techniques for parsing, and those do need to think about concurrency and how processors handle shared memory in pipelined and branch-predicted operations.