Comment by rdtsc

Comment by rdtsc 2 days ago

9 replies

> makes me wish that it was not multi-processing and Erlang, but a more mainstream language with a better library ecosystem and focused on multi-threading instead, Rust comes to mind but there could be a better one.

I am not quite following, why would we drop multi-processing with isolated tiny heaps with a few KBs each and move to multi-threading with a large multi-megabyte stacks per thread and all sharing and writing to the same heap? It seems like a step backward.

> Rust comes to mind but there could be a better one.

I can see the excitement, but at the same time this is becoming a meme. I mean the drive-by comments of "why don't your rewrite it in Rust" on every repo or "show HN" post. It's kind of like nodejs from years past: "why don't your write in nodejs, it's webscale!". The problem is, after a while it starts to have an inverse dose response: it makes people run away from Rust just based on that kind of stuff.

SigmundA 2 days ago

>I am not quite following, why would we drop multi-processing with isolated tiny heaps with a few KBs each and move to multi-threading with a large multi-megabyte stacks per thread and all sharing and writing to the same heap? It seems like a step backward.

Performance typically, hard to work in parallel on large amount of data performantly without multiple thread sharing a heap and you typically don't need large amount of threads because you don't actually have that many real cores to run them on.

Lots of little share nothing process are great conceptually but that does create significant overhead.

  • rdtsc 2 days ago

    > Lots of little share nothing process are great conceptually but that does create significant overhead.

    It doesn't, really? I have clusters running 1M+ Erlang processes comfortably per node.

    > you typically don't need large amount of threads

    Exactly, that's why Erlang only spawns just the right amount of threads. One scheduler thread per CPU, then a bunch of long running CPU task threads (same number as CPUs as well), plus some to do IO (10-20) and that's it.

    • SigmundA 2 days ago

      Erlang is not as performant for heavy computational loads, this is bared out in many benchmarks, thats not what it's good at.

      Message passing share nothing adds overhead when trying to reference data between processes because it must be copied, how would you do multithreaded processing of a large amount of data without a shared heap in a performant way? Only one thread can work on a heap at a time, so what do you do? Chop up the data and copy it around then piece it back together afterward? Thats overhead vs just working on a single heap in a lockless way. Far as I can tell the main Erlang image processing libraries just call out to C libraries that says something of that kind of work.

      Yes Erlang indirects computation to a OS thread pool, multiplexing all those little Erlang process on real threads creates scheduling overhead. Those threads cannot work on the same data at the same time unless they call out to a libraries written in another language like C to do the heavy lifting.

      .Net does similar things for say web server implementations, it uses a thread pool to execute many concurrent requests and if you use async it can yield those threads back to the pool while say waiting on a DB call to complete, you would not create a thread per http connection so the 4mb stack size is not an issue just like its not with Erlangs thread pool.

      • rdtsc 2 days ago

        > Erlang is not as performant for heavy computational loads

        Well, sure don't use it for heavy computational loads if it doesn't work for you. It works great for our use case and I think Node-RED is also not for heavy computational workload, but rather for event programming, where lots of events happen concurrently.

        > how would you do multithreaded processing of a large amount of data without a shared heap in a performant way?

        That's a specific workload, it's not universal, for sure. I haven't worked on cases there is a single large data structure and all million clients have to manipulate it concurrently. Not saying it's implausible, I can see a game server with an arena maybe being like that, but there are other systems that work like that.

      • marci 2 days ago

        Nobody choses erlang for heavy computational loads, but for ease of use and a simple mental model. You do heavy compute on it mostly if you don't want to be bothered. Erlang's multi-process thing, it's not to be efficient at compute, it's to be efficient at crash recovery and distributed compute. It was there before multi-core multithread were a thing.