do_not_redeem a day ago

I think it's just the Zig philosophy to care more about binary size than speed. Allocators have the same tradeoff: ArrayListUnmanaged is not generic over the allocator, so every allocation goes through dynamic dispatch. In practice the overhead of allocating or writing a file will dwarf the overhead of an indirect call. Can't argue with those binary sizes.
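
A minimal sketch of that dispatch, assuming the pre-0.15 std names (ArrayListUnmanaged takes the allocator per call, and std.mem.Allocator is a type-erased ptr + vtable pair):

    const std = @import("std");

    pub fn main() !void {
        var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
        defer _ = gpa_state.deinit();
        const gpa = gpa_state.allocator();

        // The list stores no allocator; one is passed in on each call.
        var list: std.ArrayListUnmanaged(u8) = .{};
        defer list.deinit(gpa);

        // `gpa` is a std.mem.Allocator, i.e. a pointer plus a vtable,
        // so the grow inside append goes through an indirect call.
        try list.append(gpa, 42);
    }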

(And before anyone mentions it, devirtualization is a myth, sorry)

kristoff_it a day ago

> (And before anyone mentions it, devirtualization is a myth, sorry)

In Zig it's going to be a language feature, thanks to its single-compilation-unit model.

https://github.com/ziglang/zig/issues/23367

  • do_not_redeem a day ago

    Wouldn't this only work if there's only one implementation throughout the entire compilation unit? If you use 2 allocators in your app, your restricted function type has 2 possible callees for each entry, and you're back to the same problem.

    • Zambyte a day ago

      > A side effect of proposal #23367, which is needed for determining upper bound stack size, is guaranteed de-virtualization when there is only one Io implementation being used (also in debug builds!).

      > In the less common case when a program instantiates more than one Io implementation, virtual calls done through the Io interface will not be de-virtualized, as that would imply doubling the amount of machine code generated, creating massive code bloat.

      From the article

      • yxhuvud a day ago

        I wonder how massive it actually would be. My guess is it really wouldn't be all that massive in practice, even though it is of course easy to construct massive examples written in ways people typically don't write code.

    • thrwyexecbrain a day ago

      Having a limited number of known callees is already better than a virtual function (an unrestricted function pointer). In theory a compiler could devirtualize every two-possible-callee callsite into `if (function_pointer == callee1) callee1() else callee2()`, which can then be inlined at compile time or branch-predicted at runtime.

      In any case, if you have two different implementations of something then you have to switch between them somewhere -- either at compile-time or link-time or load-time or run-time (or jit-time). The trick is to find an acceptable compromise between performance, (machine) code bloat, and API simplicity.
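
      A rough sketch of that guarded dispatch in Zig, using hypothetical readBlocking/readEvented callees to stand in for two known implementations:

        const std = @import("std");

        fn readBlocking() usize {
            return 1;
        }

        fn readEvented() usize {
            return 2;
        }

        const Reader = struct {
            readFn: *const fn () usize,

            // What a compiler could emit (or you could write by hand) when it
            // knows there are exactly two possible callees: compare the pointer
            // and call each target directly, so both branches can be inlined
            // and the comparison can be branch-predicted.
            fn read(self: Reader) usize {
                if (self.readFn == &readBlocking) {
                    return readBlocking();
                }
                return readEvented();
            }
        };

        pub fn main() void {
            const r: Reader = .{ .readFn = &readBlocking };
            std.debug.print("{d}\n", .{r.read()});
        }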

    • throwawaymaths a day ago

      > Wouldn't this only work if there's only one implementation throughout the entire compilation unit

      in practice, how often are people using more than one Io in a program?

      • latch a day ago

        I think having a thread pool on top of some evented IO isn't _that_ uncommon.

        You might have a thread pool doing some very specific thing. You can do your own threadpool that won't use the Io interface. But if one of the tasks in the threadpool wanted to read a file, I guess you'd have to pass in the blocking Io implementation.

        • throwawaymaths 18 hours ago

          one of the Io interfaces provided is a standard threadpool Io. and if it was really important, you could write your own Io interface that selects between std threadpool and std blocking based on an option (I am guessing, I don't know, but it seems reasonable)

      • NobodyNada 14 hours ago

        In larger Rust applications or servers I find myself doing this very often -- for example, one application I'm working on mostly uses blocking I/O for occasional filesystem access but has a little bit of async networking.

lerno 14 hours ago

> care more about binary size than speed

That does not seem to be true if you look at how string formatting is implemented.
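
For context, a minimal sketch of that formatting path, assuming the classic std.fmt API: the format string is a comptime parameter, so specialized code is generated per call site instead of interpreting the format string at runtime.

    const std = @import("std");

    pub fn main() !void {
        var buf: [64]u8 = undefined;
        // The format string is parsed at compile time, so a distinct
        // formatting routine is instantiated for this string/argument-type
        // combination, avoiding runtime format parsing at the cost of some
        // binary size.
        const s = try std.fmt.bufPrint(&buf, "{d} items, {s}", .{ 3, "ok" });
        std.debug.print("{s}\n", .{s});
    }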