Comment by vodou a day ago

After years of maintaining and using an application suite that relies on multicast for internal communication, I would hesitate to use "reliable" and "multicast" in the same sentence. Multicast is great in theory, but comes with so many pitfalls and grievances in practice, mostly due to unreliable handling of multicast by switches, routers, network adapters, and the TCP/IP stacks in operating systems.

Just to mention a few headaches I've been dealing with over the years: multicast sockets that join the wrong network adapter interface (due to adapter priorities), losing multicast membership after resume from sleep/hibernate, switches/routers just dropping multicast membership after a while (especially when running in VMs and on "enterprise" systems like SUSE Linux and Windows Server), all kinds of socket-reuse problems, etc.
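Several of those headaches come down to never letting the OS guess which interface a membership applies to. A minimal sketch in Python (the group, port, and interface address below are made-up placeholders, not from the system described) of joining a group on an explicitly named interface instead of INADDR_ANY:

```python
import socket
import struct

def make_membership_req(group: str, iface_ip: str) -> bytes:
    """Build the ip_mreq structure for IP_ADD_MEMBERSHIP, naming the
    interface explicitly instead of leaving it as INADDR_ANY."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface_ip))

def join_on_interface(group: str, port: int, iface_ip: str) -> socket.socket:
    """Bind a UDP socket and join `group` on the NIC that owns `iface_ip`,
    so adapter-priority rules can't silently pick a different interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_membership_req(group, iface_ip))
    return sock

# Hypothetical usage: join_on_interface("239.1.2.3", 5000, "192.168.1.10")
```

It doesn't fix switches forgetting memberships, but it does take the "joined the wrong adapter" failure mode off the table.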

I don't even dare to think about how many hours I have wasted on issues listed above. I would never rely on multicast again when developing a new system.

But that said, the application suite, a mission control system for satellites, works great most of the time (typically on small, controlled subnets, using physical installations instead of VMs) and has served us well.

twic a day ago

I recently finished eight years at a place where everyone used multicast every day. It consistently worked very well (except for the time when the networks team just decided one of my groups was against policy and firewalled it without warning).

But this was because the IT people put effort into making it work well. They knew we needed multicast, so they made sure multicast worked. I have no idea what that involved, but presumably it means buying switches that can handle multicast reliably, and then configuring them properly, and then doing whatever host-level hardware selection and configuration is required.

In a previous job, we tried to use multicast without having done any groundwork. Just opened sockets and started sending. It did not go so well - fine at first, but then packets started to go missing, and we spent days debugging and finding the obscure errors in our firewall config. In the end, we did get it working, but I wouldn't have done it again. Multicast is a commitment, and we weren't ready to make it.

  • mrkstu a day ago

    Yep - the main issue is that multicast is so sparsely utilized that you can go through most of a career in networking with minimal exposure to it, beyond maybe a particular peer link. Once you scale support to multi-hop, institutional knowledge becomes critical, because individual knowledge is so spotty.

dahfizz a day ago

Aeron is very popular in large financial trading systems. Maybe because multicast is already commonplace there (that's how most exchanges distribute market data).

mgaunard a day ago

"reliable" means that if one of the recipients observes a gap, it can ask for a replay of the missing packets.

DanielHB a day ago

Printers seem to be a solved problem and they mostly use zeroconf which uses mDNS (multicast DNS). I have done a bit of work in the area and I didn't run into the problems you mentioned.

However, I had only semi-strict control of my network, though I used plenty of random routers for testing.

  • zamadatix a day ago

    Link-local multicast like mDNS can be a bit simpler to wrangle than routed multicast. For the link-local case, a lot of the interop failure cases with network equipment just devolve into "and it turned into a broadcast" instead of "and it wasn't forwarded". You can still run into some multiple-interface issues, though.
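    The sender side of the link-local case is also easy to pin down. A sketch (the interface address is a hypothetical example) that keeps datagrams on the local link with TTL=1 - the TTL mDNS traffic uses - and names the outgoing interface explicitly, which sidesteps most of the multiple-interface ambiguity:

    ```python
    import socket

    MDNS_GROUP, MDNS_PORT = "224.0.0.251", 5353  # well-known mDNS group/port

    def linklocal_sender(iface_ip: str) -> socket.socket:
        """UDP socket for link-local multicast out of one specific NIC."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # TTL=1: routers will not forward the datagram off the local link.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        # Pin the outgoing interface; with several NICs the OS would
        # otherwise pick one by its own routing/priority rules.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(iface_ip))
        return sock

    # Hypothetical usage:
    # linklocal_sender("192.168.1.10").sendto(query, (MDNS_GROUP, MDNS_PORT))
    ```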