Rootless Pings in Rust
(bou.ke) 122 points by bouk a day ago
> The issue is that the rust library apparently conflates datagram and UDP, when they're not the same thing.
It comes down to these two lines (using full item paths for clarity):
let socket = socket2::Socket::new(socket2::Domain::IPV4, socket2::Type::DGRAM, Some(socket2::Protocol::ICMPV4))?;
let socket: std::net::UdpSocket = socket.into();
The latter is using this impl: https://docs.rs/socket2/0.6.1/socket2/struct.Socket.html#imp...
Basically the `socket2` crate lets you convert the fd it produces into a `UdpSocket`. It doesn't verify it really is a UDP socket first; that's up to you. If you do it blindly, you can get something with the wrong name, but it's probably harmless. (At the very least, it doesn't violate memory safety guarantees, which is what Rust code tends to be very strict about.)
`UdpSocket` itself has a `From<OwnedFd>` impl that similarly doesn't check it really is a UDP socket; you could convert the `socket2::Socket` to an `OwnedFd` then that to a `UdpSocket`. https://doc.rust-lang.org/stable/std/net/struct.UdpSocket.ht... https://docs.rs/socket2/0.6.1/socket2/struct.Socket.html#imp...
It may be memory safe but it's not using the type system to represent the domain very well.
One could imagine a more type-friendly design in which we could write that first line as follows:
let socket: Socket<IPv4, Datagram, IcmpV4> = Socket::new()?;
Now, the specifics of socket types will be statically checked.
Edit: I realized that the issue here is actually the conversion, and that UdpSocket on its own is actually a type-safe representation of a UDP socket, not a general datagram socket. But the fact that this dubiously-safe conversion is possible and even useful suggests that an improved design is possible. For example, a method like UdpSocket's `set_broadcast` can't work with a socket like the above, and from a type safety perspective, it shouldn't be possible to call it on such a socket.
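To make the typed design concrete, here is a rough sketch using marker types and `PhantomData`. Every name in it (`Domain`, `SockType`, `Protocol`, `IPv4`, `Datagram`, `IcmpV4`, `Udp`, and this `Socket`) is hypothetical and exists in neither std nor socket2. The numeric constants are assumed to match Linux and macOS, and `new()` only records the types instead of calling socket(2).

```rust
use std::io;
use std::marker::PhantomData;

// Hypothetical marker traits; none of these names exist in std or socket2.
pub trait Domain { const RAW: i32; }
pub trait SockType { const RAW: i32; }
pub trait Protocol { const RAW: i32; }

pub struct IPv4;
pub struct Datagram;
pub struct IcmpV4;
pub struct Udp;

// Assumed constant values (Linux and macOS).
impl Domain for IPv4 { const RAW: i32 = 2; }        // AF_INET
impl SockType for Datagram { const RAW: i32 = 2; }  // SOCK_DGRAM
impl Protocol for IcmpV4 { const RAW: i32 = 1; }    // IPPROTO_ICMP
impl Protocol for Udp { const RAW: i32 = 17; }      // IPPROTO_UDP

pub struct Socket<D, T, P> {
    _marker: PhantomData<(D, T, P)>,
}

impl<D: Domain, T: SockType, P: Protocol> Socket<D, T, P> {
    /// A real version would call socket(2) here; this sketch only records the types.
    pub fn new() -> Self {
        Socket { _marker: PhantomData }
    }

    /// The (domain, type, protocol) triple that would be passed to socket(2).
    pub fn triple(&self) -> (i32, i32, i32) {
        (D::RAW, T::RAW, P::RAW)
    }
}

// UDP-only operations live in an impl block that an ICMP socket cannot reach.
impl Socket<IPv4, Datagram, Udp> {
    pub fn set_broadcast(&self, _on: bool) -> io::Result<()> {
        Ok(()) // placeholder; a real version would call setsockopt(2)
    }
}
```

With this shape, calling `set_broadcast` on a `Socket<IPv4, Datagram, IcmpV4>` is a compile error rather than a runtime one. The cost is the one raised in the replies: every distinct triple becomes a distinct type.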
One could, but one probably doesn't want to have separate types for TCP-over-IPv4 vs TCP-over-IPv6 for example, even if they accept/produce different forms of addresses. That'd force a lot of code bloat with monomorphization.
So now one is making one's own enumeration which is different than the OS one and mapping between them, which can get into a mess of all the various protocols Linux and other OSs support, and I'm not sure it's solving a major problem. Opinions vary, but I prefer to use complex types sparingly.
I think there are likely a bunch of other cases where it's useful to choose these values more dynamically too. Networking gets weird!
Rust has an RFC about some of the conventions for the Fd conversions. https://rust-lang.github.io/rfcs/3128-io-safety.html
It's unfortunate that they did not go as far as marking the OwnedFd conversions as unsafe. The RFC focuses on a single class of unsafety in fds and doesn't recognize that there are other issues with arbitrary fd conversions.
> dubiously-safe
No, it's perfectly safe. Except if you expand the scope of "safe" by a lot.
OP turned the socket into an (almost) raw file descriptor, and created a UDP socket from it. Weird, yes, but since it's perfectly memory safe and invalid operations would correctly error, it's not "dubiously-safe". It's safe.
I mean, either your language has the ability to do raw (technically Owned in this case) file descriptors, or it doesn't.
Maybe you'd prefer Rust had a third mode? Safe, `unsafe {}`, and `are_you_sure_you_understand_this {}`, the last one also being 'safe', but just… odd.
Thank you.
I assumed this was what was happening, but conflating network layer protocols with transport layer ones isn't great.
I'm surprised that pedantic zealots, like me in my youth, haven't risen up and flooded rust with issues over it way before this though.
Why are you saying rust needs to be flooded with issues? Rust isn't conflating transport layers and protocols. OP is.
UdpSocket is just able to take ANY file descriptor and try to use it as if it's a UDP socket.
E.g. this compiles, and it's not a bug. It doesn't make any sense, but it's not a bug:
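fn main() {
    // Any file works here; /dev/null is not a socket of any kind.
    let f = std::fs::File::open("/dev/null").unwrap();
    // File -> OwnedFd drops the "this is a file" type information.
    let f: std::os::fd::OwnedFd = f.into();
    // OwnedFd -> UdpSocket is accepted without checking what the fd refers to.
    let socket: std::net::UdpSocket = f.into();
}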
OP is clearly confused, since there's no need to do this at all. socket2::Socket already has a `send_to()`: https://docs.rs/socket2/latest/socket2/struct.Socket.html#me...
I think OP either banged on this until it compiled, maybe blindly copying from other examples, or it's vibe coded, and shows why AI needs supervision from someone who can actually understand what the code does.
Could you please explain the difference to me? As UDP is the "User Datagram Protocol", when I read about datagrams I always think about UDP, and I thought it was just a different way of saying the same thing. Maybe "datagram" is supposed to be the packet itself, but you're still sending it via UDP, right?
There are actually a lot of combinations of (domain, type, protocol) available. It is not always the case that the protocol implies the type.
In IP land (domains AF_INET and AF_INET6), we have the well known UDP and TCP protocols, of course. UDP is always datagram (SOCK_DGRAM) and TCP is always stream (SOCK_STREAM). Besides datagram-only ICMP, there's also SCTP, which lets you choose between stream and sequential-packet (SOCK_SEQPACKET) types. A sequential-packet socket provides in-order delivery of packet-sized messages, so it sits somewhere between datagram and stream in terms of features.
In AF_UNIX land, there are no protocols (the protocol field is always 0), but all 3 of the aforementioned types are available. You just have to pick the same type on both sides.
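As a small illustration that "datagram" does not imply UDP, here is a sketch that sends a message over an AF_UNIX datagram socket pair using only std (Unix-like systems only). No IP and no UDP are involved. The function name `unix_datagram_roundtrip` is made up for this example.

```rust
use std::io;
use std::os::unix::net::UnixDatagram;

/// Sends one message over an AF_UNIX SOCK_DGRAM pair and returns what arrived.
pub fn unix_datagram_roundtrip(msg: &[u8]) -> io::Result<Vec<u8>> {
    // Equivalent to socketpair(AF_UNIX, SOCK_DGRAM, 0): datagram semantics
    // with the protocol field left at 0.
    let (a, b) = UnixDatagram::pair()?;
    a.send(msg)?;

    // Each recv returns exactly one datagram, up to the buffer size.
    let mut buf = [0u8; 1024];
    let n = b.recv(&mut buf)?;
    Ok(buf[..n].to_vec())
}
```

Calling `unix_datagram_roundtrip(b"ping")` should hand back the same four bytes as a single message.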
Footnotes: SCTP is not widely usable because Windows doesn't natively support it and many routers will drop or reject it instead of forwarding it. Also, AF_UNIX is now supported on Windows, but only with SOCK_STREAM type.
UDP and TCP are transport-layer (Layer 4) protocols carried directly inside IP packets, and ICMP is carried the same way. They all fill the same bits within network packets, at the same level. So sending an ICMP packet (protocol 1) is not the same as sending a UDP packet (protocol 17).
You can see a list of network protocols in /etc/protocols actually, or here: https://www.iana.org/assignments/protocol-numbers/protocol-n...
As the other commenter pointed out, UDP is a transport protocol, not a packet-level protocol.
Think of it like this:
Ethernet sends data frames to a MAC address. It has a sender and a receiver address. See here for structure: https://en.wikipedia.org/wiki/Ethernet_frame
Internet Protocols (v6 and v4) send packets via Ethernet (or WiFi or Bluetooth or anything else) from an IP address to an IP address. For structure see https://en.wikipedia.org/wiki/IPv6_packet or if for some reason you still need the legacy version see https://en.wikipedia.org/wiki/IPv4#Packet_structure (aside but notice how much complexity was removed from the legacy version). Notably, IP does not have any mechanism for reliability. It is essentially writing your address and a destination address on a brick, tossing it over your fence to the neighbor’s yard, and asking them to pass it along. If your neighbor isn’t home, your brick is not moving along.
TCP and UDP send streams and datagrams respectively and use the concept of application ports. A TCP stream is what it sounds like: a continuous stream of bytes with no length or predefined stopping point. TCP takes your stream and chunks it into IP packets, the size of which is limited by the smallest frame size of the Ethernet (or whatever data link protocol) links involved. Typically this is 1500 bytes, but don’t forget to account for header sizes, so the useful payload size is smaller. TCP is complex because it guarantees that your stream will eventually be delivered in the exact order in which it was sent. Eventually here could mean at t = infinity. UDP simply has source and destination port numbers in its header (which follows the IP header in the data frame), and guarantees nothing: not ordering, not delivery, etc. If an IP packet is a brick with two house addresses, a UDP datagram is a brick with two house addresses and an ATTN: application X added. An address represents a computer (this is very approximate in a world where any machine can have N addresses and run M VMs or containers which themselves can have O addresses), and a port represents a specific process on that computer.
ICMP does not use ports. ICMP is meant for essentially network and host telemetry, so you are still sending and receiving only at an IP address level. But it has a number of message types. You can see them here: https://en.wikipedia.org/wiki/ICMPv6. Note that ping and pong (echo request and echo reply) are just two of the types. Others are responsible for communicating what raw IP cannot. For example, the Packet Too Big type tells you that an IP packet you tried to send hit a part of its path where the maximum packet size did not allow it to fit, and it’s used for path MTU discovery (you keep sending different size packets to find the largest that will go through).
There are other protocols that run directly on top of IP (6in4 for example, or SCTP). Most are way less popular than the three mentioned above. Some use datagrams (discrete “bricks” of data), some use streams (endless “tapes” of data), which is the difference in protocol family: datagrams vs stream. You can also go a level deeper and just craft raw IP packets directly but for that you typically must be the root user since you can for example send a packet with the source port set to 22 even though you are not the ssh daemon.
Since ICMP has no concept of a port, when you send a ping to a remote host and it returns a ping to you, how does your kernel know to hand the response to your process and not some other one? In the ICMP header there is an ICMP identifier (often the process PID) and when the reply comes back it has the same identifier (but with source and destination IPs swapped and type updated to echo reply). This is what the kernel uses to find the process to which it will deliver the ICMP packet.
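To show where the identifier sits, here is a sketch that builds an ICMPv4 echo request by hand, including the RFC 1071 Internet checksum. The function names are made up for this example. As noted elsewhere in the thread, Linux overwrites the identifier and checksum on unprivileged ICMP datagram sockets, so the values set here matter mainly on macOS or with raw sockets.

```rust
/// RFC 1071 Internet checksum: one's-complement sum of 16-bit big-endian words.
pub fn internet_checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in data.chunks(2) {
        let word = if chunk.len() == 2 {
            u16::from_be_bytes([chunk[0], chunk[1]])
        } else {
            u16::from_be_bytes([chunk[0], 0]) // pad an odd trailing byte with zero
        };
        sum += u32::from(word);
    }
    // Fold the carries back into the low 16 bits.
    while sum >> 16 != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Builds an ICMPv4 echo request: type, code, checksum, identifier, sequence, payload.
pub fn echo_request(identifier: u16, sequence: u16, payload: &[u8]) -> Vec<u8> {
    let mut pkt = Vec::with_capacity(8 + payload.len());
    pkt.push(8); // type 8 = echo request
    pkt.push(0); // code 0
    pkt.extend_from_slice(&[0, 0]); // checksum placeholder, filled in below
    pkt.extend_from_slice(&identifier.to_be_bytes());
    pkt.extend_from_slice(&sequence.to_be_bytes());
    pkt.extend_from_slice(payload);

    let sum = internet_checksum(&pkt);
    pkt[2..4].copy_from_slice(&sum.to_be_bytes());
    pkt
}
```

A receiver can validate such a packet by running `internet_checksum` over the whole thing: a correct packet sums to zero.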
I hope this clears up some of this.
Since basically all the comments are about how both the author and many commenters are confused about what UDP and DGRAM sockets are, I have corrected the author's code to no longer miscommunicate what protocol is being used.
https://github.com/ThomasHabets/rust-ping-example-corrected
There is no UDP used anywhere in this example. ICMP is not UDP.
I'm not saying my fix is pretty (e.g. uses unwrap(), and ugly destination address parsing), but that's not the point.
This is interesting, but falls just short of explaining what's going on. Why does UDP work for ICMP? What does the final packet look like, and how is ICMP different from UDP? None of that is explained, it's just "do you want ICMP? Just use UDP" and that's it.
It would have been OK if it were posted as a short reference to something common people might wonder about, but I don't know how often people try to reimplement rootless ping.
The BSD socket API takes 3 parameters when creating a socket with socket(): the family (e.g. inet), the kind (datagram in this case), and the protocol (often 0, but IPPROTO_ICMP in this case).
Because a protocol of 0 on an inet datagram socket means UDP, Rust has called its API for creating any(?) datagram sockets UdpSocket, which partly results in this confusion.
The kernel patch introducing the API also explains it was partly based on the UDP code, due to obviously sharing a lot of properties with it. https://lwn.net/Articles/420800/
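For reference, the three-parameter call can be made from Rust without any crate by declaring socket(2) yourself. This is only a sketch: the constant values are assumptions that hold on Linux (most architectures) and macOS, and a real program would take them from the `libc` crate. The `unsafe extern` syntax needs Rust 1.82 or newer. Whether the call succeeds depends on the host; on Linux it fails unless one of your groups is inside `net.ipv4.ping_group_range`.

```rust
use std::ffi::c_int;
use std::io;
use std::os::fd::{FromRawFd, OwnedFd};

// Assumed values for Linux (most architectures) and macOS.
const AF_INET: c_int = 2;
const SOCK_DGRAM: c_int = 2;
const IPPROTO_ICMP: c_int = 1;

unsafe extern "C" {
    // socket(2) from the platform C library, which std already links against.
    fn socket(domain: c_int, ty: c_int, protocol: c_int) -> c_int;
}

/// Opens an unprivileged ICMP datagram socket: no UDP involved anywhere.
pub fn open_icmp_dgram_socket() -> io::Result<OwnedFd> {
    // SAFETY: socket(2) takes three plain integers and returns an fd or -1.
    let fd = unsafe { socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP) };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }
    // SAFETY: fd was just returned by socket(2) and is owned by nobody else.
    Ok(unsafe { OwnedFd::from_raw_fd(fd) })
}
```

The returned `OwnedFd` closes the socket on drop. From here you would still need sendto/recvfrom, which is presumably why the article reached for an existing socket type instead.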
The std API can only create UdpSockets. The trick here is that you use socket2, which allows more kinds of sockets, and then you tell UdpSocket that some raw file descriptor is a UDP socket through an unsafe API with no checks. I guess it works because they use the same API on POSIX.
Edit: It is possible in safe rust as well, see child comment.
The macro used by socket2: https://docs.rs/socket2/0.6.1/src/socket2/lib.rs.html#108
The FromRawFd trait: https://doc.rust-lang.org/stable/std/os/fd/trait.FromRawFd.h...
So UdpSocket should really be called DatagramSocket, UDP being the protocol that operates on these datagrams?
Surprising that they got such a fundamental thing wrong.
ICMP is just a different protocol from UDP. There's a "Protocol" field in the IP packet: 0x01 = ICMP, 0x06 = TCP, 0x11 = UDP.
I think that this article gets the terminology wrong. It's not a UDP socket that gets created here, but a datagram socket. Seems to be bad API naming in the Rust library.
To give a more nuanced reply versus the "you're wrong" ones already here, the difference is that UDP adds send and receive ports, enabling most modern users (& uses) of UDP. Hence, it is the "User" datagram protocol.
(it also adds a checksum, which used to be more important than it is nowadays, but still well worth it imho.)
In related news, all rectangles are squares and all animals are dogs.
So in fairness, this doesn't actually use UDP at all (SOCK_DGRAM does not mean UDP!).
The actual protocol in use, and what's supported, is determined by the combination of the address family (IPV4), the socket type (DGRAM), and the protocol (ICMP). The match structure for IPV4 is here in Linux at least: https://elixir.bootlin.com/linux/v6.18/source/net/ipv4/af_in...
So ultimately, it's not even UDP, it's just creating a standard ICMP socket.
The semantic wrappers around file descriptors (File, UdpSocket, PidFd, PipeReader, etc.) are advisory and generally interconvertible. Since there's no dedicated IcmpSocket they're using UdpSocket which happens to provide the right functions to invoke the syscalls they need.
The rust API in use lets you feed an fd into a UdpSocket which calls the necessary send/recv/etc on it.
The socket itself is an ICMP socket, but the ICMP shaped API just happened to fit into the UDP shaped hole. I'm sure some advanced UDP socket options will break or have weird side effects if your code tries to apply them.
I was interested in a related topic a while back.
Historically, to receive ICMP packets, I think you had to open a RAW socket and snoop everything. Obviously, this required root or similar.
IPPROTO_ICMP allows you to send ICMP packets and receive responses from the same address, without root. But you can't use it for traceroute because it only accepts ICMP responses from the ultimate destination you sent to; not some TTL failure intermediary.
Finally, IP_RECVERR (Linux 2.2) on UDP sockets allows you to receive associated ICMP errors from any hop for a send. (This is useful for traceroute, but not ICMP ping.)
I think there are also some caveats on how you can monitor for these types of events in Rust in particular? IIRC, the mainstream async stuff only watches for read/write events, and these aren't those.
Worth noting you don't actually need to be fully root in Linux to do standard pings with your code, there's a couple of different options available at the OS level without needing to modify code.
1. You can just add the capability CAP_NET_RAW to your process, at which point it can ping freely
2. There's a sysctl that allows for unprivileged ping "net.ipv4.ping_group_range" which can be used at the host level to allow different groups to use ICMP ping.
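If you want to check the second option from code, the sysctl is exposed as a file under /proc on Linux. Here is a sketch that reads and parses it; the function names are made up. The value is two GIDs forming an inclusive range, and the kernel default of "1 0" is an empty range, meaning nobody may create ping sockets. Note that the kernel checks all of a process's groups, while this sketch tests a single GID.

```rust
use std::fs;

/// Parses the two whitespace-separated GIDs in the sysctl value.
pub fn parse_ping_group_range(s: &str) -> Option<(u32, u32)> {
    let mut parts = s.split_whitespace();
    let low = parts.next()?.parse().ok()?;
    let high = parts.next()?.parse().ok()?;
    Some((low, high))
}

/// True if `gid` falls inside the inclusive range (low, high).
pub fn gid_allowed(range: (u32, u32), gid: u32) -> bool {
    range.0 <= gid && gid <= range.1
}

/// Reads the live value; returns None on non-Linux systems or on a parse failure.
pub fn current_ping_group_range() -> Option<(u32, u32)> {
    let text = fs::read_to_string("/proc/sys/net/ipv4/ping_group_range").ok()?;
    parse_ping_group_range(&text)
}
```

For example, `gid_allowed((1, 0), 1000)` is false because the default range is empty, while a host configured with `0 2147483647` allows every ordinary group.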
CAP_NET_RAW lets you open raw sockets, and has some dangers (e.g. packet forgery). It's included in pretty much every container in existence (if you're running as root in the container or have ambient capabilities set up).
The goal of the capabilities system was to allow processes and users to gain a small portion of root privileges without giving them all.
In the "old days" ping on a Linux host would be setuid root, so it essentially had all of root's rights. In more modern setups it either has CAP_NET_RAW or the ping_group sysctl is used to allow non-root users to use it.
CAP_NET_RAW also allows capturing packets (tcpdump), so you really can have some fun like running a TCP stack in user space or MITM http connections: https://blog.champtar.fr/IPv6_RA_MITM/ / https://blog.champtar.fr/Metadata_MITM_root_EKS_GKE/
> There's a sysctl that allows for unprivileged ping "net.ipv4.ping_group_range"
What are the risks of enabling this for all groups (i.e. sysctl net.ipv4.ping_group_range='0 4294967294')?
Note this allows unprivileged ICMP sockets, not unprivileged RAW sockets.
The Linux vs macOS behavioral differences in ICMP sockets documented by the article are critical:
- Linux overwrites identifier and checksum fields
- macOS requires correct checksum calculation
- macOS includes IP header in response, Linux doesn't
I think this is the kind of subtle difference that would trip up even experienced programmers
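One way to cope with the third difference is to normalize the receive buffer before parsing. The sketch below strips a leading IPv4 header if one appears to be present. It is a heuristic under stated assumptions, not a general parser: it treats a first byte with high nibble 4 as an IPv4 header, which is fine for echo replies (ICMP type 0) but would misfire on ICMP types 64 to 79. The function name is made up.

```rust
/// Returns the ICMP portion of a received buffer, skipping an IPv4 header if present.
pub fn icmp_message(buf: &[u8]) -> Option<&[u8]> {
    let first = *buf.first()?;
    if first >> 4 == 4 {
        // Looks like IPv4: the low nibble is the header length in 32-bit words.
        let ihl = usize::from(first & 0x0f) * 4;
        if ihl < 20 || buf.len() < ihl {
            return None; // malformed or truncated header
        }
        Some(&buf[ihl..])
    } else {
        // No IP header (the Linux behavior described above).
        Some(buf)
    }
}
```

The same parsing code can then run on both platforms, with the returned slice starting at the ICMP type byte.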
Great article, it led me to the `icmplib`[0] Python project, which has a `privileged` option:
When this option is enabled, this library fully manages the exchanges and the structure of ICMP packets. Disable this option if you want to use this function without root privileges and let the kernel handle ICMP headers.
[0] https://github.com/ValentinBELYN/icmplib
The unprivileged DGRAM approach is a lifesaver for container environments. Ran into this building a health check service - spent ages wondering why my ping code needed --privileged when the system ping worked fine as a normal user. Turns out the default ping binary has setuid, which isn't an option in a minimal container image.
The cross-platform checksum difference is a pain though. Linux handling it for you is convenient until you test on macOS and everything breaks silently.
It doesn't.
For users whose group falls in the GID range in sysctl `net.ipv4.ping_group_range`, the normal ping command uses this non-root way.
Sure, maybe your system still sets suid root on your ping binary, or shows it adding `cap_net_raw` according to `getcap`, but mine does not.
I struggled in vain to see what this has to do with Rust. The answer is nothing, other than that the 4 lines of sample code shown are in Rust. The actually useful nugget of knowledge contained therein (one can create ICMP packets without being root on macOS or Linux) is language agnostic.
So... why? Should I now add "in C" or "in assembly" to the end of all my article titles?
It's a lot more than 4 lines of sample code; in fact, on my screen it looks like it's more code than text. This is closer to a Rust tutorial than a low-level networking explainer, so yeah, it makes sense to say "in Rust". If I wanted to do this in C, this would not be the best resource.
Agreed. I don't dislike Rust as a language, but it annoys me how its practitioners add the "[written] in Rust" tagline to every single thing they do that's otherwise unrelated to Rust. Especially when their code or dependencies are full of unverified unsafe blocks, which defeats the selling point.
> It turns out you can create a UDP socket with a protocol flag, which allows you to send the ping rootless
This is wrong, despite the Rust library in question's naming convention. You're not creating a UDP socket. You're creating an IP (AF_INET), datagram socket (SOCK_DGRAM), using protocol ICMP (IPPROTO_ICMP). The issue is that the rust library apparently conflates datagram and UDP, when they're not the same thing.
You can do the same in C, by calling socket(2) with the above arguments. It hinges on Linux allowing rootless pings from the GIDs in
EDIT: s/ICMP4/ICMP/g
EDIT2: more spelling mistakes