Comment by chuckadams 2 days ago

3 replies

Both cgroups and namespaces are hierarchical, so you certainly can subdivide the sandbox. That is, if you're a decent C programmer and can navigate some dense kernel documentation. You can also run Docker in Docker, but it requires a privileged root container, and even the creator of that feature suggests just bind-mounting the docker socket instead.
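To make the "hierarchical" part concrete, here is a minimal sketch that parses the `hierarchy-ID:controller-list:cgroup-path` lines found in `/proc/<pid>/cgroup` and lists the ancestor cgroups of a path. The function names and the sample line are made up for illustration; only the file format comes from the kernel docs.

```python
def parse_cgroup_file(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Each line has the form 'hierarchy-ID:controller-list:cgroup-path'.
    On cgroup v2 the entry is '0::<path>' with an empty controller list.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            (int(hierarchy_id), controllers.split(",") if controllers else [], path)
        )
    return entries


def ancestors(cgroup_path):
    """Return every cgroup from the root down to cgroup_path, inclusive."""
    parts = [p for p in cgroup_path.split("/") if p]
    return ["/"] + ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    # Hypothetical sample; on a real Linux box, read /proc/self/cgroup instead.
    sample = "0::/user.slice/user-1000.slice/session-1.scope\n"
    for _, _, path in parse_cgroup_file(sample):
        for level in ancestors(path):
            print(level)
```

Each level printed is a cgroup whose limits also apply to everything nested below it, which is what makes subdividing possible in principle.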

I have a nagging feeling Plan9 probably had a solution for all this 30 years ago.

staticassertion 2 days ago

> Both cgroups and namespaces are hierarchical, so you certainly can subdivide the sandbox.

This is true, you can enter a namespace while in another namespace, but creating or entering one is a privileged operation.
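A rough way to see this for yourself, assuming Linux with glibc: call `unshare(CLONE_NEWUTS)` through `ctypes` from a forked child and report what the kernel says. The helper name is hypothetical, and the `CLONE_NEWUTS` value is copied from `<sched.h>`. Run as a normal user this typically comes back `EPERM`; run as root it typically succeeds.

```python
import ctypes
import errno
import os

CLONE_NEWUTS = 0x04000000  # value from <sched.h>


def try_unshare_uts():
    """Try to create a new UTS namespace in a throwaway child process.

    Returns 'ok', an errno name such as 'EPERM', 'unsupported' if libc
    has no unshare(), or 'killed' if the child died before reporting.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if getattr(libc, "unshare", None) is None:
        return "unsupported"

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: attempt the unshare and send the result code to the parent.
        os.close(read_fd)
        rc = libc.unshare(CLONE_NEWUTS)
        code = 0 if rc == 0 else ctypes.get_errno()
        os.write(write_fd, str(code).encode())
        os._exit(0)

    # Parent: collect the child's answer and reap it.
    os.close(write_fd)
    data = os.read(read_fd, 16)
    os.close(read_fd)
    os.waitpid(pid, 0)
    if not data:
        return "killed"
    code = int(data)
    return "ok" if code == 0 else errno.errorcode.get(code, str(code))


if __name__ == "__main__":
    print(try_unshare_uts())
```

The fork is only there so the calling process never actually changes its own namespace.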

Docker in Docker does use socket bind mounting already afaik, and it's a trivial privesc: the docker daemon runs as root, so anyone who can talk to the socket can run `docker run --privileged --user root -it image_name bash` and get a shell as the host root user.
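If you want to check whether a given process is in that position, a minimal sketch is to test whether it can read and write the socket. The default path below is the conventional `/var/run/docker.sock`, which is an assumption about the install, and the function name is made up.

```python
import os
import stat


def docker_socket_writable(path="/var/run/docker.sock"):
    """Return True if `path` is a Unix socket this process can read and write.

    Being able to write to the Docker socket is enough to ask the daemon
    (which runs as root) to start arbitrary containers.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file, or a parent directory we cannot traverse.
        return False
    return stat.S_ISSOCK(mode) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if docker_socket_writable():
        print("this process can drive the Docker daemon")
    else:
        print("no usable Docker socket at the default path")
```

Inside a container that has the host socket bind-mounted, this would return True, which is exactly the problem being described.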

The solution is to allow unprivileged users to drop privileges, which is how macOS and Windows work. On Windows you have integrity levels, tokens, etc, all of which you can drop without privileges. On macOS you have Seatbelt.

Linux almost had this with unprivileged user namespaces but that's not viable because 30 years of "root -> kernel privesc isn't a security issue" attitude proved to be problematic.
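Whether unprivileged user namespaces are even switched on varies by distro. As a sketch, these are two sysctls you can read to find out: `user.max_user_namespaces` is in mainline kernels, while `kernel.unprivileged_userns_clone` is a Debian/Ubuntu patch and is often absent elsewhere. The function names and the `proc_root` parameter are mine, and the "probably allowed" logic is a heuristic that ignores LSM or seccomp policy.

```python
from pathlib import Path

# sysctl name -> path relative to /proc
KNOBS = {
    "max_user_namespaces": "sys/user/max_user_namespaces",
    "unprivileged_userns_clone": "sys/kernel/unprivileged_userns_clone",
}


def read_userns_knobs(proc_root="/proc"):
    """Read the user-namespace sysctls; a missing or unreadable knob maps to None."""
    result = {}
    for name, rel_path in KNOBS.items():
        try:
            result[name] = int((Path(proc_root) / rel_path).read_text().strip())
        except (OSError, ValueError):
            result[name] = None
    return result


def userns_probably_allowed(knobs):
    """Heuristic: blocked if either knob is explicitly set to 0."""
    if knobs["max_user_namespaces"] == 0:
        return False
    if knobs["unprivileged_userns_clone"] == 0:
        return False
    return True


if __name__ == "__main__":
    knobs = read_userns_knobs()
    print(knobs, userns_probably_allowed(knobs))
```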

  • chuckadams 2 days ago

    Docker-in-Docker is a different thing than bind mounting the socket. The former runs a new docker daemon in a container, the latter just talks to the host's socket. Anyway, https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-d... tells it straight from the horse's mouth. It appears you may not even need privileged containers to pull it off nowadays, but the author still lists several more footguns.

    Landlock is an all right start at unprivileged restrictions, but yeah, doesn't seem anywhere near as nice as pledge() and unveil().

    • staticassertion 3 hours ago

      Thanks, I'd misremembered that it just required --privileged. I suspect that will continue to be a requirement since unprivileged user namespaces are not viable.