perlgeek a day ago

The real lesson they should learn is not to rely on running containers and then using "docker commit" to turn them into an image, but instead to use proper image building tools.

If you absolutely have to do it that way, be very deliberate about what you actually need. Don't run an SSH daemon, don't run cron, don't run an SMTP daemon, don't run the suite of daemons that run on a typical Linux server. Only run precisely what you need to create the files that you need for a "docker commit".

Each service that you run can potentially generate log files, lock files, temp files, named pipes, unix sockets and other things you don't want in your image.

Taking a snapshot of a working, regular VM and using that as a docker image is one of the worst ways to build one.
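
As a sketch of what "proper image building" looks like here (image name and app path are just examples): everything the image needs is declared in a Dockerfile and built with a plain "docker build", so nothing from a live system can leak in.

  FROM debian:bookworm-slim
  # install only what the application needs, and clean up in the same layer
  RUN apt-get update \
      && apt-get install -y --no-install-recommends python3 \
      && rm -rf /var/lib/apt/lists/*
  COPY app/ /app/
  CMD ["python3", "/app/main.py"]

  docker build -t myapp:1.0 .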

larusso a day ago

My first reaction: 800GB, who committed that?! That size alone screams something is wrong. To be fair, even with basic Dockerfiles it's easy to build up a lot of junk. But there should be a general size limit in any workflow that alerts when something grows out of proportion. We had this in our shop just a few weeks ago: a docker image for some AI training grew too big and nobody got alerted about the image's final size. It got committed and pushed to JFrog, and from there the image synced to a lot of machines. JFrog informed us that something was off with the amount of data we were shuffling around. So on the one hand this should not happen, but it seems it can easily end up in production without warning.
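
Even a dumb post-build size check would have caught it; a rough sketch, assuming a bash CI step, a placeholder image name, and a made-up 5 GB budget:

  # fail the pipeline if the freshly built image blows past a size budget
  LIMIT=$((5 * 1024 * 1024 * 1024))
  SIZE=$(docker image inspect myimage:latest --format '{{.Size}}')
  if [ "$SIZE" -gt "$LIMIT" ]; then
    echo "image is $SIZE bytes, over the $LIMIT byte budget" >&2
    exit 1
  fi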

  • SOLAR_FIELDS a day ago

    Given that JFrog bills on egress for these container images, I'm sure you guys saw an eye-watering bill for the privilege of distributing your bloated container.

    • larusso 20 hours ago

      Yes. But to be fair, we got a warning the very next day.

gobip a day ago

What if I need cron in my docker container? And ssh? And a text editor? And a monitoring agent? :P

Thankfully LXD is here to serve this need: very lightweight system containers, where your app runs in a complete ecosystem but stays very light on RAM usage.
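
For anyone curious, the whole "tiny VM" experience is roughly two commands (container name is made up):

  lxc launch ubuntu:22.04 devbox   # full init system inside, plus whatever you install
  lxc exec devbox -- bash          # shell into it like a lightweight VM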

  • curt15 a day ago

    >What if I need cron in my docker container? And ssh? And a text editor? And a monitoring agent? :P

    How are you going to orchestrate all those daemons without systemd? :P

    As you mentioned, a container running systemd and a suite of background services is the typical use case of LXD, not docker. But the difference seems to be cultural -- there's nothing preventing one from using systemd as the entry point of a docker container.
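
    Rough sketch of that, for the record (untested here; it then gets booted with something along the lines of "docker run -d --tmpfs /run --tmpfs /tmp -v /sys/fs/cgroup:/sys/fs/cgroup:ro <image>", with the exact flags depending on the host's cgroup setup):

      FROM ubuntu:22.04
      RUN apt-get update && apt-get install -y systemd systemd-sysv \
          && rm -rf /var/lib/apt/lists/*
      STOPSIGNAL SIGRTMIN+3
      CMD ["/sbin/init"]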

    • sally_glance a day ago

      fwiw I recently bootstrapped a small Debian image for myself, originally intended to sandbox coding agents I was evaluating. Shortly after, I got annoyed by baseline vim and added my tmux & nvim dotfiles; now I find myself working inside the container regularly. It definitely works and is actually not the worst experience if your workflow is CLI-focused.
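
      The Dockerfile for that kind of image stays short; roughly this, with the package list and dotfiles path just examples:

        FROM debian:bookworm-slim
        RUN apt-get update \
            && apt-get install -y --no-install-recommends git tmux neovim ca-certificates \
            && rm -rf /var/lib/apt/lists/*
        COPY dotfiles/ /root/        # tmux & nvim config baked in
        WORKDIR /work
        CMD ["bash"]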

      • BobbyTables2 9 hours ago

        Even putting GUI apps in a container isn't too bad once one develops the right incantation for X11/Wayland forwarding.
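
        For X11 the incantation is usually along these lines (image name is a placeholder; Wayland wants the wayland socket and XDG_RUNTIME_DIR instead):

          # may also need "xhost +local:" on the host to allow the connection
          docker run --rm \
            -e DISPLAY="$DISPLAY" \
            -v /tmp/.X11-unix:/tmp/.X11-unix:ro \
            some-gui-image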

      • SOLAR_FIELDS a day ago

        My experience is that if the tooling is set up right it's not painful; it's the fiddling around with volume mounts, folder permissions, debug points, and "what's inside the container and what isn't" that is always the big pain point.

        • sally_glance 20 hours ago

          Very accurate - that was one of the steps that caused me to fiddle quite a bit. I had to add an entrypoint to chown the mounts and also some BuildKit cache volumes for all the package managers.

          You can skip the uid/chown stuff if you work with userns mappings, but this was my work machine so I didn't want to globally touch the docker daemon.
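
          The cache part is just BuildKit cache mounts in the Dockerfile, roughly like this for apt (same idea for pip/npm caches):

            # syntax=docker/dockerfile:1
            RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
                --mount=type=cache,target=/var/lib/apt,sharing=locked \
                apt-get update && apt-get install -y build-essential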

  • ndsipa_pomu a day ago

    Ideally, you have a separate docker container for each process (i.e. a separate container for the SSH service, one for cron, etc.). The text editor can be installed if it's needed - that's not an issue apart from slightly increasing the container size. Most of the time, the monitoring agent would be running on the host machine and set up to monitor aspects of the container - containers should be thought of as running a single process, not as a VM along with all its services.
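
    So rather than one fat container, it ends up looking something like this (names are illustrative):

      # one process per container: the app and its scheduled jobs run separately
      docker run -d --name myapp myapp:latest
      docker run -d --name myapp-cron myapp:latest cron -f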

ndsipa_pomu a day ago

Initially I didn't understand how they were getting the log files into the image. I had no idea that people abuse "docker commit" - do they know nothing about containers? If you want persistent logs, have a separate volume for them so they can't pollute any image (plus they remain readable when the container restarts etc).
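
i.e. something along the lines of (volume and path names made up):

  docker volume create myapp-logs
  docker run -d -v myapp-logs:/var/log/myapp myapp:latest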

When I saw the HN title, I thought this was going to be something subtle like deleting package files (e.g. apt) in a separate layer, so you end up with a layer containing the files and then a subsequent layer that hides them.
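
For anyone who hasn't run into that one, the difference looks roughly like this; the first variant still ships the files in the earlier layer, the second never records them at all:

  # subtle: the rm runs in a later layer, so the files are hidden, not gone
  RUN apt-get update && apt-get install -y build-essential
  RUN rm -rf /var/lib/apt/lists/*

  # fix: create and delete within the same layer
  RUN apt-get update && apt-get install -y build-essential \
      && rm -rf /var/lib/apt/lists/*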