Comment by photonthug
Comment by photonthug 4 days ago
I do get the theoretical annoyance of how it’s technically not reproducible, but in practice most containers are pulled and not built from scratch. If you’re really concerned about that apt-get then besides a container registry you’re going to host a private package repository too, or install a versioned tarball from a public URL, but check the hash of whatever you’re downloading and put that hash in the dockerfile.
So in practice.. if the build described in the dockerfile breaks, you notice when you’re changing / extending the dockerfile.. which is the time and place where you’d expect to need to know. My guess is that most people complaining about deterministic builds for containers are not using registries for storing images, and are not deploying to platforms like k8s. If your process is, say, shipping dockerfiles to EC2 and building them in situ with “compose up” or something, then of course it won’t be very deterministic and you’re at the mercy of many more network failures, etc
> If you’re really concerned about that apt-get then besides a container registry you’re going to host a private package repository too, or install a versioned tarball from a public URL, but check the hash of whatever you’re downloading and put that hash in the dockerfile.
Right, but if you're doing that then you probably don't need to bother with the docker part at all.
> you notice when you’re changing / extending the dockerfile.. which is the time and place where you’d expect to need to know
You notice, sure, but you can't see what it was. Like, sure, it's better than it failing when you come to deploy it, but you're still in the position of having to do software archaeology to figure out how it ever worked in the first place.
Like, for me the main use case for reproducible builds is "I need to make a small change to this component that was made by someone who left the company 3 years ago and has been quietly running since then", and you want to be able to just run the build and be confident it's going to work. You don't necessarily need to build something byte-for-byte identical to the thing that's currently running, but you do need to build something equivalent. The reproducibility isn't important per se but the declarativeness is, and with Docker you don't get that.