Comment by lmm
> Which will reproduce the environment exactly, it's just not a guidebook on how it was built.
By that logic every binary artifact is a "reproducible build". The point of reproducibility isn't just being able to reproduce the exact same artifact; it's being able to make changes that have predictable effects.
> The value of most reproducibility at the Dockerfile is that we're actually agnostic to getting a byte-exact reproduction: what we want is the ability to record what was important and effect upgrades.
More or less true. But we don't have that, because of what the grandparent comment said: if a Dockerfile used to work and now doesn't, and there's an apt-get update in it, who knows what package versions it was getting back when it worked, or how to fix the problem?
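One way to make that failure explicit is to pin exact package versions in the Dockerfile. A minimal sketch of the idea (the base image tag and the curl version string are placeholders, not values I've checked against a current mirror):

    FROM debian:12.5

    # Pinning an exact version means a rebuild either gets the same bits
    # or fails loudly, instead of silently picking up whatever is current.
    # The version string below is a placeholder; check your mirror.
    RUN apt-get update && \
        apt-get install -y --no-install-recommends \
            curl=7.88.1-10+deb12u5 && \
        rm -rf /var/lib/apt/lists/*

The tradeoff is that public apt mirrors drop old versions, so a pinned build can start failing too; the difference is that it fails at build time and tells you exactly which version you were on.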
I do get the theoretical annoyance that it's technically not reproducible, but in practice most containers are pulled from a registry, not built from scratch. If you're really concerned about that apt-get, then besides a container registry you're going to host a private package repository too, or install a versioned tarball from a public URL; either way, check the hash of whatever you're downloading and put that hash in the Dockerfile.
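A minimal sketch of that hash-pinning approach (the URL and checksum here are placeholders):

    FROM debian:12.5

    # Fetch a pinned release and verify its checksum before unpacking;
    # if the bytes at the URL ever change, the build fails instead of
    # silently producing a different image.
    ARG TOOL_URL=https://example.com/tool-1.2.3.tar.gz
    ARG TOOL_SHA256=0000000000000000000000000000000000000000000000000000000000000000
    RUN apt-get update && \
        apt-get install -y --no-install-recommends ca-certificates curl && \
        curl -fsSLo /tmp/tool.tar.gz "$TOOL_URL" && \
        echo "${TOOL_SHA256}  /tmp/tool.tar.gz" | sha256sum -c - && \
        tar -xzf /tmp/tool.tar.gz -C /usr/local && \
        rm -rf /tmp/tool.tar.gz /var/lib/apt/lists/*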
So in practice, if the build described in the Dockerfile breaks, you notice when you're changing or extending the Dockerfile, which is exactly the time and place you'd expect to need to know. My guess is that most people complaining about non-deterministic container builds are not using registries to store images and are not deploying to platforms like k8s. If your process is, say, shipping Dockerfiles to EC2 and building them in situ with "compose up" or something, then of course it won't be very deterministic, and you're at the mercy of many more network failures, etc.