Comment by btilly
That does seem extremely relevant. And is a mirror of the fundamental insight behind nonstandard analysis. Which is that that any set containing the integers that follows some set of axioms, has a nonstandard model that follows a nonstandard version of those axioms, and which contains infinite integers.
This can be seen as why it is different for a set of axioms to prove that it proves something, than it is to prove something directly. Because when the axioms prove that they prove, you might be in a nonstandard model where that "proof" is infinitely long, and therefore isn't really a proof!
And that is why, for example, if PA is consistent, then it remains consistent if you extend PA with the axiom, "PA is not consistent". Clearly any model of that extended set of axioms does not describe what we want PA to mean. But that doesn't mean that it logically contradicts itself, either.