Comment by deng
The only thing volatile does is to assure that the value is read from memory each time (which implicitly also forbids optimizations). Whether that memory is in a CPU cache is purely a hardware issue and outside the C specification. If you read something like a hardware register, you yourself need to take care in some way that a hardware cache will not give you old values (by mapping it into a non-cached memory area, or by forcing a cache update). If you for-loop over something that acts as a compiler barrier, all that 'volatile' on the counter variable will do is potentially make the for-loop slower.
There's really just very few reasons to ever use 'volatile'. In fact, the Linux kernel even has its own documentation why you should usually not use it:
https://www.kernel.org/doc/html/latest/process/volatile-cons...
doesnt volatile also ensure the address is not changed for the read by compiler (as it might optimise data layout otherwise)? (so you can be sure when using mmio etc. it wont read from wrong place?)