Comment by adrian_b
It should be noted that using a parallelizable hash, like Blake2/3, does not provide higher speed by magic.
Evaluating anything in parallel is a different trade-off between the time and the power needed to perform a computation: with an N-way parallel evaluation you hope to reduce the time by almost a factor of N, while increasing the power by a similar factor and without increasing the energy required for the computation by much.
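To put rough numbers on that trade-off, here is a minimal sketch of an idealized model. The function name, the `efficiency` parameter and the example figures are all made up for illustration, not measurements of any real hash or CPU.

```python
def parallel_tradeoff(time_s, power_w, lanes, efficiency=0.9):
    """Idealized model of N-way parallel evaluation (all figures hypothetical).

    Time shrinks by lanes * efficiency, power grows by lanes,
    so energy (time * power) grows by a factor of 1 / efficiency.
    """
    new_time = time_s / (lanes * efficiency)
    new_power = power_w * lanes
    new_energy = new_time * new_power
    return new_time, new_power, new_energy


# Serial baseline: 1 s at 2 W, i.e. 2 J.
serial_energy = 1.0 * 2.0
t, p, e = parallel_tradeoff(time_s=1.0, power_w=2.0, lanes=4)
print(f"time {t:.3f} s, power {p:.1f} W, energy {e:.3f} J (serial: {serial_energy:.1f} J)")
```

In this model the energy never drops below the serial value; parallelism only buys time, and it costs a little energy whenever the efficiency is below 1.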
The time to compute a hash is not always the most important factor, especially when the hash computation can be overlapped with other data processing. In mobile and embedded applications, energy can matter more. In that case, using the hardware instructions for SHA-256 or SHA-1 can save energy compared with hashes like Blake2/3.
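As an illustration of overlapping the hash with other work, here is a sketch in Python. It relies on the documented behaviour that `hashlib` releases the GIL while updating on buffers larger than 2047 bytes. The helper name `hash_while_processing` and the `process` callback are hypothetical, and whether the overlap pays off depends on the workload.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_while_processing(chunks, process):
    """Hash each chunk in a worker thread while the caller processes it."""
    hasher = hashlib.sha256()
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for chunk in chunks:
            pending = pool.submit(hasher.update, chunk)  # hashing runs in the worker
            results.append(process(chunk))               # other work runs meanwhile
            pending.result()                             # wait, and surface any error
    return hasher.hexdigest(), results


# Four 64 KiB chunks; "processing" here is just taking the length.
chunks = [bytes([i]) * 65536 for i in range(4)]
digest, sizes = hash_while_processing(chunks, len)
print(digest, sizes)
```

The digest is the same as hashing the concatenated data in one go; only the wall-clock cost of hashing is hidden behind the other processing.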
So the best choice of hash function depends on many factors; it is preferable not to pick the same function automatically regardless of the circumstances.
Nowadays SHA-256 is widely supported in hardware and still secure enough for any application with a 128-bit security target, so it is OK as a default choice, but it may not be the best choice in many cases.
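One way to avoid choosing blindly is to measure on the target machine. The sketch below is a rough throughput comparison using only hashes available in Python's standard `hashlib` (so BLAKE2b stands in for the Blake family; BLAKE3 is not in the standard library). The function name and the buffer size are arbitrary choices, and it measures time only, not energy, which would need a platform-specific power counter.

```python
import hashlib
import time


def throughput_mb_s(name, size=1 << 20, rounds=20):
    """Hash `rounds` buffers of `size` zero bytes and return MB/s."""
    data = bytes(size)
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return size * rounds / elapsed / 1e6


rates = {name: throughput_mb_s(name) for name in ("sha1", "sha256", "blake2b")}
for name, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {rate:8.0f} MB/s")
```

The ranking can differ a lot between a CPU with SHA extensions and one without, which is exactly why a fixed default may not be the best choice everywhere.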