Comment by bee_rider
The MKL BLAS/LAPACK implementation also provides the “cblas” interface, which explicitly accepts an argument for row or column ordering. (I’m sure most BLAS implementations do; I’m just familiar with MKL. BLIS seems quite willing to provide additional interfaces, so I bet it provides it as well.)
Internally the matrix is tiled out anyway (for gemm at least) so column vs row ordering is probably a little less important nowadays (which isn’t to say it never matters).
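To illustrate why tiling makes the input layout matter less, here is a minimal pure-Python sketch, not how MKL or any real library does it. It assumes a simplified scheme where each tile is first copied ("packed") into a small contiguous buffer, so the inner kernel never sees the original strides. The names `pack_tile` and `tiled_gemm` and the `(row_stride, col_stride)` convention are made up for this example.

```python
def pack_tile(buf, rs, cs, i0, j0, mb, nb):
    """Copy an mb x nb tile starting at (i0, j0) into a contiguous row-major list.

    rs and cs are the row and column strides of the source buffer, so the same
    code reads row-major (rs=ncols, cs=1) and column-major (rs=1, cs=nrows) data.
    """
    return [buf[(i0 + i) * rs + (j0 + j) * cs]
            for i in range(mb) for j in range(nb)]


def tiled_gemm(m, n, k, a, a_strides, b, b_strides, tile=2):
    """Return C = A * B as a row-major flat list; A and B may use any strides."""
    c = [0.0] * (m * n)
    for i0 in range(0, m, tile):
        mb = min(tile, m - i0)
        for p0 in range(0, k, tile):
            kb = min(tile, k - p0)
            at = pack_tile(a, *a_strides, i0, p0, mb, kb)
            for j0 in range(0, n, tile):
                nb = min(tile, n - j0)
                bt = pack_tile(b, *b_strides, p0, j0, kb, nb)
                # The inner kernel only touches packed tiles, so the original
                # layout of A and B is invisible past this point.
                for i in range(mb):
                    for j in range(nb):
                        acc = 0.0
                        for p in range(kb):
                            acc += at[i * kb + p] * bt[p * nb + j]
                        c[(i0 + i) * n + (j0 + j)] += acc
    return c


# A = [[1, 2, 3], [4, 5, 6]] stored both ways; B = [[7, 8], [9, 10], [11, 12]] row-major.
a_row, a_col = [1, 2, 3, 4, 5, 6], [1, 4, 2, 5, 3, 6]
b_row = [7, 8, 9, 10, 11, 12]
c_from_row = tiled_gemm(2, 2, 3, a_row, (3, 1), b_row, (2, 1))
c_from_col = tiled_gemm(2, 2, 3, a_col, (1, 2), b_row, (2, 1))
```

Both calls give the same result; the layout only affects how the packing step walks memory, which is where the remaining performance difference would come from.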
Oh yes, from an actual implementation POV you can just apply some transpose and ordering transforms to convert from row-major to column-major or vice versa. cblas is pretty universal, though I don't think any LAPACK C API ever gained as wide support for non-column-major usage (and LAPACK actually has some routines where you can't just pull transpose tricks for the transformation).
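For gemm, the transpose trick can be sketched roughly like this. A row-major m x k buffer holds the same bytes as a column-major k x m buffer containing the transpose, so a row-major C = A·B can be obtained from a column-major-only routine by computing Cᵀ = Bᵀ·Aᵀ, i.e. by swapping the operands and dimensions. The code below is a toy in pure Python; `gemm_colmajor` is a naive stand-in for a Fortran-style routine, not a real BLAS call, and both function names are invented for the example.

```python
def gemm_colmajor(m, n, k, a, lda, b, ldb):
    """Naive column-major GEMM: C (m x n, leading dimension m) = A (m x k) * B (k x n)."""
    c = [0.0] * (m * n)
    for j in range(n):
        for p in range(k):
            bpj = b[p + j * ldb]
            for i in range(m):
                c[i + j * m] += a[i + p * lda] * bpj
    return c


def gemm_rowmajor(m, n, k, a, b):
    """Row-major C (m x n) = A (m x k) * B (k x n), using only the column-major routine."""
    # Read as column-major, the row-major buffer of B is B^T (n x k, ld n) and the
    # row-major buffer of A is A^T (k x m, ld k). Computing C^T = B^T * A^T in
    # column-major yields an n x m buffer that is exactly row-major C.
    return gemm_colmajor(n, m, k, b, n, a, k)


# A = [[1, 2, 3], [4, 5, 6]], B = [[7, 8], [9, 10], [11, 12]], both row-major.
c = gemm_rowmajor(2, 2, 3, [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12])
print(c)  # [58.0, 64.0, 139.0, 154.0], i.e. [[58, 64], [139, 154]]
```

No data is copied or transposed here; only the argument order and dimensions change. That works because the product of transposes has a closed form, which is the property that does not carry over to every LAPACK routine.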
Certain layouts have performance advantages for certain operations on certain microarchitectures due to data access patterns (especially for level 2 BLAS), but that's largely irrelevant to historical discussion of the API's evolution.