Comment by godelski
It is actually really common practice. It is a single linear layer so there's no connection intranodes. The reason to do this is because it is a bit less computationally intensive.
tldr: linear layers have an associative property