smus 2 days ago

Nope, they all depend on x, and the same is true in this scenario.

godelski 2 days ago

It is actually a really common practice. It is a single linear layer, so there are no connections between the nodes within it. The reason to do this is that it is a bit less computationally intensive.

tldr: linear layers have an associative property
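
A minimal sketch of the idea, assuming the setup is several linear projections that all take the same input x. The names (W_a, W_b, W_c, matvec) and the sizes are made up for illustration and are not from the thread. Stacking the weight matrices into one bigger matrix and multiplying once should give the same numbers as running the projections separately, because each output row only ever looks at x and never at another output.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

random.seed(0)
d_in, d_out = 4, 3  # hypothetical sizes
x = [random.random() for _ in range(d_in)]

# Three hypothetical projection matrices, each d_out x d_in.
W_a, W_b, W_c = (
    [[random.random() for _ in range(d_in)] for _ in range(d_out)]
    for _ in range(3)
)

# Separate layers: three small matrix-vector products, outputs concatenated.
separate = matvec(W_a, x) + matvec(W_b, x) + matvec(W_c, x)

# Fused layer: stack the rows into one (3 * d_out) x d_in matrix, multiply once.
W_fused = W_a + W_b + W_c
fused = matvec(W_fused, x)

# Split the fused output back into the three original pieces.
a, b, c = (fused[i * d_out:(i + 1) * d_out] for i in range(3))

print(fused == separate)
```

In a real framework the saving would come from launching one larger matmul instead of several small ones. The arithmetic is the same either way.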