recursive 4 hours ago

I took a linear algebra class, as well as many others. It didn't work.

Most math classes I've taken granted me some kind of intuition for the subject material. Like I could understand the concept independent from the name of the thing.

In linear algebra, it was all a series of arbitrary facts without reason for existing. I memorized them for the final exam, and probably forgot them all the next day, as they weren't attached to anything in my mind.

"The inverse of the eigen-something is the determinant of the abelian".

It was just a list of facts like this to memorize by rote.

I passed the class with a decent grade, I think. But I really understood nothing. At this point, I can't remember how to multiply matrices. Specifically, do the rows go with the columns or do the columns go with the rows?

I don't know if there's something about linear algebra itself or if I just didn't connect with the instructor. But I've taken a lot of other math classes, and usually been able to understand the subject material readily. Maybe linear algebra is different. It was completely impenetrable for me.

omnicognate 3 hours ago

You might want to try Linear Algebra Done Right by Sheldon Axler. It's a short book, succinct but extremely clear and approachable. It explains Linear Algebra without using determinants, which are relegated to the end, and emphasises understanding the powerful ideas underpinning the subject rather than learning seemingly arbitrary manipulations of lists and tables of numbers.

Those manipulations are of course extremely useful and worth learning, but the reasons why, and where they come from, will be a lot clearer after reading Axler.

As someone pointed out elsewhere in this thread, the book is available free at https://linear.axler.net/

  • recursive an hour ago

    The page count suggests that we have different ideas of what's meant by "short". In any case, it looks great from the forewords. If I ever want to make a serious try to really get it, this is probably what I'll use.

    • getnormality 19 minutes ago

      It is widely considered to deliver on the promise of done right!

drdeca 4 hours ago

To remind oneself how to multiply matrices together, it suffices to remember how to apply a matrix to a column vector, and that ((A B) v) = (A (B v)).

For each 1-hot vector e_i (i.e. the column vector that has a 1 in the i-th position and 0s elsewhere), compute B e_i to get the i-th column of the matrix B. Then apply the matrix A to the result to obtain A (B e_i), which equals (A B) e_i. This is then the i-th column of the matrix A B. And when applying the matrix A to some column vector v, each entry/row of the resulting vector is obtained by taking the dot product of the corresponding row of A with the column vector v.

So, to get the entry at the j-th row of the i-th column of (A B), one therefore takes the dot product of the j-th row of A with the i-th column of B. Or, alternatively/equivalently, you can just compute the matrix (A B) column by column: for each e_i, the i-th column of (A B) is (A (B e_i)) (which is how I usually think of it).
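For what it's worth, here is that column-by-column view as a small numpy sketch (the matrices are made-up examples, purely for illustration):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.],
                  [5., 6.]])    # 3x2
    B = np.array([[7., 8.],
                  [9., 10.]])   # 2x2

    # Build A B one column at a time: the i-th column is A (B e_i).
    cols = []
    for i in range(B.shape[1]):
        e_i = np.zeros(B.shape[1])
        e_i[i] = 1.0                       # 1-hot column vector
        cols.append(A @ (B @ e_i))
    AB_column_by_column = np.column_stack(cols)

    assert np.allclose(AB_column_by_column, A @ B)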

To be clear, I don't have the process totally memorized; I actually use the above reasoning to remind myself of the computation process a fair portion of the time that I need to compute actual products of matrices, which is surprisingly often given that I don't have it totally memorized.

When I took linear algebra, the professor emphasized the linear maps, and somewhat de-emphasized the matrices that are used to notate them. I think this made understanding what is going on easier, but made the computations less familiar. I very much enjoyed the class.

ndriscoll 3 hours ago

Here's a recipe for matrix multiplication that you can't forget: choose bases b_i / c_j for your domain/codomain. A matrix is then just a listing of the outputs of a function on your basis: if you have a linear function f, the i-th column of its matrix A is just f(b_i). If you have another function g from f's codomain, same thing: its matrix B is just the list of outputs g(c_j). Then the i-th column of BA is just g(f(b_i)).

If you write these things down on paper and expand out what I wrote, you'll see the usual row and column thing pop out. The point is that f(b_i) is a weighted sum of the c_j (since the c_j are a basis for the target of f), but you can pull that weighted sum through the definition of g because it's linear. A basis gives you a minimal description/set of points where you need to define a function, and the definition for all other points follows from linearity.
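If it helps, here is the same recipe as a small numpy sketch; the maps f and g (and their matrices) are made up just for illustration:

    import numpy as np

    # Matrices of two linear maps, f : R^2 -> R^3 and g : R^3 -> R^2.
    A = np.array([[1., 0.],
                  [2., 1.],
                  [0., 3.]])      # columns are f(b_1), f(b_2)
    B = np.array([[1., 1., 0.],
                  [0., 2., 1.]])  # columns are g(c_1), g(c_2), g(c_3)

    f = lambda v: A @ v
    g = lambda w: B @ w

    # The i-th column of B A is g(f(b_i)), for the standard basis b_i of R^2.
    basis = np.eye(2)
    composite = np.column_stack([g(f(basis[:, i])) for i in range(2)])

    assert np.allclose(composite, B @ A)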

The point of the eigen-stuff is that along some directions, linear functions are just scalar multiplication: f(v) = av. If the action in a direction is multiplication by a, then it can't also be multiplication by b. So unequal eigenvalues must mean different directions/linearly independent subspaces. So e.g. if you can find n different eigenvalues/eigenvectors, you've found a simple basis where each direction is just multiplication. You also know that it's invertible if the eigenvalues are nonzero since all you did was multiply by a_i along each direction, so you can invert it by multiplying by 1/a_i on each direction.
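A quick numerical check of the eigen picture, with a made-up diagonalizable matrix (just a sketch of the idea, not the general theory):

    import numpy as np

    M = np.array([[2., 1.],
                  [1., 2.]])   # eigenvalues 3 and 1

    vals, vecs = np.linalg.eig(M)

    # Along each eigenvector direction, M is just scalar multiplication...
    for a, v in zip(vals, vecs.T):
        assert np.allclose(M @ v, a * v)

    # ...so inverting M is multiplying by 1/a along each of those directions.
    M_inv = vecs @ np.diag(1.0 / vals) @ np.linalg.inv(vecs)
    assert np.allclose(M_inv, np.linalg.inv(M))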

Taught properly it's all very straightforward, though determinants require some more buildup with a detour through things like quotienting and wedge products if you really want it to be straightforward IMO. You start by saying you want to look at oriented areas/volumes, and look at the properties you need. Then quotienting gives you a standard tool to say "I want exactly the thing that has those properties" (wedge products). Then the action on wedges gives you what your map does to volumes, with the determinant as the action on the full space. You basically define it to be what you want, and then you can calculate it by linearity/functoriality just like you expand out the definition of a linear map from a basis.
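And for the oriented-area point, a tiny 2D check (a made-up matrix; the full wedge-product buildup is more than this, but the idea is already visible):

    import numpy as np

    A = np.array([[2., 1.],
                  [0., 3.]])

    # The unit square spanned by e_1, e_2 maps to the parallelogram spanned by A e_1, A e_2.
    u, v = A[:, 0], A[:, 1]
    oriented_area = u[0] * v[1] - u[1] * v[0]   # 2D wedge of the two image vectors

    # det(A) is the factor by which oriented area scales.
    assert np.isclose(oriented_area, np.linalg.det(A))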

xeromal 3 hours ago

IDK why, but the replies to your comment crack me up because they ended up confusing me rather than helping. It's the same for me. Impenetrable.

getnormality 4 hours ago

I'm an applied math PhD who thinks linear algebra is the best thing ever, and it's the nuts and bolts of modern AI, so for fun and profit I'll attempt a quick cheat sheet.

To manage expectations, this won't be very satisfying by itself. You have to do a lot of exercises for this stuff to become second nature. But hopefully it at least imparts a sense that the topic is conceptually meaningful and not just a profusion of interacting symbols. For brevity, we'll pretend real numbers are the only numbers that exist, I'll assume basic knowledge of vectors, and I won't say anything about eigenvalues.

1. The most important thing to know about matrices is that they are linear maps. Specifically, an m x n matrix is a map from n-dimensional space (R^n) to m-dimensional space (R^m). That means that you can use the matrix as a function, one which takes as input a vector with n entries and outputs a vector with m entries.

2. The columns of a matrix are vectors. They tell you what outputs are generated when you take the standard basis vectors and feed them as inputs to the associated linear map. The standard basis vectors of R^n are the n vectors of length 1 that point along the n coordinate axes of the space (the x-axis, y-axis, z-axis, and beyond for higher-dimensional spaces). Conversely, a vector with n entries is also an n x 1 column matrix.
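To make points 1 and 2 concrete, here's a small numpy sketch with an arbitrary example matrix:

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.],
                  [5., 6.]])    # 3 x 2: a map from R^2 to R^3

    x = np.array([10., 20.])    # input in R^2
    y = A @ x                   # output in R^3

    # The columns of A are the images of the standard basis vectors of R^2.
    e1, e2 = np.eye(2)
    assert np.allclose(A @ e1, A[:, 0])
    assert np.allclose(A @ e2, A[:, 1])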

3. Every vector can be expressed uniquely as a linear combination (weighted sum) of standard basis vectors, and linear maps work nicely with linear combinations. Specifically, F(ax + by) = aF(x) + bF(y) for any real-valued "weights" a,b and vectors x,y. From this, you can show that a linear map is uniquely determined by what it maps the standard basis vectors to. This + #2 explains why linear maps and matrices are equivalent concepts.
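And a quick check of the linearity property in point 3 (same example matrix, arbitrary weights and vectors):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.],
                  [5., 6.]])

    a, b = 2.0, -3.0
    x = np.array([1., 5.])
    y = np.array([4., 0.])

    # F(a x + b y) == a F(x) + b F(y) for the linear map F(v) = A v.
    assert np.allclose(A @ (a * x + b * y), a * (A @ x) + b * (A @ y))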

4a. The way you apply the linear map to an arbitrary vector is by matrix-vector multiplication. If you write out (for example) a 3 x 2 matrix and a 2 x 1 vector, you will see that there is only one reasonable way to do this: each 1 x 2 row of the matrix must combine with the 2 x 1 input vector to produce an entry of the 3 x 1 output vector. The combination operation is, you flip the row from horizontal to vertical so it's a vector, then you dot-product it with the input vector.

4b. Notice how when you multiply a 3x2 matrix with a 2x1 vector, you get a 3x1 vector. In the "size math" of matrix multiplication, (3x2) x (2x1) = (3x1); the inner 2's go away, leaving only the outer numbers. This "contraction" of the inner dimensions, which happens via the dot product of matching vectors, is a general feature of matrix multiplication. Contraction is also the defining feature of how we multiply tensors, the 3D and higher-dimensional analogues of matrices.
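Points 4a and 4b in numpy, spelling out the row-by-row dot products (example numbers only):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.],
                  [5., 6.]])     # 3 x 2
    v = np.array([7., 8.])       # the 2 x 1 input

    # Each output entry is the dot product of one row of A with the input vector.
    by_rows = np.array([np.dot(row, v) for row in A])

    assert np.allclose(by_rows, A @ v)
    assert (A @ v).shape == (3,)   # (3x2) x (2x1) = (3x1): the inner 2's cancel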

5. Matrix-matrix multiplication is just a bunch of matrix-vector multiplications put side-by-side into a single matrix. That is to say, if you multiply two matrices A and B, the columns of the resulting matrix C are just the individual matrix-vector multiplications of A with the columns of B.
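Point 5, checked column by column (again just example matrices):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])
    B = np.array([[5., 6., 7.],
                  [8., 9., 0.]])

    # Each column of C = A B is A applied to the corresponding column of B.
    C = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

    assert np.allclose(C, A @ B)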

6. Many basic geometric operations, such as rotation, shearing, and scaling, are linear operations, so long as you use a version of them that keeps the origin fixed (maps the zero vector to zero vector). This is why they can be represented by matrices and implemented in computers with matrix multiplication.
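One concrete instance of point 6: a rotation about the origin as a matrix (angle chosen arbitrarily):

    import numpy as np

    theta = np.pi / 4    # 45-degree rotation about the origin
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    p = np.array([1., 0.])
    print(R @ p)                                       # roughly [0.707, 0.707]
    assert np.allclose(R @ np.zeros(2), np.zeros(2))   # the zero vector stays fixed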

fwip 4 hours ago

I think this is pretty instructor-dependent. I had two LinAlg courses, and in the first, I felt like I was building a great intuition. In the second, the instructor seemed to make even the stuff I previously learned seem obtuse and like "facts to memorize."

Maybe linear algebra is more instructor-dependent, since we have fewer preexisting concepts to build on?