Comment by HarHarVeryFunny

Comment by HarHarVeryFunny 2 days ago

2 replies

Whoever chose this topic title perhaps did him a disservice in suggesting he said the problem was backprop itself, since in his blog post he immediately clarifies what he meant by it. It's a nice pithy way of stating the issue though.

nirinor 2 days ago

Nah, Karpathy's title is "Yes you should understand backprop", and his first highlight is "The problem with Backpropagation is that it is a leaky abstraction." This is his choice as a communicator, not the poster to HN.

And his _examples_ are about gradients, but nowhere does he distinguish between backpropagation, a (part of) an algorithm for automatic differentiation and the gradients themselves. None of the issues are due to BP returning incorrect gradients (it totally could, for example, lose too much precision, but it doesn't).

  • HarHarVeryFunny 2 days ago

    Yeah - he chose it as a pithy/catchy description of the issue, then immediately clarified what he meant by it.

    > In other words, it is easy to fall into the trap of abstracting away the learning process — believing that you can simply stack arbitrary layers together and backprop will “magically make them work” on your data.

    Then follows this with multiple clear examples of exactly what he is talking about.

    The target audience was people building and training neural networks (such as his CS231n students), so I think it's safe to assume they knew what backprop and gradients are, especially since he made them code gradients by hand, which is what they were complaining about!