Implementing deep learning algorithms from scratch in Python and NumPy is a good way to learn the basic concepts and to see what these algorithms are really doing inside the deep learning black box.
However, as you start to implement very large or more complex models, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs), it becomes increasingly impractical, at least for most people like me, to implement everything yourself from scratch.
Even if you understand how matrix multiplication works and can implement it in your own code, when you build very large applications you will probably not want to write your own matrix multiplication function. Instead, you will want to call a numerical linear algebra library that does it more efficiently for you, right?
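To make the matrix multiplication point concrete, here is a minimal sketch that compares a hand-written triple loop with NumPy's built-in `@` operator. The function name `naive_matmul` and the 60 x 60 matrix size are my own choices for illustration, and the timings you see will depend on your machine.

```python
import time

import numpy as np


def naive_matmul(a, b):
    """Triple-loop matrix multiplication, written out for illustration only."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            total = 0.0
            for p in range(k):
                total += a[i, p] * b[p, j]
            out[i, j] = total
    return out


# Small random matrices (the size is an arbitrary choice for this sketch).
rng = np.random.default_rng(0)
a = rng.standard_normal((60, 60))
b = rng.standard_normal((60, 60))

# Time the hand-written version.
start = time.perf_counter()
slow = naive_matmul(a, b)
naive_seconds = time.perf_counter() - start

# Time the library call, which hands the work to optimized compiled code.
start = time.perf_counter()
fast = a @ b
numpy_seconds = time.perf_counter() - start

print(f"naive loops: {naive_seconds:.4f} s")
print(f"NumPy a @ b: {numpy_seconds:.6f} s")
print("results agree:", np.allclose(slow, fast))
```

Both versions should produce the same result up to floating-point rounding, and the library call is typically far faster, with the gap growing as the matrices get larger.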
The efficiency of your algorithm helps you fail fast 😃 and so helps you get through each iteration of the IDEA -> EXPERIMENT -> CODE cycle much more quickly.