I am embarking on an ambitious learning project for the next 3 months:
learning deep learning and the mathematics behind it by studying only online resources such as fastai, MIT OpenCourseWare and YouTube.
Deep learning is cool! It’s like particle physics in the early 20th century crossed with the advent of personal computers in the late 70s. New and exciting frontiers of science are open to help discover the fundamental building blocks of intelligence, and the barrier to entry is just a computer with access to the internet.
Every day there is a new article about “AI” being implemented to revolutionise everything from Snapchat filters to driverless cars. It is the next big thing. Deep learning gives you the power to create applications that were practically impossible with handcrafted systems. Not only that, the current incarnation of the field is only a few years old. What an exciting prospect to be able to get into a new and powerful intellectual field when things are just taking off. It feels like a unique moment in human history.
Deep learning is supposed to be a very difficult and math-heavy field. To really understand it you should have a good foundation in the related mathematics subjects of linear algebra, calculus and probability. Then, to get your hands dirty, you need programming experience so that you can implement deep learning models.
In the mathematics department, however, I am at zero. My background isn’t in machine learning or mathematics. I studied Psychology for 4 years at university but only ever took a couple of papers in statistics. Maths felt very foreign to me until recently. I have never studied linear algebra at all and would look with horror and intrigue at the matrices on my friends’ assignments at university. It seemed impossible to learn higher-level mathematics. Like, if you weren’t good in high school then it’s obviously not for you. You just don’t have a math brain, sorry. I mostly avoided the maths side of programming when I learnt it. I could code just fine without it!
However, a little while ago I discovered Jeremy Kun’s book, A Programmer’s Introduction to Mathematics (PIM). Reading PIM has given me a new perspective on what the process of understanding maths is actually like. The process of struggling through frustration to understand a mathematical concept or proof seems very similar to the addictive frustration-to-solution rhythm of programming. The amazing 3Blue1Brown has also opened my eyes to the beauty of maths. And at this point, I am excited to learn more of it.
But how do you find your way out? When I first learnt to code I developed some strategies over time, like building projects, taking regular breaks and always searching Google. These worked, but I would still struggle to digest concepts. Overall the process was still a bit haphazard, with lots of forgetting and then relearning concepts.
This time around I want to take a more active and methodical approach to learning. There are lots of decisions I will make about what and how to learn different content (What is the best way to learn Mathematics?). In writing this blog, I want to share my experience with what works for me. I am excited to add Anki, a spaced repetition tool, to my learning tool kit. More on that later.
If you look at any job ad for a Deep Learning Research Engineer, it will probably say: “requires Masters / PhD in Statistics or Comp Sci”. You see a similar thing for your average developer job: “requires Bachelors in Comp Sci”. It seems like you need a PhD in machine learning to get one of these research jobs, or at least practical skills comparable to someone with a PhD. I don’t think you need a PhD to get a job, just like you don’t need a BSc in Comp Sci to get a developer job. But I do think you should have the practical skills.
Can you really learn in 3 months what takes a PhD student years?
I think you can learn what a Deep Learning PhD student knows outside of academia. Ok, I admit the goal is a little preposterous for 3 months. I am willing to budge on the time factor, but I don’t think you need 3 - 5 years to get there.
I may have gotten the wrong idea from Jeremy Howard and Rachel Thomas. I found their deep learning for coders course in 2017 [6] and took from it the idea that it is possible to learn math even if you were not good at it before, AND that it is possible to make meaningful contributions to deep learning research even if you didn’t do a PhD in Math.
With that blissful ignorance about the difficulty we can happily skip into my plan ⬇️
I hope to build a strong foundational knowledge of the practical methods for building deep learning systems and the mathematics involved. From there I want to build working state-of-the-art models from scratch and understand them.
Beside each course you’ll see a number, which is my arbitrary Full Time Effort metric.
One point is the full focus of a university paper for one semester (~12 weeks). Of course it is arbitrary, but my idea behind this is to quantify, like in Scrum, the complexity of the content and the effort I will spend on each course. Basically, the lower the effort score, the more of the course’s content I will skip.
The plan comes out to about 7.5 points. By the above definition this is nearly twice the amount of work I can fit into 12 weeks. I expect I will learn what is more important to focus on and change how I allocate my effort as I learn more. Let’s see what happens…
| Course | Effort (points) | Cost |
|---|---|---|
| Part 1: Practical Deep Learning for Coders | 1.5 | |
| Part 2: Deep Learning from the Foundations | 1.5 | |
| Deep learning specialization [1] | 0.5 | $45 |
| Programmer’s Introduction to Mathematics (book) [2] | 0.5 | |
| 18.01 Single Variable Calculus | 0.2 | |
| 18.02 Multivariable Calculus | 0.4 | |
| 18.06 Linear Algebra | 0.7 | |
| 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning | 0.7 | |
| Essence of Linear Algebra & Calculus | 0.5 | |
| Linear Algebra & Calculus | 0.3 | |
| Swift - iOS app development [3] | 0.3 | $11 |
| Python for data science and machine learning [4] | 0.3 | $11 |
| GPU server instance - AWS p2 instance, <200 hours [5] | | ~$300 |
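As a quick sanity check on the plan, the effort points and costs in the table can be tallied with a few lines of Python. This is only a sketch: the dictionary names are made up for illustration, the numbers are copied straight from the table, and the GPU figure is the rough ~$300 estimate rather than a real bill.

```python
# Effort points per course, copied from the plan table.
effort_points = {
    "Part 1: Practical Deep Learning for Coders": 1.5,
    "Part 2: Deep Learning from the Foundations": 1.5,
    "Deep learning specialization": 0.5,
    "Programmer's Introduction to Mathematics": 0.5,
    "18.01 Single Variable Calculus": 0.2,
    "18.02 Multivariable Calculus": 0.4,
    "18.06 Linear Algebra": 0.7,
    "18.065 Matrix Methods": 0.7,
    "Essence of Linear Algebra & Calculus": 0.5,
    "Linear Algebra & Calculus": 0.3,
    "Swift - iOS app development": 0.3,
    "Python for data science and machine learning": 0.3,
}

# Costs in dollars, copied from the table (the GPU cost is an estimate).
costs_dollars = {
    "Deep learning specialization": 45,
    "Swift - iOS app development": 11,
    "Python for data science and machine learning": 11,
    "GPU server instance (estimate)": 300,
}

# Round to one decimal place to hide floating point noise in the sum.
total_points = round(sum(effort_points.values()), 1)
total_cost = sum(costs_dollars.values())

print(f"Total effort: {total_points} points")
print(f"Total cost: about ${total_cost}")
```

The total effort lands just under the "about 7.5 points" figure quoted for the plan.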
I get asked this question a lot by friends and family when I tell them my plan.
The answer is no. While I will get a certificate for completing deeplearning.ai (because I paid), I don’t trust that this will be enough proof for anyone, especially as this certificate will only account for a small amount of my overall learning.
Rightly or wrongly, I will follow this philosophy for deep learning. Building models in production with code on GitHub, implementing papers in code, and technical writing will be my certificate of proficiency for deep learning.
Good question. I would have the chance to basically do what I am doing but in a more supported environment with capable peers to collaborate with and learn from etc.
I am not sure if I could get straight into a program without any compsci degree. Apart from that, there are three reasons:
A Masters in NZ would cost at least 10k plus the opportunity cost of one year’s income. If I study online, I can cut this cost greatly and probably get access to course content of a similar standard from places like MIT OpenCourseWare.
The Cutting edge is online
I don’t want to learn MATLAB or R. It seems to me that university academia can very often be behind the cutting edge of Computer Science. Many of the things a Comp Sci grad needs for a developer job they will learn outside of uni.
I think we are in a unique moment in history: the field of machine learning has a majority of its advanced content available for free on the internet (e.g. paperswithcode).
I was super inspired by a great post by Michael Nielsen on augmenting your cognition.
I am excited to try to apply his strategies to my learning process over the next three months. Michael advocates using the Anki application and method to remember more of what you spend your time learning. Anki is an open source spaced repetition flash card app. You add flash cards with questions and answers. Each time you practice a card you rate how difficult you found it to remember (hard, ok, easy). Anki then uses an algorithm based on the forgetting curve to remind you to rehearse each flashcard at the optimal time to solidify it as a long-term memory.
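To make the spaced repetition idea a bit more concrete, here is a toy scheduler in Python. To be clear, this is not Anki’s actual algorithm: the `Card` class, the `review` function, the starting ease of 2.5 and all the multipliers are assumptions picked for illustration. It only shows the general shape of the idea, where each rating (hard, ok, easy) stretches the gap before the next review by a different amount.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """A flash card plus its (hypothetical) scheduling state."""
    question: str
    answer: str
    interval_days: float = 1.0  # days until the next review
    ease: float = 2.5           # growth factor for the interval


def review(card: Card, rating: str) -> float:
    """Update the card after a review and return the new interval in days.

    `rating` is one of "hard", "ok" or "easy". The numbers below are
    illustrative assumptions, not Anki's real parameters.
    """
    if rating == "hard":
        # Struggled: lower the ease and grow the interval only slightly.
        card.ease = max(1.3, card.ease - 0.15)
        card.interval_days = max(1.0, card.interval_days * 1.2)
    elif rating == "ok":
        # Remembered fine: grow the interval by the current ease.
        card.interval_days = card.interval_days * card.ease
    elif rating == "easy":
        # Remembered effortlessly: raise the ease and add a bonus.
        card.ease = card.ease + 0.15
        card.interval_days = card.interval_days * card.ease * 1.3
    else:
        raise ValueError(f"unknown rating: {rating!r}")
    return card.interval_days


# Usage: rate the same card three times and watch the gaps grow.
demo = Card("What does Anki schedule?", "Flash card reviews")
for rating in ["ok", "ok", "hard"]:
    days = review(demo, rating)
    print(f"rated {rating!r}: next review in {days:.1f} days")
```

The point of the sketch is that cards you find easy quickly drift weeks apart, while cards you struggle with keep coming back soon.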
My main goal in studying over the next 3 months is to remember what I learn. As Michael says: “Anki makes memory a choice, rather than a haphazard event, to be left to chance”. It is a seductive idea: that by using Anki you can have a better memory and therefore be more efficient with the time you spend learning.
My biggest takeaway from reading Michael’s post was his advice for starting to use Anki, and I would repeat it for anyone who is interested in trying it out: “Cultivate a taste in what to Ankify”. You should try to learn what kind of information works well as Anki cards, when in the learning process you should add it to Anki, and how best to write cards for different types of information. During my learning journey I plan to put a lot of time into experimenting with the best ways to use Anki to make my learning more efficient.
I hope to share my experiences with using Anki in future.
I was a little terrified to write this blog post (so many revisions 😫). My main motivation comes from the advice of Rachel Thomas and Jeremy Howard: share your knowledge even if you are a beginner in a subject.
If you are thinking about breaking into deep learning, I think you can do it.
Thanks for reading, hit me up on Twitter ✌️
[1] I intend to take the deeplearning.ai course as a supplement to the fastai practical deep learning courses.
[2] This book got me excited about the ideas and magic of mathematical proofs and the beauty behind them. It has helped to change my view of mathematics. I think of it as a foundational perspective on the subject, rather than a rigorous course to teach me many subjects.
[3] Swift for TensorFlow seems important, so I should learn Swift.
[4] To fill out some of my practical knowledge of the Python data science ecosystem (matplotlib, NumPy, etc.).
[5] It is unknown how much this will cost, as you can also use Google Colab with a GPU/TPU for free!
[6] When I found the course I was actually searching for resources to learn about AWS, and the first course included a lot of fiddly bits about setting up a GPU on AWS, which I loved.