Learning Machine Learning

Written for some coworkers who wanted to learn deep learning

How to get started

Paper reading

How to get started

I spent some time learning classical ML first since it was most relevant for my job. You can learn deep learning first without any other ML experience/knowledge.

ML resources

I started off with a homemade “ML in 10 weeks” course. TL;DR: here’s the course, which draws primarily on Hands-On Machine Learning with Scikit-Learn and TensorFlow and Andrew Ng’s Coursera course on ML:

- Chapter 2 End-to-End Machine Learning Project
- Chapter 3 Classification (precision/recall, multiclass)
- Text feature extraction (from sklearn docs)
- Chapter 4 Training Models (linear/logistic regression, regularization)
- Advice for Applying Machine Learning
- Chapter 5 SVMs (plus kernels)
- Chapter 6 Decision Trees (basics)
- Chapter 7 Ensemble Learning and Random Forests (xgboost, RandomForest)
- Chapter 8 Dimensionality Reduction (PCA, t-SNE, LDA)
- Machine Learning System Design
- (Google) Best Practices for ML Engineering

A group of friends and I worked through this content at a cadence of one meeting every other Wednesday, starting in late June 2018 and wrapping up at the end of 2018.
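
For a quick feel for the Chapter 3 material, here is a minimal sketch of precision and recall computed by hand. The function name and the toy labels are made up for illustration; they are not taken from the book or the course.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually something else.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negatives: actually positive but predicted something else.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Precision: of everything flagged positive, how much was right?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything actually positive, how much was found?
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (hypothetical): 4 real positives, 4 predicted positives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For multiclass problems, one common approach is to call the same function once per class with a different `positive` label and average the results.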

Deep learning resources

NLP resources

  • Kyunghyun Cho’s lecture notes on “Natural Language Processing with Representation Learning”: https://github.com/nyu-dl/NLP_DL_Lecture_Note/blob/master/lecture_note.pdf
  • Jacob Eisenstein’s textbook on “Natural Language Processing”: https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf

Brushing up on math

It’s easy to get intimidated by the math in papers. I found that taking the time to relearn linear algebra and some calculus has had compounding returns!

Paper reading

Once you’ve understood common concepts, the best way to keep up to date with research and continue learning beyond courses is by reading and reimplementing papers.

How to manage papers

I recommend you track papers through either Zotero or Mendeley. I started off using Zotero but switched to Mendeley so I could share folders/papers with the groups I was in. I don’t have a strong opinion on which one is better.

How to figure out what to read? Check out these sources:

  • Twitter - follow 20+ practitioners/researchers you admire to find interesting papers
  • ML subreddit
  • AI/DL Facebook groups
  • arXiv - there are 10-20 new papers on arXiv every day for AI/computational linguistics, so you could just browse it daily for the latest papers in the topics you’re most interested in
  • AI blogs
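
If browsing arXiv by hand gets tedious, the daily check can be scripted. The following is a rough sketch using arXiv’s public query API and only the Python standard library. The function names, the default category `cs.CL`, and the result count are illustrative assumptions, not a recommended setup.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace used by Atom feeds


def build_query_url(category="cs.CL", max_results=20):
    """Build an API URL for the newest submissions in one arXiv category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_feed(atom_xml):
    """Extract (title, link) pairs from the Atom feed the API returns."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.iter(ATOM + "entry"):
        # Titles can be wrapped across lines, so collapse the whitespace.
        title = " ".join(entry.findtext(ATOM + "title", default="").split())
        link = entry.findtext(ATOM + "id", default="")
        papers.append((title, link))
    return papers


def latest_papers(category="cs.CL", max_results=20):
    """Fetch and parse the newest papers (requires network access)."""
    url = build_query_url(category, max_results)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_feed(response.read())
```

Calling `latest_papers("cs.CL", 10)` should return up to ten (title, link) pairs for the most recent computational linguistics submissions. Swap in another category code, such as `cs.AI` or `cs.LG`, for other topics.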

How to read a paper:

  • your first objective is to figure out quickly which papers NOT to read
  • spend time on the conclusions
  • try to answer the question: what is novel?
  • create a reading group! Splitting the reading with even one other person can save you about 50% of the time.