Friday, January 30, 2015

Learning Machine Learning

Lately I have spent a lot of time re-learning Machine Learning from scratch, both to reinforce what I once knew but forgot and to build the more extensive data analytics and model-building muscle that is ever so useful at work these days. This ties in well with my interest in biologically inspired algorithms (genetic algorithms, etc.), variants of which are widely used in fields like Finance. For example, if you build a Stochastic Volatility model for Option Pricing like Heston or SABR, or the Bates model for Stochastic Volatility with Jump Diffusion, you will typically have to calibrate it with an algorithm like differential evolution, which is a variant (or perhaps a special kind) of genetic algorithm. You will also find yourself using techniques like partial function application, also called currying or Schönfinkelization.
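To make that calibration pattern concrete, here is a minimal sketch, not a real Heston/SABR pricer: differential evolution searches a bounded parameter space, and partial function application binds the fixed market data into the objective. The names model_price, calibration_error, strikes, and market_prices (and the parameter bounds) are hypothetical stand-ins chosen just for illustration.

```python
# Minimal sketch: calibrating a toy "model" to market quotes with differential
# evolution, using functools.partial (currying) to bind the fixed market data.
# model_price, strikes, and market_prices are hypothetical stand-ins, not a
# real stochastic volatility pricer.
import numpy as np
from functools import partial
from scipy.optimize import differential_evolution

def model_price(params, strikes):
    # Toy smile-shaped "price" as a function of strike; stands in for a real model.
    level, curvature, center = params
    return level + curvature * (strikes - center) ** 2

def calibration_error(params, strikes, market_prices):
    # Least-squares distance between model prices and observed market prices.
    return np.sum((model_price(params, strikes) - market_prices) ** 2)

# Synthetic "market" quotes generated from known parameters, for illustration.
strikes = np.linspace(80.0, 120.0, 9)
market_prices = model_price((5.0, 0.01, 100.0), strikes)

# Curry the objective so the optimizer only sees the free parameters.
objective = partial(calibration_error, strikes=strikes, market_prices=market_prices)

# Differential evolution does a global search over the bounded parameter space.
bounds = [(0.0, 10.0), (0.0, 0.1), (80.0, 120.0)]
result = differential_evolution(objective, bounds, seed=42)
print(result.x)  # should land close to the generating parameters (5.0, 0.01, 100.0)
```

The same pattern carries over to a real calibration: swap the toy model for the model's pricing function, the synthetic quotes for observed market quotes, and the bounds for the model's parameter constraints.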

Anyway, here is what I have done so far. This is but one of many possible paths through this material. Each item below takes 30+ hours to get through, but if you spend the time, you will improve both your understanding of the underlying mathematics and your ability to actually implement the ideas in a work setting. Here's the path that worked for me. Yes, the material in some of the lectures below overlaps, but I learn best when I see the same material presented by different people who come at it from different angles.

In terms of prerequisites, a liking for mathematics, a somewhat minimal knowledge of multivariate algebra and multivariate calculus (at least basic vector calculus, including the notions of div and grad), and some multivariate statistics would be useful. Completing MIT OCW courses like 6.006 and 6.042J, or equivalently the algorithms classes offered by Tim Roughgarden (Stanford) or Sedgewick (Princeton), would be good preparation. Those, and of course the desire to work hard through the material when things get a little difficult.

A (*) appears next to what I felt were the best-quality courses. Of course, your mileage may vary. Some of these are difficult and require serious work.
  1. (*) Trevor Hastie and Robert Tibshirani's lectures at Stanford - very accessible even without much of a math background. This is truly phenomenal. And what great guys: their textbooks are legally free to download. Hats off to them!
  2. (*) Yaser Abu-Mostafa's extremely well-done lectures at Caltech on Machine Learning, which also cover a lot of theory and mathematics (the learning theory sections are a bit challenging, but very necessary). Very clear explanations.
  3. (*) Professor Andrew Ng's lectures, again at Stanford, have more of a practical feel to them. Again, extremely well done. I took the actual Stanford class (slightly more difficult, but totally worth it), not the somewhat diluted Coursera one.
  4. (*) I plan on re-taking Professor Daphne Koller's Stanford course on Probabilistic Graphical Models - it demands a lot of work and close attention, and is very challenging at times, but definitely worth the effort. I'm re-taking it to ensure I understand everything correctly.
  5. (*) Geoffrey Hinton's lectures from the University of Toronto on Neural Networks. These are on Coursera; the content is excellent and extremely clear.
  6. (*) The Coursera lectures on Mining Massive Datasets by Anand Rajaraman and Jeffrey Ullman, which follow along the lines of their free book published some time ago. This course is very good but requires quite a bit of work.
  7. (*) Taking MIT 6.006 (Introduction to Algorithms), 6.034 (Introduction to AI), and 6.042J (Mathematics for Computer Science). I love Srini Devadas, Tom Leighton, and the other instructors - very gifted teachers.
  8. A Practical Machine Learning course offered by Johns Hopkins. This was quite a bit easier after all of the above.
  9. Completed the University of Washington Machine Learning course on Coursera - the interesting difference here is that it is case-study based and application oriented. It is also quite a bit easier than the starred courses above.
  10. I plan on working my way through the MIT lectures on applied probability taught by Dr. Tsitsiklis (6.041, Probabilistic Systems Analysis and Applied Probability), and the Harvard CS109 class on data science and statistics, also available online.
I have already made significant use of these methods at work - building models with Neural Networks, Support Vector Machines, and Logistic Regression - and the lectures were so clear that I understood exactly what I did in each case, what decisions I made, and why.

I will update the post with more material as I learn more machine learning.
