This is a very basic set of machine-learning questions. The post is incomplete; more questions will be added over time.
- What is bias? variance? What is the bias-variance trade-off?
- What is meant by the term "the curse of dimensionality"?
- Explain over-fitting with an example
- Explain Bayes theorem with an example
- What is a VC (Vapnik-Chervonenkis) bound?
- What is the Hoeffding inequality? What does it mean?
- What are parametric, semi-parametric, and non-parametric methods? Give an example of each.
- What is supervised learning? unsupervised learning? reinforcement learning? online learning?
- Briefly describe the perceptron model and how it works
- What would you look at as you fit a linear regression model and interpret its output?
- What is multicollinearity? Heteroskedasticity?
- What is a confidence interval? t-statistic? p-value?
- What is Laplace smoothing? When and how would you use it?
- What is logistic regression? Where would you use it? How do you read its output?
- Why is logistic regression a useful technique? Which kinds of data distributions does it work best with?
- What is a generalized additive model?
- What is gradient descent? Explain stochastic gradient descent. What assumptions does it make?
- How would you influence the rate of convergence of stochastic gradient descent?
- What is Hill Climbing? Give an example of where you would use this.
- What is cross-validation? Describe two different ways of performing this.
- Describe three distinct ways of dealing with data sets that have more features than samples available.
- What are neural networks? How do they work? What are the different kinds of neural nets?
- What is deep learning?
- Describe how one might design the structure of a neural net to solve a problem.
- What are the different techniques for optimizing a neural network's weights as it is trained on a data set (e.g. back-propagation, genetic algorithms, particle swarm optimization)?
- What is maximum likelihood estimation? How would you use it?
- Describe a naive Bayes model with an example.
- What is a Probabilistic Graphical Model (PGM)? Where might it be used?
- What are kernel methods? Give examples of their use (e.g. kernel smoothing, support vector machines)
- What are genetic algorithms? Illustrate with an example where and how these can be used.
- What are support vector machines (SVMs)? How can they be built and used?
- What is a self-organizing map? How does it work? What is vector quantization?
- You are given a million points in 10-dimensional space. I hypothesize that these points are clustered in 8 groups and want to test this hypothesis. What methods would you use, and why?
- What is the complexity of the method you would use for the above?
- Perform a relative computational-complexity analysis of k-means clustering vs. hierarchical clustering. Does it make a difference to your analysis whether the hierarchical clustering is agglomerative or divisive?
- The clusters you see when you run your algorithms on my data are elongated hyper-ellipsoids. What does this tell you about the data? What would you do in this situation to make inference easier?
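Some of the questions above lend themselves to short sketches. For the Bayes-theorem question, here is one classic worked example; the disease-test numbers are illustrative assumptions, not part of the post:

```python
# Worked Bayes-theorem example (all numbers are illustrative assumptions):
# a disease affects 1% of a population; a test detects it 99% of the time
# (sensitivity) and gives a false positive 5% of the time on the healthy.
p_disease = 0.01
p_pos_given_disease = 0.99
p_pos_given_healthy = 0.05

# Law of total probability: overall chance of a positive test.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(disease | positive)
#   = P(positive | disease) * P(disease) / P(positive)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"{p_disease_given_pos:.3f}")  # about 0.167
```

Even with a seemingly accurate test, a positive result implies only about a 17% chance of disease, because the disease is rare; that counter-intuitive shift from prior to posterior is the point of the question.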
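For the stochastic-gradient-descent question, this is a minimal sketch of SGD fitting a least-squares linear regression; the synthetic data, learning rate, and epoch count are all assumptions chosen for illustration:

```python
import numpy as np

# Minimal stochastic gradient descent for least-squares linear regression.
# Model: y = w * x + b; per-sample loss: 0.5 * (prediction - y)^2.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 3.0 * x + 1.0 + rng.normal(0, 0.1, 200)  # synthetic data: true w=3, b=1

w, b = 0.0, 0.0
lr = 0.1  # learning rate, an assumed hyperparameter
for epoch in range(50):
    for i in rng.permutation(len(x)):  # one sample at a time: "stochastic"
        err = (w * x[i] + b) - y[i]
        w -= lr * err * x[i]  # gradient of the per-sample loss w.r.t. w
        b -= lr * err         # gradient of the per-sample loss w.r.t. b

print(w, b)  # should end up close to the true w=3, b=1
```

The convergence-rate question maps onto the `lr` constant here: a decaying schedule (e.g. dividing `lr` by the epoch number) trades early speed for a tighter final estimate.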
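For the cross-validation question, this sketch performs k-fold cross-validation by hand with a deliberately trivial model (predicting the training mean), so the split mechanics stand out; `k = 5` and the synthetic data are assumptions:

```python
import numpy as np

# Manual k-fold cross-validation with a trivial "predict the training mean"
# model, to show the mechanics of the splitting and scoring.
rng = np.random.default_rng(0)
y = rng.normal(10.0, 2.0, 100)  # synthetic targets

k = 5
indices = rng.permutation(len(y))     # shuffle before splitting
folds = np.array_split(indices, k)    # k disjoint index sets

mse_scores = []
for i in range(k):
    test_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    prediction = y[train_idx].mean()                 # "train" on k-1 folds
    mse = ((y[test_idx] - prediction) ** 2).mean()   # score on held-out fold
    mse_scores.append(mse)

print(f"mean CV MSE: {np.mean(mse_scores):.2f}")
```

The other common variant asked about in interviews, leave-one-out, is the `k = len(y)` extreme of the same loop.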
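For the clustering questions, this sketch runs a small hand-rolled k-means (Lloyd's algorithm) on synthetic 10-dimensional points drawn around 8 assumed centers; each assignment pass costs O(n·k·d), which is the heart of the complexity comparison with hierarchical clustering:

```python
import numpy as np

# Hand-rolled k-means (Lloyd's algorithm) on synthetic 10-D data:
# 8 well-separated true centers, 500 points around each (assumed setup).
rng = np.random.default_rng(0)
true_centers = rng.uniform(-10, 10, size=(8, 10))
points = np.concatenate(
    [c + rng.normal(0, 0.5, size=(500, 10)) for c in true_centers]
)

k = 8
# Farthest-point initialization: start from one random point, then
# repeatedly add the point farthest from all centers chosen so far.
centers = [points[rng.integers(len(points))]]
for _ in range(k - 1):
    d2 = np.min(((points[:, None] - np.array(centers)[None]) ** 2).sum(-1),
                axis=1)
    centers.append(points[d2.argmax()])
centers = np.array(centers)

for _ in range(20):  # fixed iteration budget for the sketch
    # Assignment step: nearest center per point, O(n * k * d).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    # Update step: move each center to the mean of its assigned points
    # (keep the old center if a cluster happens to be empty).
    centers = np.array([points[labels == j].mean(axis=0)
                        if (labels == j).any() else centers[j]
                        for j in range(k)])

print(np.bincount(labels, minlength=k))  # points assigned to each cluster
```

With elongated hyper-ellipsoidal clusters, as in the last question, the Euclidean distance in the assignment step is the weak point; whitening the data or switching to a Mahalanobis distance addresses exactly that.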