Machine Learning Interview Questions

 


1. Why was Machine Learning Introduced?

The simplest answer is: to make our lives easier. In the early days of intelligent applications, many systems depended on hard-coded rules of “if” and “else” decisions for processing data or adjusting to user input. Imagine a spam filter whose job is to move the right incoming email messages to a spam folder; every new kind of spam would need another hand-written rule.

With machine learning, the algorithm is given ample data and learns to identify patterns in it. One is not required to write new rules for each problem.


2. What are the Types of Machine Learning Algorithms?

There are several machine learning algorithms. Broadly speaking, they are divided into supervised, unsupervised, and reinforcement learning.

 

3. What is Supervised Learning?

Supervised learning, simply put, is the machine learning task of deducing a function from labelled training data. Some supervised learning algorithms are:

  • Support Vector Machines
  • Regression
  • Naive Bayes
  • Decision Trees
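As a rough illustration of "deducing a function from labelled data", here is a minimal sketch of simple linear regression fitted by ordinary least squares. The data (hours studied and exam scores) and the name `fit_line` are made up for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labelled training data (hypothetical): hours studied -> exam score
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

slope, intercept = fit_line(hours, scores)
predicted = slope * 5.0 + intercept  # prediction for an unseen input
print(slope, intercept, predicted)
```

The labels (the scores) are what make this supervised: the function is learned from input-output pairs and then applied to a new input.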

4. What is Unsupervised Learning?

Unsupervised learning is the second type of ML algorithm, used for finding patterns in the data provided. Here there is no dependent variable or label to predict.

Unsupervised learning algorithms include:

  • Clustering
  • Anomaly Detection
  • Neural Networks and Latent Variable Models
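To make clustering concrete, here is a minimal sketch of k-means on one-dimensional data. The points, the starting centers, and the name `kmeans_1d` are assumptions chosen for illustration; real implementations pick starting centers more carefully.

```python
def kmeans_1d(points, centers, iterations=10):
    """Very small k-means for 1-D data with the given starting centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Unlabelled data (hypothetical): two visible groups, no labels given
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No label is ever supplied; the two groups emerge from the distances between the points alone.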

If you wish to gain more clarity, a machine learning coding bootcamp can offer the right guidance for successful career opportunities.

5. What is the ‘Naive’ Concept in Naive Bayes?

Naive Bayes is a supervised learning algorithm. It is called naive because, when applying Bayes’ theorem, it assumes that all features are independent of each other given the class, which is rarely true in practice. Consult a machine learning bootcamp to understand the technique and further tools for cracking the interview.
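The independence assumption is easiest to see in code. Below is a minimal sketch of a word-count Naive Bayes classifier with add-one (Laplace) smoothing. The tiny spam/ham dataset and the function names `train` and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs is a list of (words, label) pairs. Returns the counts used for prediction."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def predict(model, words):
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Log prior: log P(label)
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # The "naive" step: each word contributes independently, so the
            # log-likelihoods are simply added. Add-one smoothing avoids log(0).
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


docs = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]
model = train(docs)
print(predict(model, ["free", "money"]))
print(predict(model, ["lunch", "at", "noon"]))
```

Because every word is treated independently given the class, the model only needs per-class word counts, which is why it trains quickly on small data.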

6. What is PCA? When do you use it?

Principal component analysis (PCA) is the most commonly used technique for dimensionality reduction. It finds new axes (principal components) that are linear combinations of the original variables and measures how much of the data’s variation each one captures. Components that capture little variation are dropped.

PCA makes a dataset easier to visualize and is used in finance, neuroscience, and pharmacology. It is also useful in the pre-processing stage when linear correlations are present between features. Consider a coding bootcamp for learning the tools and techniques.
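As a sketch of the idea, the first principal component can be found by centering the data, building the covariance matrix, and running power iteration on it. This is one simple way to do it, not how libraries usually implement PCA; the data and the name `first_principal_component` are made up, and the sketch assumes the starting vector is not orthogonal to the leading component.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction of greatest variance, centered rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: multiply by the covariance matrix and renormalize
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, centered


# Two perfectly correlated features (hypothetical): y = 2x
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, centered = first_principal_component(rows)

# Reduce 2-D points to 1-D by projecting onto the first component
projected = [sum(c[j] * direction[j] for j in range(2)) for c in centered]
print(direction)
print(projected)
```

Since the two features are linearly correlated, a single component carries all of the variation, so the second dimension can be dropped without losing information.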

7. Explain SVM Algorithm.

An SVM, or Support Vector Machine, is a strong and versatile supervised machine learning model capable of performing linear or non-linear classification, regression, and outlier detection.

8. What are Support Vectors in SVM?

A Support Vector Machine (SVM) fits a line (more generally, a hyperplane) between the classes that maximizes the distance from the line to the nearest points of each class. In this manner, it tries to find a robust separation between classes. Support vectors are the points that lie on the edge of the margin around the dividing hyperplane.

9. What are Different Kernels in SVM?

Several types of kernels can be used with SVM; the following four are widely used:

  1. Linear kernel - Used when the data is linearly separable.
  2. Polynomial kernel - Used when the boundary between classes is curved; it is often suggested for discrete data that has no natural notion of smoothness.
  3. Radial basis kernel - Used to create a decision boundary that can separate two classes better than the linear kernel when the data is not linearly separable.
  4. Sigmoid kernel - Based on the same function that is used as an activation function in neural networks.
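For reference, the four kernels listed above can be written as plain functions of two feature vectors. This is a minimal sketch using their common textbook forms; the parameter names (`degree`, `coef0`, `gamma`) and default values are assumptions for illustration.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # Plain dot product
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # (x . y + c) ^ d
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2): close points score near 1, distant points near 0
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    # tanh(gamma * x . y + c)
    return math.tanh(gamma * dot(x, y) + coef0)


a, b = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(a, b))
print(polynomial_kernel(a, b))
print(rbf_kernel(a, b))
print(sigmoid_kernel(a, b))
```

Each kernel is a similarity score between two points; swapping the kernel changes the shape of the decision boundary the SVM can learn.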

10. What is Cross-Validation?

Cross-validation is a method of evaluating a model by repeatedly splitting the data into a part used for training and a part held out for testing. In k-fold cross-validation, the data is split into k subsets and the model is trained on k-1 of them. The remaining subset is held out for testing, and this is repeated so that each subset is used for testing once. Lastly, the scores from all k folds are averaged to produce the final score.
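The splitting and averaging can be sketched in a few lines. The names `k_fold_indices` and `cross_validate` are hypothetical, and the sketch does not shuffle the data, which one would normally do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(data, k, score_fn):
    """Average score_fn(train_rows, test_rows) over the k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        train = [data[i] for i in train_idx]
        test = [data[i] for i in test_idx]
        scores.append(score_fn(train, test))
    return sum(scores) / len(scores)


folds = list(k_fold_indices(6, 3))
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
```

Here `score_fn` stands in for "train a model on the training rows and score it on the test rows"; every sample ends up in the test set exactly once.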

11. What is Bias in Machine Learning?

Bias in data indicates that there is inconsistency in the data. The inconsistency may be caused by several reasons, which are not mutually exclusive.

12. What is the Difference Between Classification and Regression?

Classification is used for producing discrete results, that is, for assigning data to definite categories, whereas regression is used for predicting continuous values.

13. Define Precision and Recall?

Precision and recall are ways of measuring the quality of a machine learning model’s predictions, and they are often used together. Precision measures relevance: of the items the model predicted to be relevant, how many are truly relevant? Recall measures completeness: of all the truly relevant items, how many did the model find? In everyday use, precision means being exact and accurate, and the same goes for machine learning models.
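Both measures come straight from counting true positives, false positives, and false negatives. Here is a minimal sketch; the labels below are made up, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)  # correctly flagged
    fp = sum(1 for t, p in pairs if p == positive and t != positive)  # wrongly flagged
    fn = sum(1 for t, p in pairs if p != positive and t == positive)  # missed
    # Precision: share of predicted positives that are truly positive
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of true positives that the model found
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

In this example the model flags four items, three of them correctly (precision 0.75), and finds three of the four truly relevant items (recall 0.75).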

15. How to Tackle Overfitting and Underfitting?

Overfitting means the model fits the training data too well and does not generalize to new data. In this case, one needs to resample the data and estimate model accuracy using techniques like k-fold cross-validation. Underfitting means the model is not able to capture the patterns in the data. In such a case, one needs to change the algorithm or feed more data points into the model.

16. What is a Neural Network?

A neural network, to put it in simple words, is a model loosely inspired by the human brain. Much like the brain, it has neurons that activate when encountering something relatable. The neurons are linked by connections that help information flow from one neuron to another.

17. What is Ensemble learning?

Ensemble learning is a method that combines multiple machine learning models to create a more powerful model.

There are numerous reasons for a model to be different. Some are:

  • Different Hypothesis
  • Different Population
  • Different Modelling techniques

When working with a model’s training and testing data, one can experience error. This error may come from bias, variance, or irreducible error.

A model should strike a balance between bias and variance; this is called the bias-variance trade-off. Ensemble learning is one way to manage this trade-off. There are numerous ensemble techniques available, but for aggregating multiple models there are two general methods: Bagging and Boosting.
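Bagging (bootstrap aggregating) can be sketched as: draw several bootstrap samples, train one simple model on each, and let the models vote. The base learner below is a hypothetical one-feature threshold rule that assumes class 1 has the larger values; the data and all function names are made up for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) rows with replacement, redrawing if a class is missing."""
    while True:
        sample = [rng.choice(data) for _ in data]
        if len({label for _, label in sample}) == 2:
            return sample


def train_stump(sample):
    """Base learner: threshold halfway between the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


def bagging(data, n_models=11, seed=0):
    rng = random.Random(seed)
    models = [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]

    def predict(x):
        # Aggregate by majority vote (n_models is odd, so there are no ties)
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]

    return predict


data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
predict = bagging(data)
print(predict(0.5), predict(9.5))
```

Each model sees a slightly different sample, so their individual errors differ; voting averages those differences out, which is how bagging reduces variance.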

18. How Does One Decide Which Machine Learning Algorithm to Use?

It depends on the dataset one has. If the target is discrete, one can use a classifier such as SVM. If the target is continuous, one can use linear regression. So, while there is no specific rule for knowing which ML algorithm to use, the choice depends largely on exploratory data analysis (EDA).

19. How to Handle Outlier Values?

An outlier is an observation in the dataset that is far away from the other observations. Tools used for discovering outliers are:

  • Z-score
  • Box plot
  • Scatter plot
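Two of these tools are easy to sketch numerically: the z-score, and the interquartile-range rule that a box plot uses to draw its whiskers. The thresholds (3 standard deviations, 1.5 times the IQR) are common conventions rather than fixed rules, and the sample values are made up.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], the usual box-plot whisker rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


values = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]
print(zscore_outliers(values))
print(iqr_outliers(values))
```

Note that on small samples a single extreme value inflates the standard deviation, so the z-score test can miss it; the IQR rule is less sensitive to that effect.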

Conclusion

The questions listed above cover the basics of machine learning. With machine learning advancing rapidly, joining the communities and a machine learning bootcamp is the way forward for cracking the interview.

