What constitutes over-fitting, and how can it be prevented? Elaborate with a suitable example.
Define a perceptron. Explain the K-means clustering algorithm.
Discuss the strengths and weaknesses of the Naive Bayes algorithm compared to other classification algorithms like Support Vector Machines.
Why is dimensionality reduction useful? Given a data set and a long list of machine learning algorithms, how do you decide which one to use? Justify your answer.
What do you mean by an ROC curve? In a credit card fraud detection system, the algorithm flagged 50 transactions as fraudulent. Out of these flagged transactions, 45 were indeed fraudulent. Additionally, the algorithm did not flag 30 fraudulent transactions. Calculate the precision, recall and F1-score for the fraud detection system.
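The arithmetic in the fraud detection question can be checked with a short sketch. It assumes the usual definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two); the variable names are illustrative and not part of the question.

```python
# Counts taken from the question statement.
flagged = 50            # transactions flagged as fraudulent
true_positives = 45     # flagged transactions that were really fraudulent
false_negatives = 30    # fraudulent transactions that were not flagged

# Flagged transactions that were not fraudulent.
false_positives = flagged - true_positives

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1_score = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1_score:.2f}")
```

Under these definitions the counts give a precision of 45/50, a recall of 45/75, and an F1-score equal to their harmonic mean.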
What do you mean by hierarchical clustering? How can regression be used to find the best-fit line?
Write short notes on (any two):
a) Supervised Vs. Unsupervised Learning
b) Sensitivity and Specificity
c) Bayesian Belief Network
Attempt any TWO questions [2x10=20]
What is the need for a confusion matrix? Explain the framework for building a machine learning system.
Define entropy and information gain. Construct a decision tree (ID3 algorithm) for the following:
| Day | Outlook | Temperature | Humidity | Wind | Decision |
|---|---|---|---|---|---|
| 1 | Sunny | Hot | High | Weak | Yes |
| 2 | Sunny | Hot | High | Strong | No |
| 3 | Overcast | Hot | High | Weak | Yes |
| 4 | Rain | Mild | High | Weak | No |
| 5 | Rain | Cold | Normal | Weak | Yes |
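As a starting point for the ID3 construction, the entropy of the Decision column and the information gain of each attribute can be computed as in the sketch below. It covers only the choice of the root attribute, not the full tree, and the helper names `entropy` and `information_gain` are illustrative assumptions, not taken from the question.

```python
from collections import Counter
from math import log2

# Rows of the table above: (Outlook, Temperature, Humidity, Wind, Decision).
rows = [
    ("Sunny", "Hot", "High", "Weak", "Yes"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "No"),
    ("Rain", "Cold", "Normal", "Weak", "Yes"),
]
attributes = ["Outlook", "Temperature", "Humidity", "Wind"]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, index):
    """Entropy of the decision minus the weighted entropy after splitting on one column."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in set(r[index] for r in rows):
        subset = [r[-1] for r in rows if r[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(rows, i) for i, name in enumerate(attributes)}
root = max(gains, key=gains.get)  # attribute with the highest gain

for name, gain in gains.items():
    print(f"{name}: {gain:.3f}")
print("Root attribute:", root)
```

ID3 would then repeat the same calculation on each branch of the chosen attribute, using only the rows that fall into that branch, until every branch is pure or no attributes remain.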
What is the bias-variance trade-off? Consider the table given below and use the K-NN algorithm to evaluate the result of Sija = {Machine Learning = 60, GIS = 80}.
| Day | Outlook | Temperature | Humidity | Wind | Decision |
|---|---|---|---|---|---|
| 1 | Sunny | Hot | High | Weak | Yes |
| 2 | Sunny | Hot | High | Strong | No |
| 3 | Overcast | Hot | High | Weak | Yes |
| 4 | Rain | Mild | High | Weak | No |
| 5 | Rain | Cold | Normal | Weak | Yes |