MCQ Collection
Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. {A, B, C}
B. {A} only
C. {B, C} only
D. {A, A, B, C}
Correct Answer: A. {A, B, C}
Explanation:
The two itemsets share the ordered prefix A and differ in the final item.
Their union forms the candidate {A, B, C}.
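The join described in this explanation can be sketched as follows. This is a minimal illustration, not a full Apriori implementation: the input itemsets ("A", "B") and ("A", "C") are assumed for the example (the question's own itemsets are not shown here), and `join_candidates` is a hypothetical helper name.

```python
def join_candidates(left, right):
    """Join two sorted k-itemsets that share their first k-1 items.

    Returns the (k+1)-item candidate, or None when the itemsets cannot be joined.
    """
    if len(left) != len(right) or not left:
        return None
    # The prefixes must match and the final items must differ.
    if left[:-1] != right[:-1] or left[-1] == right[-1]:
        return None
    # Keep the shared prefix, then append the two differing items in sorted order.
    return left[:-1] + tuple(sorted((left[-1], right[-1])))


candidate = join_candidates(("A", "B"), ("A", "C"))
print(candidate)  # ('A', 'B', 'C')
```

Itemsets are kept as sorted tuples so that "shared prefix" has a well-defined meaning and each candidate is generated only once.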
A. It preserves the support information needed for frequent itemsets
B. It stores every raw attribute value unchanged
C. It preserves transaction order in time
D. It stores all infrequent items
Correct Answer: A. It preserves the support information needed for frequent itemsets
Explanation:
Although transactions are compressed, shared-prefix counts and node links retain relevant frequency information.
The original transaction list need not be reproduced.
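To make the shared-prefix counting concrete, here is a simplified FP-tree construction sketch. It is an assumption-laden illustration: the class `FPNode`, the function `build_fp_tree`, and the sample transactions are hypothetical, and the header table with node links is omitted for brevity.

```python
class FPNode:
    def __init__(self, item=None):
        self.item = item      # None marks the root, which holds no item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    # First pass: count item supports and keep only the frequent items.
    support = {}
    for transaction in transactions:
        for item in set(transaction):
            support[item] = support.get(item, 0) + 1
    frequent = {item for item, n in support.items() if n >= min_support}

    root = FPNode()
    # Second pass: insert each filtered transaction, most frequent item first.
    for transaction in transactions:
        items = sorted(
            (i for i in set(transaction) if i in frequent),
            key=lambda i: (-support[i], i),
        )
        node = root
        for item in items:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1   # shared prefixes accumulate their counts here
    return root, support


# Illustrative data only.
transactions = [["A", "B"], ["A", "C"], ["A", "B", "C"], ["D"]]
root, support = build_fp_tree(transactions, min_support=2)
print(root.children["A"].count)                # 3
print(root.children["A"].children["B"].count)  # 2
```

The infrequent item D is dropped and the three transactions containing A collapse into one branch, yet the counts still give the supports needed for frequent itemsets.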
A. The proportion of all predictions that are correct
B. The proportion of positives detected
C. The proportion of predicted positives that are correct
D. The average probability assigned to classes
Correct Answer: A. The proportion of all predictions that are correct
Explanation:
Accuracy equals correct predictions divided by total predictions.
It can be misleading when classes are highly imbalanced.
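A small sketch shows why accuracy can mislead under class imbalance. The data below is made up for illustration: 95 negatives, 5 positives, and a model that always predicts the negative class.

```python
def accuracy(y_true, y_pred):
    # Correct predictions divided by total predictions.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Illustrative imbalanced data: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a trivial model that never predicts the positive class

print(accuracy(y_true, y_pred))  # 0.95
detected_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
print(detected_positives)        # 0
```

The trivial model scores 95 percent accuracy while detecting none of the positives.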
A. The probability of misclassification when labels follow the node's class distribution
B. The average distance to a centroid
C. The confidence of an association rule
D. The number of missing values
Correct Answer: A. The probability of misclassification when labels follow the node's class distribution
Explanation:
Gini impurity is zero for a pure node.
It increases as class proportions become more mixed.
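The behaviour described above follows from the usual definition, one minus the sum of squared class proportions. A minimal sketch, with a hypothetical helper `gini_impurity` and made-up label lists:

```python
from collections import Counter


def gini_impurity(labels):
    # 1 - sum over classes of (class proportion squared).
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


print(gini_impurity(["yes"] * 4))              # pure node
print(gini_impurity(["yes"] * 3 + ["no"]))     # 3:1 mix
print(gini_impurity(["yes"] * 2 + ["no"] * 2)) # evenly mixed, two classes
```

For two classes the value rises from 0 for a pure node to 0.5 for an even split.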
A. Features are conditionally independent given the class
B. Classes are equally likely
C. All features are normally distributed
D. Training examples are unlabeled
Correct Answer: A. Features are conditionally independent given the class
Explanation:
Naive Bayes factorizes the class-conditional likelihood across features.
The independence assumption is often unrealistic but computationally useful.
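The factorization can be sketched as a product of per-feature probabilities. The function name, the probability tables, and the feature values below are hypothetical and chosen only to illustrate the idea for a single class.

```python
import math


def class_conditional_likelihood(feature_values, per_feature_probs):
    # Under the naive assumption, P(x1, ..., xn | class) is approximated by
    # the product of the individual P(xi | class) terms.
    return math.prod(
        per_feature_probs[i][value] for i, value in enumerate(feature_values)
    )


# Illustrative P(feature value | class) tables for one class, two features.
spam_tables = [
    {"yes": 0.8, "no": 0.2},  # feature 0
    {"yes": 0.5, "no": 0.5},  # feature 1
]
likelihood = class_conditional_likelihood(("yes", "no"), spam_tables)
print(likelihood)
```

Only one small table per feature is needed, rather than a joint table over every combination of feature values, which is where the computational saving comes from.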
A. Agreement between predicted probabilities and observed outcome frequencies
B. Ordering features by importance
C. Choosing the largest class
D. Scaling attributes to zero mean
Correct Answer: A. Agreement between predicted probabilities and observed outcome frequencies
Explanation:
Among cases assigned probability 0.8, roughly 80 percent should be positive if the classifier is well calibrated.
A classifier may rank well while producing poorly calibrated probabilities.
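One way to check this agreement is a reliability table that bins predictions and compares the mean predicted probability with the observed positive rate in each bin. The sketch below assumes equal-width bins; `reliability_table` and the data are hypothetical.

```python
def reliability_table(probs, outcomes, n_bins=5):
    # Group predictions into equal-width probability bins.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append((mean_predicted, observed_rate, len(members)))
    return rows


# Illustrative data: ten cases scored 0.8 (8 positive), ten scored 0.2 (2 positive).
probs = [0.8] * 10 + [0.2] * 10
outcomes = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
for mean_predicted, observed_rate, size in reliability_table(probs, outcomes):
    print(round(mean_predicted, 2), round(observed_rate, 2), size)
```

In this made-up example the two columns match in each bin, which is what a well-calibrated model would show.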
A. Frequent subsets guarantee confidence one
B. Any infrequent subset proves the candidate cannot meet minimum support
C. Subsets determine class labels
D. All subsets must have equal support
Correct Answer: B. Any infrequent subset proves the candidate cannot meet minimum support
Explanation:
This is the candidate prune step based on anti-monotonicity.
It prevents counting candidates that cannot possibly be frequent.
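The prune step can be sketched as a check over every (k-1)-item subset of a candidate. The helper name `has_infrequent_subset` and the set of frequent 2-itemsets are assumptions made for this example.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    # A k-item candidate is pruned if any of its (k-1)-item subsets
    # is missing from the previous level's frequent itemsets.
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_prev
        for subset in combinations(candidate, k - 1)
    )


# Illustrative frequent 2-itemsets; note that {A, D} is absent.
frequent_2 = {
    frozenset(pair)
    for pair in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
}
print(has_infrequent_subset(("A", "B", "C"), frequent_2))  # False: keep and count
print(has_infrequent_subset(("A", "B", "D"), frequent_2))  # True: prune
```

The second candidate is discarded without scanning the transactions, because {A, D} is already known to be infrequent.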
A. It represents the most frequent item
B. It is a null starting node for all transaction paths
C. It stores the class label
D. It contains the minimum support value only
Correct Answer: B. It is a null starting node for all transaction paths
Explanation:
All filtered transaction paths begin under the root.
The root itself does not correspond to an item.
A. The proportion of actual positives detected
B. The proportion of predicted positives that are truly positive
C. The proportion of all records predicted correctly
D. The proportion of negatives detected
Correct Answer: B. The proportion of predicted positives that are truly positive
Explanation:
Precision measures the reliability of positive predictions.
It is important when false positives are costly.
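Precision can be computed directly from true-positive and false-positive counts. In this sketch the labels are made up, and returning 0.0 when nothing is predicted positive is a convention chosen here (the ratio is strictly undefined in that case).

```python
def precision(y_true, y_pred, positive=1):
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    false_pos = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    if true_pos + false_pos == 0:
        return 0.0  # assumed convention: no positive predictions were made
    return true_pos / (true_pos + false_pos)


# Illustrative labels: three positive predictions, two of them correct.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
print(precision(y_true, y_pred))
```

The missed positive at index 3 lowers recall but leaves precision unchanged, since precision only looks at the cases predicted positive.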
A. The split guaranteeing the globally smallest tree
B. The split giving the best immediate impurity reduction
C. A random split in every case
D. The split using the last feature
Correct Answer: B. The split giving the best immediate impurity reduction
Explanation:
Standard tree induction makes locally optimal choices at each node.
It does not usually search every possible complete tree.
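The greedy choice at a single node can be sketched as a scan over candidate thresholds on one numeric feature, keeping the one with the largest immediate Gini reduction. The helper names and data are hypothetical, and real tree learners add details (multiple features, midpoint thresholds, stopping rules) that are left out here.

```python
def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    # Evaluate each candidate split "value <= threshold" and keep the one
    # with the largest immediate impurity reduction at this node only.
    parent = gini(labels)
    best = (None, 0.0)
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        gain = parent - weighted
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Illustrative data: one numeric feature and a binary label.
values = [1, 2, 3, 10, 11, 12]
labels = ["n", "n", "n", "y", "y", "y"]
threshold, gain = best_threshold(values, labels)
print(threshold, gain)
```

The function looks only one split ahead; it never evaluates how a choice affects the size or quality of the complete tree.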
A. The central limit theorem
B. Bayes' theorem
C. The Apriori principle
D. The triangle inequality
Correct Answer: B. Bayes' theorem
Explanation:
Bayes' theorem combines prior class probability with feature likelihood.
The result is a posterior probability for each class.
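As a worked sketch of this combination, the posterior for each class is the prior times the likelihood, normalized by the sum over all classes. The class names and numbers below are invented for illustration.

```python
def posterior(priors, likelihoods):
    # Bayes' theorem: P(class | x) = P(class) * P(x | class) / P(x),
    # where P(x) is the sum of the joint terms over all classes.
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Illustrative values: prior class probabilities and P(features | class).
priors = {"spam": 0.3, "ham": 0.7}
likelihoods = {"spam": 0.4, "ham": 0.1}
result = posterior(priors, likelihoods)
print({c: round(p, 3) for c, p in result.items()})
```

Although "ham" has the larger prior, the higher likelihood under "spam" makes spam the more probable class after seeing the features.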
A. Use training accuracy only
B. Assess both discrimination and calibration
C. Check itemset support
D. Inspect cluster silhouettes
Correct Answer: B. Assess both discrimination and calibration
Explanation:
Risk decisions depend on probability meaning, not just class ranking.
Calibration plots and proper scoring rules complement AUC.
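The difference between ranking quality and probability quality can be illustrated with two sets of scores that order the cases identically. This sketch uses a pairwise AUC and the Brier score (one proper scoring rule); the function names and the four data points are assumptions for the example.

```python
def auc(y_true, scores):
    # Probability that a random positive is scored above a random negative,
    # counting ties as half.
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, probs):
    # Mean squared difference between predicted probability and outcome.
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Illustrative data: both models rank the cases in the same order.
y_true = [0, 0, 1, 1]
model_a = [0.10, 0.20, 0.80, 0.90]
model_b = [0.45, 0.46, 0.54, 0.55]

print(auc(y_true, model_a), auc(y_true, model_b))
print(brier_score(y_true, model_a), brier_score(y_true, model_b))
```

Both models get the same AUC because AUC depends only on ordering, while the Brier scores differ because they depend on the probability values themselves.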