Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. It mines frequent patterns without generating a large candidate set
B. It requires labeled class data
C. It computes only one-itemsets
D. It never scans the database
Correct Answer: A. It mines frequent patterns without generating a large candidate set
Explanation:
FP-growth compresses transactions into an FP-tree and mines the structure recursively.
This avoids Apriori's candidate explosion.
A. No, it still uses minimum support to define frequent items and patterns
B. Yes, all patterns are always returned
C. Yes, because counts are unnecessary
D. No, because it uses class labels instead
Correct Answer: A. No, it still uses minimum support to define frequent items and patterns
Explanation:
Minimum support remains the criterion for pattern frequency.
The algorithm changes the search strategy, not the definition.
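As a rough illustration of minimum support as a definition rather than a search strategy, the sketch below counts itemsets by brute force. The function name `frequent_itemsets` and the toy transactions are assumptions made for this example; a faster algorithm would reach the same result for the same threshold.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, size):
    """Return itemsets of the given size whose support count is >= min_support."""
    counts = Counter()
    for t in transactions:
        # Count each size-k subset of a transaction once (duplicates removed).
        for itemset in combinations(sorted(set(t)), size):
            counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

# Hypothetical toy data for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter"],
]
pairs = frequent_itemsets(transactions, min_support=2, size=2)
print(pairs)
```

Only the pair that appears in at least two transactions survives the threshold; how the counting is organized does not change which patterns qualify.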
A. The harmonic mean of precision and recall
B. The arithmetic mean of accuracy and specificity
C. The difference between precision and recall
D. The square of recall
Correct Answer: A. The harmonic mean of precision and recall
Explanation:
F1 balances precision and recall in one measure.
The harmonic mean is low when either component is low.
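A minimal sketch of the harmonic-mean formula, F1 = 2PR / (P + R), may make the last point concrete. The helper name `f1_score` and the sample precision and recall values are assumptions chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

balanced = f1_score(0.8, 0.8)    # both components equal
lopsided = f1_score(0.95, 0.10)  # high precision, very low recall
print(balanced, lopsided)
```

In the lopsided case the arithmetic mean would be above 0.5, while F1 stays below 0.2 because the low recall dominates.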
A. Stopping tree growth before leaves become overly specific
B. Growing a full tree and trimming it later
C. Removing training examples
D. Pruning input features before collecting data
Correct Answer: A. Stopping tree growth before leaves become overly specific
Explanation:
Pre-pruning uses conditions such as maximum depth or minimum node size.
It prevents excessive complexity during induction.
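The stopping conditions mentioned above could be sketched as a single check that a tree builder calls before splitting a node. The function `should_stop` and its parameter names `max_depth` and `min_samples_split` are hypothetical names picked for this example, not part of any particular library.

```python
def should_stop(labels, depth, max_depth=3, min_samples_split=5):
    """Pre-pruning check: return True if this node should become a leaf."""
    if len(set(labels)) <= 1:            # node is already pure (or empty)
        return True
    if depth >= max_depth:               # depth limit reached
        return True
    if len(labels) < min_samples_split:  # too few examples to split reliably
        return True
    return False

print(should_stop(["a", "b"] * 5, depth=1))  # mixed, shallow, enough samples
print(should_stop(["a", "b"] * 5, depth=3))  # depth limit reached
```

The check runs during induction, so a branch that fails it is never grown at all.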
A. The class with the highest posterior probability
B. The class with the most features
C. The nearest class centroid only
D. The class with the smallest prior
Correct Answer: A. The class with the highest posterior probability
Explanation:
Posterior probabilities are compared across classes.
The maximum a posteriori class is selected.
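The comparison can be sketched as an argmax over prior times likelihood. The function `map_class`, the class names, and the probability values below are illustrative assumptions; the evidence term P(x) is omitted because it is the same for every class and does not affect the ranking.

```python
def map_class(priors, likelihoods):
    """Return the class maximizing P(class) * P(x | class), plus all scores."""
    # Unnormalized posteriors: dividing by P(x) would not change the winner.
    scores = {c: priors[c] * likelihoods[c] for c in priors}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical numbers for a single observation x.
priors = {"spam": 0.3, "ham": 0.7}
likelihoods = {"spam": 0.05, "ham": 0.01}
label, scores = map_class(priors, likelihoods)
print(label, scores)
```

Here the smaller prior for "spam" is outweighed by its larger likelihood, so "spam" has the higher posterior.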
A. By using the classes of the k closest training observations
B. By fitting a global linear equation
C. By mining frequent itemsets
D. By growing a decision tree
Correct Answer: A. By using the classes of the k closest training observations
Explanation:
KNN stores the training data and performs local voting at prediction time.
The chosen distance metric defines closeness.
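A minimal sketch of the local vote, assuming Euclidean distance and a simple majority rule. The function name `knn_predict` and the toy points are assumptions for this example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority class among the k training points nearest to query."""
    # Distance from the query to every stored training point (Euclidean here).
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional training data.
train_X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.0)]
train_y = ["low", "low", "high", "high", "high"]
print(knn_predict(train_X, train_y, (1.2, 1.4)))
print(knn_predict(train_X, train_y, (5.5, 5.5)))
```

Swapping `math.dist` for another metric changes which points count as neighbors, and so can change the prediction.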
A. One for every possible itemset
B. Two
C. Exactly one in all implementations
D. No scans
Correct Answer: B. Two
Explanation:
The first scan counts item supports to find the frequent items and fix their order.
The second scan inserts each transaction, filtered to its frequent items and sorted in that order, into the tree.
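The two scans can be sketched as follows. This is a simplified illustration: the class `FPNode`, the function `build_fp_tree`, and the toy transactions are assumptions, and the header table and node links that a full FP-growth implementation uses for mining are left out.

```python
from collections import Counter

class FPNode:
    """One node of the prefix tree: an item, its count, and child nodes."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_support):
    # Scan 1: count item supports and keep only the frequent items.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Scan 2: insert each filtered transaction, most frequent items first
    # (ties broken alphabetically so the order is deterministic).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent

# Hypothetical toy data.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
root, frequent = build_fp_tree(transactions, min_support=2)
print(frequent)
print(root.children["a"].count)
```

Transactions that share a prefix share nodes, which is where the compression comes from; the infrequent item "d" never enters the tree.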
A. It stores no counts
B. The FP-tree and conditional trees may not fit comfortably in memory
C. It requires one model per transaction
D. It can process only one item
Correct Answer: B. The FP-tree and conditional trees may not fit comfortably in memory
Explanation:
Large, weakly compressible datasets can produce substantial tree structures.
Disk-based or distributed variants may be required.
A. When classes are perfectly balanced
B. When one class greatly outnumbers the other
C. When every prediction is probabilistic
D. When there are only two features
Correct Answer: B. When one class greatly outnumbers the other
Explanation:
A model can achieve high accuracy by predicting only the majority class.
Precision, recall, F1, ROC-AUC, or PR-AUC may be more informative.
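A small sketch of the majority-class effect, using made-up labels with a 95:5 class ratio. The helper functions `accuracy` and `recall` are written here for illustration and are not taken from a library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a "model" that always predicts the majority class

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(acc, rec)
```

The always-negative predictor scores 95% accuracy while finding none of the positive cases, which recall exposes immediately.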
A. Stopping at the root
B. Growing a larger tree and then removing weak subtrees
C. Normalizing features after training
D. Changing class labels after prediction
Correct Answer: B. Growing a larger tree and then removing weak subtrees
Explanation:
Post-pruning evaluates whether replacing a subtree with a simpler leaf improves estimated generalization.
Because the full tree is examined first, it often controls complexity better than very early stopping, which can halt before a useful split is found.
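One post-pruning strategy is reduced-error pruning, sketched below on a tree stored as nested dictionaries. The tree layout, the function names, and the validation data are all assumptions made for this illustration; other strategies, such as cost-complexity pruning, use a different criterion.

```python
def predict(node, x):
    """Follow splits until a leaf (a node without a 'feature' key) is reached."""
    while "feature" in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["prediction"]

def errors(node, X, y):
    return sum(predict(node, x) != label for x, label in zip(X, y))

def prune(node, X_val, y_val):
    """Bottom-up pruning: replace a subtree with a leaf if that does not
    increase errors on the validation examples reaching it. Mutates the tree."""
    if "feature" not in node:
        return node
    f, thr = node["feature"], node["threshold"]
    left = [(x, t) for x, t in zip(X_val, y_val) if x[f] <= thr]
    right = [(x, t) for x, t in zip(X_val, y_val) if x[f] > thr]
    node["left"] = prune(node["left"], [x for x, _ in left], [t for _, t in left])
    node["right"] = prune(node["right"], [x for x, _ in right], [t for _, t in right])
    leaf = {"prediction": node["prediction"]}  # majority class stored at the node
    return leaf if errors(leaf, X_val, y_val) <= errors(node, X_val, y_val) else node

# Hypothetical fully grown tree; the right subtree contains a noisy split.
tree = {
    "feature": 0, "threshold": 5, "prediction": "A",
    "left": {"prediction": "A"},
    "right": {
        "feature": 1, "threshold": 2, "prediction": "B",
        "left": {"prediction": "B"},
        "right": {"prediction": "A"},
    },
}
X_val = [(1, 0), (2, 3), (7, 1), (8, 3), (9, 4)]
y_val = ["A", "A", "B", "B", "B"]
pruned = prune(tree, X_val, y_val)
print(pruned)
```

The noisy subtree collapses to a single leaf, while the root split is kept because removing it would raise the validation error.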
A. Multinomial Naive Bayes
B. Gaussian Naive Bayes
C. Bernoulli Naive Bayes
D. Apriori Bayes
Correct Answer: B. Gaussian Naive Bayes
Explanation:
Gaussian Naive Bayes estimates a mean and variance per feature and class.
It evaluates each continuous value using a normal density.
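For a single continuous feature, the per-class estimate and density evaluation might look like the sketch below. The names `fit_gaussian` and `gaussian_pdf` and the height values are assumptions; a full classifier would multiply such densities across features and by the class prior, and would need to guard against zero variance.

```python
import math
from statistics import mean, pvariance

def fit_gaussian(values):
    """Estimate the mean and (population) variance of one feature for one class."""
    return mean(values), pvariance(values)

def gaussian_pdf(x, mu, var):
    """Normal density with mean mu and variance var, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Hypothetical training values of one feature, grouped by class.
heights = {"child": [110.0, 120.0, 130.0], "adult": [160.0, 170.0, 180.0]}
params = {c: fit_gaussian(v) for c, v in heights.items()}

# Class-conditional densities for a new value.
densities = {c: gaussian_pdf(165.0, *params[c]) for c in params}
print(params)
print(densities)
```

The new value lies much closer to the "adult" mean, so its density under that class is far larger.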
A. It ignores training data
B. It postpones most computation until prediction time
C. It never computes distances
D. It uses only one feature
Correct Answer: B. It postpones most computation until prediction time
Explanation:
KNN does not fit an explicit global model during training.
It retains examples and searches neighbors for each new case.