Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. All frequent items
B. Every item appearing in more than one transaction
C. Items that do not meet minimum support
D. The transaction identifier only
Correct Answer: C. Items that do not meet minimum support
Explanation:
Infrequent items cannot belong to a frequent itemset.
Removing them shrinks the tree without losing any information about frequent patterns.
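The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not a full FP-tree builder; the function name `filter_infrequent`, the sample transactions, and the threshold of 2 are assumptions made for the example.

```python
from collections import Counter

def filter_infrequent(transactions, min_support):
    """Drop every item whose support count is below min_support."""
    # Count each item at most once per transaction (support count).
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in counts.items() if c >= min_support}
    # Keep the original item order inside each transaction.
    return [[item for item in t if item in frequent] for t in transactions]

# Hypothetical toy database: "d" appears in only one transaction.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "d"],
    ["b", "c"],
]
filtered = filter_infrequent(transactions, min_support=2)
print(filtered)
```

With a minimum support count of 2, only "d" is dropped, so no itemset that could still be frequent is affected.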
Choose an option to check your answer.
A. It changes the meaning of support
B. It ignores long itemsets
C. It reduces database scans and avoids explicit candidate counting
D. It assumes every item is independent
Correct Answer: C. It reduces database scans and avoids explicit candidate counting
Explanation:
Apriori generates candidate itemsets level by level and rescans the database to count each level.
FP-growth builds a compressed FP-tree in two database scans and mines it recursively without generating candidates.
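To make the cost of level-wise counting concrete, here is a small Apriori sketch that also reports how many counting passes it makes over the database. It is a simplified teaching version, assuming support is given as an absolute count; the function name `apriori` and the toy data are illustrative only.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise Apriori. Returns (frequent itemsets with counts, counting passes)."""
    transactions = [frozenset(t) for t in transactions]
    # Level-1 candidates: every distinct item.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent, scans, k = {}, 0, 1
    while candidates:
        scans += 1  # one pass over the database per candidate level
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, scans

transactions = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
frequent, scans = apriori(transactions, min_support=2)
print(scans, sorted("".join(sorted(s)) for s in frequent))
```

The number of passes grows with the length of the longest frequent itemset, whereas FP-growth needs only the two scans used to count items and build the tree.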
Choose an option to check your answer.
A. Precision against recall only
B. Training error against tree depth
C. True positive rate against false positive rate across thresholds
D. Support against confidence
Correct Answer: C. True positive rate against false positive rate across thresholds
Explanation:
The ROC curve summarizes discrimination as the decision threshold changes.
It shows the trade-off between detecting positives and raising false alarms.
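As an illustration of how the curve is traced, the sketch below computes one (false positive rate, true positive rate) point per threshold. The function name `roc_points`, the labels, and the scores are made up for the example, and the code assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    pos = sum(y_true)              # number of positives (labels are 0 or 1)
    neg = len(y_true) - pos        # number of negatives
    points = [(0.0, 0.0)]          # threshold above every score: nothing flagged
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

# Hypothetical classifier scores; higher means "more likely positive".
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only move the point up or to the right: more positives are detected, at the price of more false alarms.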
Choose an option to check your answer.
A. It cannot represent nonlinear relationships
B. It requires standardized features
C. It can create leaves for noise and rare training cases
D. It always predicts the majority class
Correct Answer: C. It can create leaves for noise and rare training cases
Explanation:
Deep branches may capture accidental patterns unique to training data.
Simplification reduces this variance.
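A rough way to see the effect is to grow a tiny one-feature tree with and without a size constraint. Note the substitution: this sketch uses a minimum leaf size (a pre-pruning rule) as a stand-in for simplification in general, not post-pruning of a grown tree. The helper names `grow` and `count_leaves`, the greedy misclassification criterion, and the toy data are all assumptions.

```python
from collections import Counter

def minority_count(labels):
    """Number of labels that disagree with the majority label."""
    return len(labels) - Counter(labels).most_common(1)[0][1]

def grow(points, min_leaf):
    """Greedy 1-D tree. points: list of (x, label) sorted by x."""
    labels = [label for _, label in points]
    if len(set(labels)) == 1 or len(points) < 2 * min_leaf:
        return ("leaf", Counter(labels).most_common(1)[0][0])
    # Pick the split position with the fewest misclassified training points.
    best_i = min(
        range(min_leaf, len(points) - min_leaf + 1),
        key=lambda i: minority_count(labels[:i]) + minority_count(labels[i:]),
    )
    threshold = (points[best_i - 1][0] + points[best_i][0]) / 2
    return ("split", threshold,
            grow(points[:best_i], min_leaf), grow(points[best_i:], min_leaf))

def count_leaves(tree):
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])

# Hypothetical data: class changes near x = 5.5, with noisy labels at x = 2 and x = 9.
points = list(zip(range(1, 11), [0, 1, 0, 0, 0, 1, 1, 1, 0, 1]))
full_tree = grow(points, min_leaf=1)
small_tree = grow(points, min_leaf=3)
print(count_leaves(full_tree), count_leaves(small_tree))
```

The unconstrained tree keeps splitting until it isolates the two noisy points, while the constrained tree stops at the single split near the real class boundary.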
Choose an option to check your answer.
A. Gaussian Naive Bayes
B. K-median Bayes
C. Multinomial Naive Bayes
D. Support-vector Bayes
Correct Answer: C. Multinomial Naive Bayes
Explanation:
Multinomial Naive Bayes models count or frequency data.
It is widely used for document and text classification.
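The count-based model can be written out by hand for a toy corpus. The following is a minimal sketch with add-one (Laplace) smoothing; the function names, the four tiny documents, and the choice to skip words outside the training vocabulary are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels, alpha=1.0):
    """docs: lists of tokens. Returns {class: (log prior, {word: log likelihood})}."""
    vocab = sorted({w for d in docs for w in d})
    class_sizes = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)          # repeated words count every time
    model = {}
    for y, n_docs in class_sizes.items():
        total = sum(word_counts[y].values())
        log_lik = {
            w: math.log((word_counts[y][w] + alpha) / (total + alpha * len(vocab)))
            for w in vocab
        }
        model[y] = (math.log(n_docs / len(docs)), log_lik)
    return model

def predict_multinomial_nb(model, doc):
    def score(y):
        log_prior, log_lik = model[y]
        # Assumption: words never seen in training are ignored.
        return log_prior + sum(log_lik[w] for w in doc if w in log_lik)
    return max(model, key=score)

docs = [["cheap", "pills", "cheap"], ["meeting", "agenda"],
        ["cheap", "offer"], ["project", "meeting", "notes"]]
labels = ["spam", "ham", "spam", "ham"]
model = train_multinomial_nb(docs, labels)
print(predict_multinomial_nb(model, ["cheap", "cheap", "offer"]))
print(predict_multinomial_nb(model, ["meeting", "notes"]))
```

Because each occurrence of a word contributes its own term to the score, a word that appears twice counts twice, which is what distinguishes this variant from a presence-only model.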
Choose an option to check your answer.
A. All training points vote
B. The majority class is always predicted
C. The class of the single nearest training point is predicted
D. The closest feature is selected
Correct Answer: C. The class of the single nearest training point is predicted
Explanation:
One-nearest neighbor creates highly flexible local boundaries.
It can be sensitive to noise and mislabeled examples.
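The rule is short enough to write directly. This sketch assumes Euclidean distance and uses a made-up training set in which one point, marked in the comment, carries the wrong label.

```python
import math

def predict_1nn(train_points, train_labels, query):
    """Return the label of the single closest training point (Euclidean distance)."""
    nearest = min(range(len(train_points)),
                  key=lambda i: math.dist(train_points[i], query))
    return train_labels[nearest]

# Hypothetical data: a cluster of "A" near the origin and a cluster of "B" near (5, 5).
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
train_labels = ["A", "A", "A", "B", "B", "B", "B"]  # the last point is mislabeled

print(predict_1nn(train_points, train_labels, (0.0, 0.1)))  # nearest is (0, 0)
print(predict_1nn(train_points, train_labels, (0.6, 0.6)))  # nearest is (0.5, 0.5)
```

The second query sits in the middle of the "A" cluster, yet it inherits the label of the one mislabeled neighbor, which shows how a single bad example can flip a prediction.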
Choose an option to check your answer.
A. Alphabetically only
B. By ascending confidence
C. Randomly for every transaction
D. By descending global support
Correct Answer: D. By descending global support
Explanation:
A common global order allows transactions to share prefixes.
Descending frequency order usually yields more prefix sharing and a more compact tree, although it does not guarantee the smallest possible one.
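The reordering step can be sketched as follows. The function name `order_by_support` and the alphabetical tie-break are assumptions; any fixed tie-break works as long as it is applied to every transaction.

```python
from collections import Counter

def order_by_support(transactions):
    """Sort the items of each transaction by descending global support."""
    counts = Counter(item for t in transactions for item in set(t))
    # Ties are broken alphabetically so that the order is the same everywhere.
    return [sorted(set(t), key=lambda item: (-counts[item], item))
            for t in transactions]

transactions = [["c", "a", "b"], ["b", "a"], ["d", "a"], ["c", "b"]]
ordered = order_by_support(transactions)
print(ordered)
```

After reordering, the first three transactions all begin with "a", so they can share a single branch out of the root.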
Choose an option to check your answer.
A. The confidence of a rule ending at the node
B. The number of classes
C. The distance from the root
D. The number of transactions sharing that prefix up to the node
Correct Answer: D. The number of transactions sharing that prefix up to the node
Explanation:
Counts aggregate repeated transaction prefixes.
A node's count is at least the sum of its children's counts, because some transactions end at that node and do not continue farther.
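A stripped-down prefix tree shows where the counts come from. This sketch omits the header table and node links of a real FP-tree; the class name `FPNode`, the function `build_prefix_tree`, and the already-ordered transactions are assumptions for the example.

```python
class FPNode:
    """One tree node: an item, a count, and children keyed by item."""
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}

def build_prefix_tree(ordered_transactions):
    root = FPNode(None)
    for t in ordered_transactions:
        node = root
        for item in t:
            # Reuse the child if this prefix already exists, otherwise create it.
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1   # one more transaction passes through this prefix
    return root

# Items are assumed to be already sorted by descending support.
ordered = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
root = build_prefix_tree(ordered)
a = root.children["a"]
print(a.count, a.children["b"].count, a.children["b"].children["c"].count)
```

Three transactions start with "a", two of those continue with "b", and only one goes on to "c", so the counts shrink along the path.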
Choose an option to check your answer.
A. The accuracy at one fixed threshold
B. The proportion of positive cases in the sample
C. The calibration error only
D. The probability that a random positive receives a higher score than a random negative
Correct Answer: D. The probability that a random positive receives a higher score than a random negative
Explanation:
AUC measures ranking discrimination over all thresholds.
A value of 0.5 corresponds to random ranking.
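The probabilistic reading can be checked directly by comparing every positive score with every negative score. In this sketch the function name `auc_pairwise` and the data are illustrative, and ties are counted as half a win, which is the usual convention.

```python
def auc_pairwise(y_true, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # A tie between a positive and a negative counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores; higher means "more likely positive".
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_pairwise(y_true, scores)
print(round(auc, 3))
```

Here 8 of the 9 positive-negative pairs are ranked correctly; the only miss is the positive scored 0.6 against the negative scored 0.7.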
Choose an option to check your answer.
A. By requiring a linear equation only
B. By using no features
C. By sorting classes alphabetically
D. By combining multiple axis-aligned splits
Correct Answer: D. By combining multiple axis-aligned splits
Explanation:
A sequence of feature thresholds partitions the space into rectangular regions.
These regions can approximate complex nonlinear boundaries.
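A hand-written two-level tree makes the idea concrete. The example below uses an XOR-style layout, which no single straight line can separate; the function name `tree_predict` and the thresholds of 0.5 are hypothetical.

```python
def tree_predict(x1, x2):
    """Two levels of axis-aligned splits that carve the plane into four rectangles."""
    if x1 < 0.5:                         # first split: vertical line x1 = 0.5
        return 0 if x2 < 0.5 else 1      # second split on the left half
    return 1 if x2 < 0.5 else 0          # second split on the right half

# One sample point from each of the four rectangular regions.
for point in [(0.2, 0.2), (0.2, 0.8), (0.8, 0.2), (0.8, 0.8)]:
    print(point, tree_predict(*point))
```

Each test compares one feature with one threshold, yet the combination of tests reproduces a boundary that is not linear in the original features.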
Choose an option to check your answer.
A. Gaussian Naive Bayes
B. Ordinal Naive Bayes only
C. Hierarchical Naive Bayes
D. Bernoulli Naive Bayes
Correct Answer: D. Bernoulli Naive Bayes
Explanation:
Bernoulli Naive Bayes models each feature as a binary event.
It is useful when presence and absence both carry information.
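The role of absent features shows up clearly in a hand-rolled version. This is a minimal sketch with add-one smoothing; the function names, the two binary features (whether a message contains "free" and whether it contains "meeting"), and the six training rows are invented for the example.

```python
import math

def train_bernoulli_nb(X, y, alpha=1.0):
    """X: rows of 0/1 features. Returns {class: (prior, [P(feature = 1 | class)])}."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        probs = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(len(X[0]))]
        model[c] = (len(rows) / len(X), probs)
    return model

def predict_bernoulli_nb(model, x):
    def log_score(c):
        prior, probs = model[c]
        # A present feature contributes log p, an absent one contributes log (1 - p).
        return math.log(prior) + sum(
            math.log(p) if v else math.log(1 - p) for v, p in zip(x, probs))
    return max(model, key=log_score)

# Hypothetical features: [contains "free", contains "meeting"].
X = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]]
y = ["spam", "spam", "spam", "ham", "ham", "ham"]
model = train_bernoulli_nb(X, y)
print(predict_bernoulli_nb(model, [1, 0]))
print(predict_bernoulli_nb(model, [0, 0]))
```

The second query has no features switched on, and it is still classified, because the absence of "free" is itself evidence in favor of "ham".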
Choose an option to check your answer.
A. High bias and very smooth boundaries
B. No dependence on training noise
C. Guaranteed optimal accuracy
D. Low bias but high variance
Correct Answer: D. Low bias but high variance
Explanation:
Small neighborhoods closely follow local training detail.
This flexibility can overfit noise.
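One way to see the difference is to count how often the predicted class changes along a line for a small and a larger neighborhood. The sketch below is one-dimensional and uses invented data with two noisy labels; the helper names `knn_predict` and `count_flips` are assumptions.

```python
from collections import Counter

def knn_predict(xs, ys, query, k):
    """Majority vote among the k training points closest to query (1-D)."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))[:k]
    return Counter(ys[i] for i in nearest).most_common(1)[0][0]

def count_flips(preds):
    """Number of times the predicted class changes between neighboring queries."""
    return sum(1 for a, b in zip(preds, preds[1:]) if a != b)

# Hypothetical data: class changes near x = 5.5, with noisy labels at x = 2 and x = 9.
xs = list(range(1, 11))
ys = [0, 1, 0, 0, 0, 1, 1, 1, 0, 1]
queries = [x + 0.1 for x in xs]   # small offset avoids distance ties

flips_k1 = count_flips([knn_predict(xs, ys, q, k=1) for q in queries])
flips_k5 = count_flips([knn_predict(xs, ys, q, k=5) for q in queries])
print(flips_k1, flips_k5)
```

With k = 1 the prediction follows every noisy label, while k = 5 averages them away and leaves a single change of class near the real boundary.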