Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. All rejected candidates
B. Only low-confidence rules
C. The frequent itemsets from the current level
D. The class confusion matrix
Correct Answer: C. The frequent itemsets from the current level
Explanation:
The frequent k-itemsets L_k are joined with themselves to form the candidate (k+1)-itemsets C_{k+1}.
Infrequent itemsets are not extended, because by the Apriori property none of their supersets can be frequent.
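The join and prune steps can be sketched as follows. This is a minimal illustration rather than a full Apriori implementation; the function name `generate_candidates` and the sample itemsets are hypothetical, and itemsets are assumed to be stored as sorted tuples.

```python
from itertools import combinations

def generate_candidates(frequent_k):
    """Join frequent k-itemsets that share their first k-1 items,
    then prune candidates that contain an infrequent k-subset."""
    frequent = set(frequent_k)
    ordered = sorted(frequent)
    candidates = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if a[:-1] != b[:-1]:
                break  # sorted order: later itemsets cannot share the prefix
            candidate = a + (b[-1],)
            # Prune step: every k-subset of the candidate must be frequent.
            if all(sub in frequent for sub in combinations(candidate, len(a))):
                candidates.append(candidate)
    return candidates

# Hypothetical frequent 2-itemsets.
L2 = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
C3 = generate_candidates(L2)
print(C3)
```

Here ("B", "C", "D") is produced by the join but pruned, because its subset ("C", "D") is not frequent.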
Choose an option to check your answer.
A. It changes item supports
B. It makes all rules causal
C. Consistent ordering enables maximum prefix sharing
D. It creates target labels
Correct Answer: C. Consistent ordering enables maximum prefix sharing
Explanation:
Transactions with the same frequent items align on common initial paths.
Inconsistent ordering would fragment the representation.
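The effect of ordering on prefix sharing can be seen with a simplified prefix tree. The sketch below uses nested dictionaries and omits the counts and header links of a real FP-tree; the helper `count_tree_nodes` and the transactions are made up for illustration.

```python
from collections import Counter

def count_tree_nodes(transactions):
    """Insert each transaction into a prefix tree and count the nodes created."""
    root = {}
    nodes = 0
    for items in transactions:
        node = root
        for item in items:
            if item not in node:
                node[item] = {}
                nodes += 1
            node = node[item]
    return nodes

# Hypothetical transactions in arbitrary item order.
transactions = [["b", "a", "c"], ["a", "c", "b"], ["c", "a"], ["a", "b"]]
support = Counter(item for t in transactions for item in t)

# Reorder every transaction by descending support (ties broken alphabetically).
ordered = [sorted(t, key=lambda i: (-support[i], i)) for t in transactions]

unordered_nodes = count_tree_nodes(transactions)
ordered_nodes = count_tree_nodes(ordered)
print(unordered_nodes, ordered_nodes)
```

With a consistent order, all four transactions start with the same item and share one path prefix, so the tree needs fewer nodes.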
Choose an option to check your answer.
A. The proportion of predicted positives that are correct
B. The proportion of actual negatives identified
C. The proportion of actual positives correctly identified
D. The total number of positive predictions
Correct Answer: C. The proportion of actual positives correctly identified
Explanation:
Recall is also called sensitivity or true positive rate.
It is important when missing a positive case is costly.
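As a small illustration, recall can be computed next to precision (option A) to show how the two differ. The function name and the label vectors below are hypothetical.

```python
def recall_and_precision(y_true, y_pred, positive=1):
    """Return (recall, precision) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

# Hypothetical labels: four actual positives, only two of them detected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]
recall, precision = recall_and_precision(y_true, y_pred)
print(recall, precision)
```

Every positive prediction is correct, so precision is perfect, yet half of the actual positives are missed and recall is only 0.5.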
Choose an option to check your answer.
A. They have no parameters
B. They always contain one node
C. Predictions can be traced through explicit feature-based rules
D. They use only binary data
Correct Answer: C. Predictions can be traced through explicit feature-based rules
Explanation:
A root-to-leaf path forms an understandable if-then rule.
This transparency supports explanation and auditing.
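A root-to-leaf path can be turned into a readable rule as sketched below. The tree, its feature names, and its thresholds are invented for illustration, and the nested-dictionary layout is just one possible representation.

```python
# Hypothetical hand-built tree: internal nodes test "feature <= threshold".
tree = {
    "feature": "income", "threshold": 50,
    "left": {"label": "reject"},
    "right": {
        "feature": "debt", "threshold": 20,
        "left": {"label": "approve"},
        "right": {"label": "reject"},
    },
}

def explain(node, sample):
    """Follow the sample from root to leaf and return the path as an if-then rule."""
    conditions = []
    while "label" not in node:
        name, threshold = node["feature"], node["threshold"]
        if sample[name] <= threshold:
            conditions.append(f"{name} <= {threshold}")
            node = node["left"]
        else:
            conditions.append(f"{name} > {threshold}")
            node = node["right"]
    return "IF " + " AND ".join(conditions) + f" THEN {node['label']}"

rule = explain(tree, {"income": 72, "debt": 10})
print(rule)
```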
Choose an option to check your answer.
A. The probability of a feature after observing the class only
B. The distance to the nearest neighbor
C. The probability of a class before considering the current features
D. The margin width
Correct Answer: C. The probability of a class before considering the current features
Explanation:
Class priors often come from class frequencies in training data.
They represent baseline class prevalence.
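Estimating priors from class frequencies takes only a few lines. The labels below are hypothetical, and the sketch applies no smoothing.

```python
from collections import Counter

def class_priors(labels):
    """Estimate each class prior as its relative frequency in the training labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical training labels.
priors = class_priors(["spam", "ham", "ham", "ham"])
print(priors)
```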
Choose an option to check your answer.
A. Set the entire posterior to zero
B. Replace every feature with the class prior
C. Exclude that feature's likelihood contribution
D. Choose a random class
Correct Answer: C. Exclude that feature's likelihood contribution
Explanation:
Under the conditional independence assumption, the likelihoods of the features that were observed can still be combined.
This avoids imputing a value solely for prediction.
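A minimal sketch of this idea follows, assuming a missing feature is encoded as `None` and that all likelihoods are nonzero (for example, after smoothing). The probability tables and feature names are made up for illustration.

```python
import math

def log_scores(priors, likelihoods, sample):
    """Unnormalized log-posterior per class, skipping features whose value is None."""
    scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)
        for feature, value in sample.items():
            if value is None:
                continue  # missing feature: leave out its likelihood factor
            score += math.log(likelihoods[cls][feature][value])
        scores[cls] = score
    return scores

# Hypothetical model parameters.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"has_link": {True: 0.8, False: 0.2}, "all_caps": {True: 0.5, False: 0.5}},
    "ham": {"has_link": {True: 0.2, False: 0.8}, "all_caps": {True: 0.1, False: 0.9}},
}

scores = log_scores(priors, likelihoods, {"has_link": True, "all_caps": None})
predicted = max(scores, key=scores.get)
print(predicted)
```

Only `has_link` contributes here, so the comparison is 0.4 × 0.8 for spam against 0.6 × 0.2 for ham.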
Choose an option to check your answer.
A. Training accuracy versus test accuracy
B. Bias versus label cost
C. Tree depth versus margin size
D. Pattern coverage versus computational and interpretive complexity
Correct Answer: D. Pattern coverage versus computational and interpretive complexity
Explanation:
Lower support finds more and rarer patterns but increases workload and output volume.
Higher support is efficient but may overlook important associations.
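The trade-off can be observed by counting frequent itemsets at two thresholds. The sketch below enumerates all itemsets by brute force, which is only practical for toy data; the transactions and thresholds are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support (brute force)."""
    items = sorted({item for t in transactions for item in t})
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            count = sum(1 for t in transactions if set(itemset) <= t)
            if count / n >= min_support:
                result[itemset] = count / n
    return result

# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

high = frequent_itemsets(transactions, min_support=0.5)
low = frequent_itemsets(transactions, min_support=0.25)
print(len(high), len(low))
```

Lowering the threshold from 0.5 to 0.25 admits the rarer itemsets, including the full three-item set, at the cost of a longer output to compute and interpret.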
Choose an option to check your answer.
A. One-nearest neighbor
B. Naive Bayes
C. Linear regression
D. FP-growth
Correct Answer: D. FP-growth
Explanation:
FP-growth avoids generating all candidate combinations explicitly.
Its compressed tree is often more efficient for dense transactional data.
Choose an option to check your answer.
A. The proportion of actual positives identified
B. The proportion of predicted negatives that are correct
C. The total classification accuracy
D. The proportion of actual negatives correctly identified
Correct Answer: D. The proportion of actual negatives correctly identified
Explanation:
Specificity is the true negative rate.
It measures how well the classifier avoids false alarms among negatives.
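A short sketch of the computation, with hypothetical labels, also shows that the false positive rate is the complement of specificity.

```python
def specificity(y_true, y_pred, negative=0):
    """True negative rate: TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == negative and p == negative)
    fp = sum(1 for t, p in pairs if t == negative and p != negative)
    return tn / (tn + fp) if tn + fp else 0.0

# Hypothetical labels: four actual negatives, one raised as a false alarm.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

spec = specificity(y_true, y_pred)
false_positive_rate = 1 - spec
print(spec, false_positive_rate)
```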
Choose an option to check your answer.
A. Adding every possible feature split
B. Converting the tree to a graph database
C. Deleting the target class
D. Removing branches that add complexity without sufficient generalization benefit
Correct Answer: D. Removing branches that add complexity without sufficient generalization benefit
Explanation:
Pruning reduces overfitting by simplifying weak or noisy branches.
It can improve test performance and interpretability.
Choose an option to check your answer.
A. The probability that the model is correct
B. The overall classification accuracy
C. The number of training examples
D. The probability of the observed features given a class
Correct Answer: D. The probability of the observed features given a class
Explanation:
Likelihood measures how compatible an observation's features are with a class.
By Bayes' rule, it is multiplied by the prior and normalized to give the posterior.
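The combination of prior and likelihood can be sketched numerically. The probabilities below are invented for illustration.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {cls: priors[cls] * likelihoods[cls] for cls in priors}
    evidence = sum(unnormalized.values())
    return {cls: value / evidence for cls, value in unnormalized.items()}

# Hypothetical prior P(class) and likelihood P(observed features | class).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {"spam": 0.8, "ham": 0.2}

post = posterior(priors, likelihoods)
print(post)
```

Although ham is the more common class a priori, the observed features are far more compatible with spam, so spam ends up with the higher posterior.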
Choose an option to check your answer.
A. The model cannot predict classes
B. The model needs no training data
C. The model uses too many parameters
D. Interactions among features are not represented directly
Correct Answer: D. Interactions among features are not represented directly
Explanation:
When feature combinations carry information beyond individual effects, naive factorization can miss it.
Feature engineering or more flexible models may perform better.
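An XOR-style dataset makes this limitation concrete: the label depends only on the combination of two features, so each feature alone carries no information. The sketch below is a bare-bones categorical Naive Bayes without smoothing, written for this illustration only.

```python
from collections import Counter

# XOR data: the label is 1 exactly when the two binary features differ.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def naive_bayes_posterior(data, x):
    """Posterior over classes using per-feature frequency estimates."""
    class_counts = Counter(label for _, label in data)
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / len(data)  # class prior
        for j, value in enumerate(x):
            matches = sum(1 for feats, y in data if y == cls and feats[j] == value)
            score *= matches / n_cls  # P(feature j = value | class)
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

posteriors = {x: naive_bayes_posterior(data, x) for x, _ in data}
for x, p in posteriors.items():
    print(x, p)
```

Every input receives a posterior of 0.5 for each class, because the per-feature likelihoods are identical across classes. Adding an engineered feature such as the product or XOR of the two inputs would give the model something to separate on.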