Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. Training the first candidate model
B. Removing all outliers
C. Drawing a summary histogram
D. Integrating a validated model into an operational process
Correct Answer: D. Integrating a validated model into an operational process
Explanation:
Deployment makes model outputs available for real decisions or automated actions.
It also requires monitoring, maintenance, and governance.
Choose an option to check your answer.
A. Values between zero and one in every sample
B. Only positive integer values
C. Equal category frequencies
D. Values centered at zero with unit standard deviation
Correct Answer: D. Values centered at zero with unit standard deviation
Explanation:
Each value is transformed by subtracting the mean and dividing by the standard deviation.
This supports comparison across variables with different units.
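As a minimal sketch of the transformation described above, the snippet below computes z-scores with the standard library. The function name `standardize` and the sample values are illustrative assumptions, and the population standard deviation is used here as one common choice.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in values]


# Made-up sample values, for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z_scores = standardize(heights_cm)

# After the transformation the values have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(round(mean(z_scores), 6), round(pstdev(z_scores), 6))
```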
Choose an option to check your answer.
A. To guarantee every feature is normal
B. To increase test-set size
C. To make labels unnecessary
D. To prevent information from the test set leaking into model development
Correct Answer: D. To prevent information from the test set leaking into model development
Explanation:
Means, scales, and imputation values estimated from all data expose the model to test information.
A proper pipeline fits these steps only on training data.
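The idea can be sketched in a few lines, assuming a single numeric feature and a simple z-score scaler. The helpers `fit_scaler` and `apply_scaler` and the data are hypothetical, not part of any particular library.

```python
from statistics import mean, pstdev


def fit_scaler(train_values):
    """Estimate scaling parameters from the training data only."""
    return mean(train_values), pstdev(train_values)


def apply_scaler(values, mu, sigma):
    """Apply previously fitted parameters to any split."""
    return [(x - mu) / sigma for x in values]


# Made-up splits, for illustration only.
train = [10.0, 12.0, 14.0, 16.0]
test = [40.0, 44.0]

# Correct: parameters come from the training split and are reused on test.
mu, sigma = fit_scaler(train)
train_scaled = apply_scaler(train, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)

# Leaky: a mean estimated from all data already reflects the test values.
leaky_mu = mean(train + test)
print(mu, leaky_mu)
```

The gap between `mu` and `leaky_mu` is the test-set information that would otherwise reach the model during development.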
Choose an option to check your answer.
A. The support of X with the support of Y only
B. The number of rules with the number of items
C. The variance of transaction sizes
D. The expected frequency of X without Y under independence with the observed frequency
Correct Answer: D. The expected frequency of X without Y under independence with the observed frequency
Explanation:
Conviction focuses on rule violations, that is, transactions where X occurs without Y.
Values above 1 indicate fewer violations than expected under independence.
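A small sketch of this ratio, assuming transactions are represented as Python sets. The function name `conviction` and the basket data are illustrative.

```python
def conviction(transactions, x, y):
    """Expected frequency of 'X without Y' under independence / observed frequency."""
    n = len(transactions)
    x, y = set(x), set(y)
    count_x = sum(1 for t in transactions if x <= t)
    count_y = sum(1 for t in transactions if y <= t)
    count_x_not_y = sum(1 for t in transactions if x <= t and not (y <= t))
    expected = (count_x / n) * (1 - count_y / n)  # P(X) * P(not Y)
    observed = count_x_not_y / n                  # P(X and not Y)
    if observed == 0:
        return float("inf")  # the rule is never violated in this data
    return expected / observed


# Made-up transactions, for illustration only.
baskets = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread"},
    {"milk"}, {"eggs"}, {"eggs"}, {"eggs"},
]
value = conviction(baskets, {"bread"}, {"milk"})
print(value)
```

Here bread appears without milk in 1 of 8 baskets, while independence would predict 0.5 × 0.5 = 0.25, so the ratio is 2.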
Choose an option to check your answer.
A. Removing items always decreases support
B. Support and confidence are identical
C. All itemsets have positive lift
D. Adding items to an itemset cannot increase its support
Correct Answer: D. Adding items to an itemset cannot increase its support
Explanation:
A transaction containing a larger itemset must contain all its subsets.
Therefore a superset's support is never greater than the support of any of its subsets.
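The property can be checked directly on a toy dataset. This is a sketch assuming set-valued transactions; the `support` helper and the data are illustrative.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


# Made-up transactions, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread"},
    {"milk", "eggs"},
]

s_bread = support(transactions, {"bread"})
s_bread_milk = support(transactions, {"bread", "milk"})
s_all = support(transactions, {"bread", "milk", "eggs"})

# Each added item can only keep support the same or lower it.
print(s_bread, s_bread_milk, s_all)
```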
Choose an option to check your answer.
A. Changing item names during mining
B. Recalculating only confidence
C. Counting one-itemsets after long itemsets
D. Introducing and counting new candidates before completing a full database pass
Correct Answer: D. Introducing and counting new candidates before completing a full database pass
Explanation:
Dynamic itemset counting (DIC) reduces the rigid level-by-level waiting of standard Apriori.
Candidates can enter the process at checkpoints within a scan.
Choose an option to check your answer.
A. Data distributions and relationships may change over time
B. The training data become larger automatically
C. Model parameters cannot be saved
D. Monitoring increases the number of labels
Correct Answer: A. Data distributions and relationships may change over time
Explanation:
Concept drift and data drift can reduce model performance after deployment.
Monitoring helps detect degradation and trigger retraining or review.
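One very simple monitoring check is sketched below: it compares the mean of a feature in recent production data against a training-time reference. The function names, the threshold of 2 standard deviations, and the data are all assumptions for illustration; real monitoring typically uses richer statistics.

```python
from statistics import mean, pstdev


def mean_shift_in_sd(reference, recent):
    """Shift of the recent mean, measured in reference standard deviations."""
    sigma = pstdev(reference)
    if sigma == 0:
        raise ValueError("reference data has zero spread")
    return abs(mean(recent) - mean(reference)) / sigma


def needs_review(reference, recent, threshold=2.0):
    """Flag the feature when the shift exceeds a chosen (assumed) threshold."""
    return mean_shift_in_sd(reference, recent) > threshold


# Made-up feature values, for illustration only.
reference = [48.0, 50.0, 52.0, 49.0, 51.0]  # seen at training time
stable = [49.0, 51.0, 50.0]                 # recent data, similar distribution
shifted = [60.0, 62.0, 61.0]                # recent data after a data drift

print(needs_review(reference, stable), needs_review(reference, shifted))
```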
Choose an option to check your answer.
A. Distance calculations are otherwise dominated by large-scale features
B. KNN requires binary attributes only
C. Standardization creates labels
D. KNN cannot handle positive values
Correct Answer: A. Distance calculations are otherwise dominated by large-scale features
Explanation:
KNN bases predictions on proximity in feature space.
Unscaled variables with large numerical ranges can overwhelm other features.
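The effect can be seen in a toy nearest-neighbor lookup, sketched below with Euclidean distance and made-up (age, income) points. All names and values are illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, points):
    """Index of the point closest to the query."""
    return min(range(len(points)), key=lambda i: euclidean(query, points[i]))


def column_stats(rows):
    """Per-feature (mean, standard deviation), estimated from the given rows."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]


def scale(row, stats):
    return [(v - mu) / sd for v, (mu, sd) in zip(row, stats)]


# Made-up (age, income) training points, for illustration only.
train = [(60, 50500), (26, 51000), (40, 80000), (50, 20000)]
query = (25, 50000)

raw_neighbor = nearest(query, train)  # income differences dominate the distance

stats = column_stats(train)           # fitted on the training points only
scaled_train = [scale(p, stats) for p in train]
scaled_neighbor = nearest(scale(query, stats), scaled_train)

print(raw_neighbor, scaled_neighbor)
```

On the raw values the 60-year-old with a similar income is chosen; after standardization the 26-year-old becomes the nearest neighbor, because the age difference now counts on a comparable scale.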
Choose an option to check your answer.
A. A collection of one or more items
B. A sequence of class labels
C. A set of cluster centroids
D. A list of numeric predictions
Correct Answer: A. A collection of one or more items
Explanation:
An itemset represents items considered together in transactional data.
For example, {bread, milk} is a two-itemset.
Choose an option to check your answer.
A. Lift
B. Confidence
C. Conviction
D. Rule direction
Correct Answer: A. Lift
Explanation:
Lift is the joint support of X and Y divided by the product of their marginal supports.
Reversing X and Y leaves this value unchanged.
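The symmetry is easy to verify on toy data, and it contrasts with confidence, which does depend on rule direction. This is a sketch with illustrative helper names and made-up transactions.

```python
def support(transactions, itemset):
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(transactions, x, y):
    """Joint support divided by the product of the marginal supports."""
    joint = support(transactions, set(x) | set(y))
    return joint / (support(transactions, x) * support(transactions, y))


def confidence(transactions, x, y):
    """Estimated P(Y | X); unlike lift, this depends on rule direction."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


# Made-up transactions, for illustration only.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"}, {"milk"},
]

lift_xy = lift(transactions, {"bread"}, {"milk"})
lift_yx = lift(transactions, {"milk"}, {"bread"})
conf_xy = confidence(transactions, {"bread"}, {"milk"})
conf_yx = confidence(transactions, {"milk"}, {"bread"})

print(lift_xy == lift_yx, conf_xy == conf_yx)
```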
Choose an option to check your answer.
A. It prunes all supersets containing that itemset
B. It generates every possible superset
C. It converts it to a class label
D. It raises its support count
Correct Answer: A. It prunes all supersets containing that itemset
Explanation:
By anti-monotonicity, no superset of an infrequent set can be frequent.
Pruning avoids unnecessary support counting.
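The pruning step can be sketched as a subset check on candidate itemsets before any support counting. The function names and the example frequent 2-itemsets are assumptions for illustration; this shows only the prune step, not a full Apriori implementation.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    """True if any (k-1)-subset of a k-item candidate is not known to be frequent."""
    k = len(candidate)
    return any(
        frozenset(sub) not in frequent_prev
        for sub in combinations(candidate, k - 1)
    )


def prune(candidates, frequent_prev):
    """Keep only candidates whose (k-1)-subsets are all frequent."""
    return [c for c in candidates if not has_infrequent_subset(c, frequent_prev)]


# Hypothetical frequent 2-itemsets; {milk, jam} is absent, so it is infrequent.
frequent_2 = {
    frozenset(pair)
    for pair in [("bread", "milk"), ("bread", "eggs"),
                 ("milk", "eggs"), ("bread", "jam")]
}
candidates_3 = [
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "milk", "jam"}),  # contains the infrequent {milk, jam}
]

kept = prune(candidates_3, frequent_2)
print(kept)
```

Only the first candidate survives, so the second never needs a pass over the database to count its support.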
Choose an option to check your answer.
A. A change in file format only
B. A change over time in the relationship between predictors and the target
C. A reduction in storage capacity
D. A random change in class names
Correct Answer: B. A change over time in the relationship between predictors and the target
Explanation:
When the predictive relationship changes, an old model may no longer be valid.
Examples include evolving fraud strategies or customer behavior.
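A toy illustration of the fraud example, assuming a hypothetical rule-based model learned when fraud was associated with large amounts. The rule, the threshold of 1000, and the records are all made up.

```python
def old_model(amount):
    """Hypothetical rule learned from the earlier period: large amounts are fraud."""
    return amount > 1000


def accuracy(records):
    """Share of (amount, is_fraud) records the old rule classifies correctly."""
    hits = sum(1 for amount, is_fraud in records if old_model(amount) == is_fraud)
    return hits / len(records)


# Made-up labelled transactions, for illustration only.
earlier = [(1500, True), (200, False), (1800, True), (90, False)]
# Later, fraud moves to small amounts: the amount-to-fraud relationship has changed.
later = [(150, True), (200, False), (120, True), (1700, False)]

print(accuracy(earlier), accuracy(later))
```

The inputs look similar in both periods, but the same rule performs much worse on the later records because the relationship it encoded no longer holds.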