MCQ Collection
Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Correct Answer: D. Combining frequent itemsets that share the required prefix to form larger candidates
Explanation:
Compatible (k-1)-itemsets are united to create k-itemset candidates.
Prefix ordering prevents many duplicates.
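The join step described above can be sketched as follows. This is a minimal illustration, assuming each itemset is stored as a sorted tuple; the function name `apriori_join` and the sample itemsets are hypothetical and not taken from the question.

```python
def apriori_join(frequent_prev):
    """Join (k-1)-itemsets that agree on their first k-2 items.

    Assumes every itemset is a sorted tuple of comparable items.
    """
    prev = sorted(set(frequent_prev))
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:-1] == b[:-1]:
                # Same prefix: extend a with the last item of b.
                candidates.append(a + (b[-1],))
            else:
                # Sorted order keeps equal prefixes adjacent, so stop early.
                break
    return candidates


# Hypothetical frequent 2-itemsets.
frequent_2 = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
candidates_3 = apriori_join(frequent_2)
print(candidates_3)  # [('A', 'B', 'C'), ('B', 'C', 'D')]
```

Because only itemsets with a shared prefix are joined, each candidate is produced once, rather than once for every pair of subsets that could form it.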
Correct Answer: A. An unavailable or unrecorded attribute value
Explanation:
Missingness means the true value is not present in the dataset.
It should not automatically be interpreted as zero.
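A small sketch of why a missing value should not be read as zero. The readings are made up for illustration, and `None` is assumed to mark a missing entry.

```python
from statistics import mean

# Hypothetical sensor readings; None marks a value that was never recorded.
readings = [20.0, None, 22.0, None, 24.0]

# Summarize only the values that were actually observed.
observed = [v for v in readings if v is not None]
mean_observed = mean(observed)

# Treating missing entries as zero drags the average down.
zero_filled = [0.0 if v is None else v for v in readings]
mean_zero_filled = mean(zero_filled)

print(mean_observed)     # 22.0
print(mean_zero_filled)  # 13.2
```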
Correct Answer: A. Replacing values in each bin with that bin's mean
Explanation:
This method reduces local noise by representing nearby values with a common summary.
It also sacrifices within-bin detail.
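Smoothing by bin means can be sketched as below. The example assumes equal-frequency bins over sorted values; the function name `smooth_by_bin_means` and the sample data are illustrative only.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size, and
    replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


# Illustrative data, three values per bin.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed_prices = smooth_by_bin_means(prices, 3)
print(smoothed_prices)
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

The output shows the trade-off: local fluctuations disappear, but so do the individual values inside each bin.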
Correct Answer: A. An itemset whose support meets or exceeds a minimum threshold
Explanation:
Frequency is determined by the user-defined minimum support.
Only qualifying itemsets are used to generate standard association rules.
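As a rough sketch, support can be computed as the fraction of transactions that contain the itemset. The transactions, the threshold, and the helper names `support` and `is_frequent` are assumptions made for this example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def is_frequent(itemset, transactions, min_support):
    """An itemset is frequent when its support reaches the threshold."""
    return support(itemset, transactions) >= min_support


# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
min_support = 0.5  # user-defined threshold

print(support({"bread", "milk"}, transactions))                   # 0.5
print(is_frequent({"bread", "milk"}, transactions, min_support))   # True
print(is_frequent({"milk", "butter"}, transactions, min_support))  # False
```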
Correct Answer: A. A maximal itemset has no frequent superset, while a closed itemset has no superset with equal support
Explanation:
Every maximal frequent itemset is closed, but not every closed itemset is maximal.
Closed sets retain more support information.
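The difference can be checked on a small table of frequent itemsets and their support counts. The counts below are hypothetical, and the helper names are chosen for this sketch. Checking only frequent supersets is sufficient for closedness here, because any superset with the same support as a frequent itemset is itself frequent.

```python
def closed_itemsets(frequent):
    """Frequent itemsets with no proper superset of equal support."""
    return {
        s for s, count in frequent.items()
        if not any(s < t and count == other for t, other in frequent.items())
    }


def maximal_itemsets(frequent):
    """Frequent itemsets with no frequent proper superset."""
    return {s for s in frequent if not any(s < t for t in frequent)}


# Hypothetical frequent itemsets mapped to their support counts.
frequent = {
    frozenset({"bread"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"butter"}): 2,
    frozenset({"bread", "milk"}): 2,
    frozenset({"bread", "butter"}): 2,
}

closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

print(maximal <= closed)                  # True: every maximal itemset is closed
print(frozenset({"bread"}) in closed)     # True: closed but not maximal
print(frozenset({"butter"}) in closed)    # False: {bread, butter} has equal support
```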
Correct Answer: A. Removing a candidate if any of its required subsets is infrequent
Explanation:
A candidate with an infrequent subset cannot be frequent.
This follows directly from the Apriori property.
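The prune step can be sketched as a subset check against the previous level. The itemsets are stored as sorted tuples, and both the data and the function name `has_infrequent_subset` are illustrative assumptions.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    """True if any (k-1)-subset of the k-item candidate is not frequent."""
    k = len(candidate)
    return any(
        subset not in frequent_prev
        for subset in combinations(candidate, k - 1)
    )


# Hypothetical frequent 2-itemsets and the 3-item candidates built from them.
frequent_2 = {("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")}
candidates_3 = [("A", "B", "C"), ("B", "C", "D")]

# ("B", "C", "D") is dropped because ("C", "D") is not frequent.
pruned = [c for c in candidates_3 if not has_infrequent_subset(c, frequent_2)]
print(pruned)  # [('A', 'B', 'C')]
```

Pruning happens before any support counting, so the discarded candidate never costs a pass over the data.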
Correct Answer: B. Missing completely at random
Explanation:
Under MCAR, the probability that a value is missing is unrelated to any observed or unobserved variable, so the complete records form an unbiased random subset of the data.
This is a strong assumption and often unrealistic.
Correct Answer: B. An observation that differs markedly from most other observations
Explanation:
Outliers may reflect error, rare events, or legitimate extremes.
They should be investigated rather than automatically removed.
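One way to flag candidates for investigation, rather than delete them, is the common 1.5 × IQR rule. This rule is an assumption of the sketch and is not named in the question; the data and variable names are also made up.

```python
from statistics import quantiles

# Hypothetical measurements with one suspicious value.
values = [10, 12, 11, 13, 12, 11, 95]

# Quartiles using the default (exclusive) method of statistics.quantiles.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Flag, do not drop: the flagged values are kept for review.
flagged = [v for v in values if v < lower_fence or v > upper_fence]
print(flagged)  # [95]
```

Whether 95 is a data-entry error or a genuine extreme cannot be decided from the numbers alone, which is why the sketch only flags it.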
Correct Answer: B. X → Y, where X and Y are disjoint itemsets
Explanation:
The rule states that transactions containing the antecedent X tend to also contain the consequent Y.
The two sides must not share items in a standard rule.
Correct Answer: B. To filter rules that do not predict the consequent reliably enough
Explanation:
Confidence estimates the conditional probability of Y given X: the fraction of transactions containing X that also contain Y.
Rules below the chosen threshold are excluded from the final set.
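A minimal sketch of filtering rules by minimum confidence, where confidence is the count of transactions containing both sides divided by the count containing the antecedent. The transactions, the candidate rules, and the threshold of 0.7 are all hypothetical.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)


def confidence(antecedent, consequent, transactions):
    """count(X and Y together) / count(X) for the rule X -> Y."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
rules = [({"bread"}, {"milk"}), ({"butter"}, {"bread"})]
min_confidence = 0.7  # user-defined threshold

kept = [
    (x, y) for x, y in rules
    if confidence(x, y, transactions) >= min_confidence
]
print(kept)  # [({'butter'}, {'bread'})]
```

Here bread → milk has confidence 2/3 and is excluded, while butter → bread has confidence 1.0 and is kept.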
Correct Answer: B. Support counts are collected separately for candidates of increasing size
Explanation:
Each level typically generates new candidates and counts their occurrences.
This scan-intensive behavior can be expensive for large datasets.
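The level-wise behavior can be sketched with a simplified Apriori-style loop that records one data pass per candidate size. This is an illustrative toy under stated assumptions (in-memory transactions, an absolute minimum count, a hypothetical function name `levelwise_frequent`), not a tuned implementation.

```python
from itertools import combinations


def levelwise_frequent(transactions, min_count):
    """Find frequent itemsets level by level and count the data passes."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent, scans, k = {}, 0, 1
    while candidates:
        scans += 1  # one pass over the transactions for this level
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Next level: unions of size k whose (k-1)-subsets are all frequent.
        candidates = list({
            a | b
            for a in level
            for b in level
            if len(a | b) == k
            and all(frozenset(s) in level for s in combinations(a | b, k - 1))
        })
    return frequent, scans


# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
frequent, scans = levelwise_frequent(transactions, min_count=2)
print(scans)          # 2
print(len(frequent))  # 5
```

Even on four transactions the data is read twice; on a large dataset with long frequent itemsets, the number of passes grows with the largest candidate size.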
Correct Answer: C. When the variable is strongly skewed or contains influential outliers
Explanation:
The mean can be unrepresentative for skewed data.
Mean imputation also reduces variance and can distort relationships.
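A short sketch comparing mean and median imputation on a skewed variable. The income figures are invented, and `None` is assumed to mark the missing entry.

```python
from statistics import mean, median, variance

# Hypothetical incomes (in thousands) with one extreme value and one gap.
incomes = [30, 32, 35, None, 38, 400]
observed = [v for v in incomes if v is not None]

fill_mean = mean(observed)      # pulled upward by the extreme value
fill_median = median(observed)  # stays near the bulk of the data
print(fill_mean, fill_median)   # 107 35

# Filling with the mean adds a point with zero deviation,
# so the sample variance shrinks.
mean_filled = [fill_mean if v is None else v for v in incomes]
print(variance(mean_filled) < variance(observed))  # True
```

With the median, the filled value resembles a typical record; with the mean, it matches none of the observed incomes.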