Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. A single support threshold may miss meaningful patterns involving infrequent items
B. Rare items always have high confidence
C. Infrequent items cannot appear in rules
D. Rare items make transactions identical
Correct Answer: A. A single support threshold may miss meaningful patterns involving infrequent items
Explanation:
Different items may have very different baseline frequencies.
Multiple minimum supports or specialized methods can better handle rare but important items.
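The effect of item-specific thresholds can be sketched as follows. This is a minimal illustration, not a full mining algorithm: the transactions, the threshold values in MIN_ITEM_SUPPORT, and the rule of holding an itemset to the lowest threshold among its items are all assumptions made for the example.

```python
# Toy transactions (hypothetical); "caviar" and "champagne" are rare items.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"caviar", "champagne"},
    {"caviar", "champagne", "bread"},
    {"milk"},
    {"bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

# Assumed per-item minimum supports (illustrative values only).
MIN_ITEM_SUPPORT = {
    "bread": 0.4, "milk": 0.4, "butter": 0.4,
    "caviar": 0.15, "champagne": 0.15,
}

def frequent_single_threshold(itemset, min_support=0.4):
    """One global threshold for every itemset."""
    return support(itemset, transactions) >= min_support

def frequent_multiple_thresholds(itemset):
    """Hold the itemset to the lowest threshold among its items (assumed rule)."""
    threshold = min(MIN_ITEM_SUPPORT[item] for item in itemset)
    return support(itemset, transactions) >= threshold

rare = {"caviar", "champagne"}               # support is 2/10 = 0.2
print(frequent_single_threshold(rare))       # False
print(frequent_multiple_thresholds(rare))    # True
```

With the single threshold of 0.4 the rare pair is discarded, while the per-item thresholds keep it without lowering the bar for common items.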
Choose an option to check your answer.
A. The set of candidate k-itemsets
B. The set of class labels
C. The clustering result at iteration k
D. The confidence of rule k
Correct Answer: A. The set of candidate k-itemsets
Explanation:
Ck contains itemsets being considered at level k.
After support filtering, qualifying sets form Lk.
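The filtering step from Ck to Lk can be sketched as below. The function name filter_candidates, the toy transactions, and the 0.5 threshold are assumptions chosen for illustration.

```python
from itertools import combinations

def filter_candidates(candidates, transactions, min_support):
    """Keep the candidates whose support meets min_support (Ck -> Lk)."""
    n = len(transactions)
    frequent = {}
    for candidate in candidates:
        count = sum(1 for t in transactions if candidate <= t)
        if count / n >= min_support:
            frequent[candidate] = count / n
    return frequent

# Hypothetical toy data.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "d"}),
]
items = sorted(set().union(*transactions))

# C2: every pair of items is a candidate 2-itemset in this small example.
C2 = [frozenset(pair) for pair in combinations(items, 2)]

# L2: the candidates that survive support filtering.
L2 = filter_candidates(C2, transactions, min_support=0.5)
print(sorted(sorted(s) for s in L2))   # [['a', 'b'], ['a', 'c']]
```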
Choose an option to check your answer.
A. Removing all duplicate features
B. Combining data from multiple sources into a consistent dataset
C. Sorting one table by a key
D. Training several classifiers
Correct Answer: B. Combining data from multiple sources into a consistent dataset
Explanation:
Integration reconciles schemas, identifiers, formats, and overlapping records.
It is essential when mining enterprise data from different systems.
Choose an option to check your answer.
A. The number of categories in one variable
B. Whether two variables tend to vary together
C. The probability of a class label
D. The amount of missing data
Correct Answer: B. Whether two variables tend to vary together
Explanation:
Positive covariance means variables tend to rise together, while negative covariance indicates opposite movement.
The magnitude of covariance depends on the measurement units, so it is hard to compare across variable pairs.
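The unit dependence is easy to see in a small sketch. The data are made up, and the helper below computes the sample covariance (dividing by n - 1), which is one common convention and an assumption here.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical measurements: height in metres, weight in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [50.0, 58.0, 66.0, 74.0]

# The same heights expressed in centimetres.
heights_cm = [h * 100 for h in heights_m]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)

# Both values are positive (the variables rise together), but changing
# the unit from metres to centimetres scales the covariance by 100.
print(cov_m, cov_cm)
```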
Choose an option to check your answer.
A. Confidence ignores the antecedent completely
B. A rule may have high confidence even when X adds little predictive information
C. Confidence can exceed one
D. Common consequents have zero support
Correct Answer: B. A rule may have high confidence even when X adds little predictive information
Explanation:
If Y occurs in nearly every transaction, many antecedents will appear to predict it.
Lift corrects for this by comparing the rule's confidence with Y's baseline support.
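A small sketch makes the point concrete. The items "gum" and "bag" and the transaction counts are invented for illustration; "bag" plays the role of a consequent that appears in almost every transaction.

```python
# Hypothetical data: "bag" appears in 9 of 10 transactions.
transactions = (
    [{"gum", "bag"}] * 4
    + [{"gum"}] * 1
    + [{"bag"}] * 5
)

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(x, y):
    """confidence(X -> Y) = support(X and Y together) / support(X)."""
    return support(x | y) / support(x)

def lift(x, y):
    """lift(X -> Y) = confidence(X -> Y) / support(Y)."""
    return confidence(x, y) / support(y)

x, y = {"gum"}, {"bag"}
print(round(confidence(x, y), 2))  # 0.8
print(round(support(y), 2))        # 0.9
print(round(lift(x, y), 2))        # 0.89
```

The rule gum → bag has 80% confidence, yet its lift is below 1: customers who buy gum are slightly less likely to buy a bag than customers overall.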
Choose an option to check your answer.
A. A rule contains no antecedent
B. Multiple rules convey essentially the same information
C. A rule has confidence above one
D. The dataset has duplicate rows only
Correct Answer: B. Multiple rules convey essentially the same information
Explanation:
Association mining can generate many overlapping rules from related itemsets.
Pruning and concise representations improve interpretability.
Choose an option to check your answer.
A. The loss of classifier k
B. The set of frequent k-itemsets
C. The list of all transactions
D. The lift of every rule
Correct Answer: B. The set of frequent k-itemsets
Explanation:
Lk consists of k-itemsets meeting minimum support.
It is used to generate candidates for the next level.
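How Lk feeds the next level can be sketched as follows. This is a simplified illustration of Apriori-style candidate generation (union of two frequent k-itemsets, then pruning by the requirement that every k-subset be frequent); the function name and toy itemsets are assumptions.

```python
from itertools import combinations

def generate_candidates(frequent_k):
    """Build candidate (k+1)-itemsets from a collection of frequent k-itemsets."""
    frequent_k = set(frequent_k)
    if not frequent_k:
        return set()
    k = len(next(iter(frequent_k)))
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            if len(union) != k + 1:
                continue
            # Prune: every k-subset of a candidate must itself be frequent.
            if all(frozenset(s) in frequent_k for s in combinations(union, k)):
                candidates.add(union)
    return candidates

# Hypothetical L2.
L2 = {
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"b", "d"}),
}

C3 = generate_candidates(L2)
print([sorted(c) for c in C3])   # [['a', 'b', 'c']]
```

Here {a, b, d} and {b, c, d} are pruned because {a, d} and {c, d} are not in L2.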
Choose an option to check your answer.
A. Estimating a regression coefficient
B. Assigning cluster centroids
C. Determining which records refer to the same real-world entity
D. Scaling variables to unit variance
Correct Answer: C. Determining which records refer to the same real-world entity
Explanation:
Names, addresses, and identifiers may vary across sources.
Entity resolution links duplicate or related records correctly.
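A very simple form of this linking can be sketched with normalized match keys. Real entity resolution usually needs fuzzy matching and blocking; the records, field names, and abbreviation table below are hypothetical and only show the idea.

```python
import re
from collections import defaultdict

# Assumed abbreviation table, for illustration only.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}

def normalize(text):
    """Lower-case, drop punctuation, collapse spaces, expand abbreviations."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)

def resolve(records):
    """Group record ids whose normalized name and address are identical."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["address"]))
        groups[key].append(rec["id"])
    return list(groups.values())

# Hypothetical records from two source systems.
records = [
    {"id": "crm-1", "name": "Ana Silva", "address": "12 Main St."},
    {"id": "web-7", "name": "ana  silva", "address": "12 main street"},
    {"id": "crm-2", "name": "Ben Okoye", "address": "4 Lake Rd"},
]

print(resolve(records))   # [['crm-1', 'web-7'], ['crm-2']]
```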
Choose an option to check your answer.
A. Correlation always proves causation
B. Correlation ignores direction
C. Correlation is standardized and unitless
D. Correlation can only be positive
Correct Answer: C. Correlation is standardized and unitless
Explanation:
Correlation scales covariance by the variables' standard deviations.
This bounds it between -1 and 1, so values can be compared across variable pairs regardless of units.
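The scaling can be checked with a short sketch. The data are invented, and the helpers use the sample (n - 1) convention, which is an assumption; the correlation itself is the same under either convention.

```python
import math

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)

# Hypothetical measurements.
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [50.0, 60.0, 63.0, 75.0]
heights_cm = [h * 100 for h in heights_m]

r_m = correlation(heights_m, weights_kg)
r_cm = correlation(heights_cm, weights_kg)

# Changing the unit of height leaves the correlation unchanged
# (up to floating-point rounding), and the value stays within [-1, 1].
print(r_m, r_cm)
```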
Choose an option to check your answer.
A. support(X) plus support(Y)
B. confidence(Y → X) divided by support(X)
C. confidence(X → Y) divided by support(Y)
D. support(X ∪ Y) divided by confidence
Correct Answer: C. confidence(X → Y) divided by support(Y)
Explanation:
Lift compares the observed conditional occurrence of Y with its overall frequency.
It therefore evaluates whether X and Y co-occur more or less than expected under independence.
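Written out, with support(X ∪ Y) denoting the fraction of transactions that contain every item of both X and Y, the definition expands as follows (a short derivation using only the standard definition of confidence):

```latex
\[
\operatorname{lift}(X \to Y)
  = \frac{\operatorname{confidence}(X \to Y)}{\operatorname{support}(Y)}
  = \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)\,\operatorname{support}(Y)}
\]
\[
\text{If } X \text{ and } Y \text{ are independent, }
\operatorname{support}(X \cup Y) = \operatorname{support}(X)\,\operatorname{support}(Y)
\;\Longrightarrow\; \operatorname{lift}(X \to Y) = 1 .
\]
```

A lift above 1 therefore indicates that X and Y co-occur more often than independence would predict, and a lift below 1 indicates that they co-occur less often.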
Choose an option to check your answer.
A. Rules contain no items
B. Support is always zero
C. Co-occurrence can arise from confounding, common popularity, or chance
D. Confidence measures randomized experiments
Correct Answer: C. Co-occurrence can arise from confounding, common popularity, or chance
Explanation:
Association measures observational dependence rather than intervention effects.
Business context and causal analysis are needed before claiming causation.
Choose an option to check your answer.
A. It increases every item's support
B. It makes transactions continuous
C. It avoids duplicate candidates and enables systematic joining
D. It guarantees high confidence
Correct Answer: C. It avoids duplicate candidates and enables systematic joining
Explanation:
A consistent lexicographic order gives itemsets a canonical form.
This simplifies joins, subset checks, and duplicate control.
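The benefit of a canonical order can be sketched with a prefix join on sorted tuples. This is an illustrative version of the join step only (no support counting or pruning); the function name and the toy itemsets are assumptions.

```python
def join_sorted(frequent_k):
    """Join sorted k-tuples that share their first k-1 items.

    Each itemset is a tuple in lexicographic order, so every candidate
    (k+1)-itemset is produced at most once, already in canonical form.
    """
    frequent_k = sorted(frequent_k)
    candidates = []
    for i, a in enumerate(frequent_k):
        for b in frequent_k[i + 1:]:
            if a[:-1] == b[:-1]:
                candidates.append(a + (b[-1],))
            else:
                # Tuples with a common prefix are adjacent in sorted
                # order, so no later tuple can match this prefix.
                break
    return candidates

# Hypothetical frequent 2-itemsets, each stored as a sorted tuple.
L2 = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]

print(join_sorted(L2))   # [('a', 'b', 'c'), ('b', 'c', 'd')]
```

Taking unions of arbitrary pairs instead would produce {a, b, c} three times (from ab+ac, ab+bc, and ac+bc), which then has to be deduplicated.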