MCQ Collection
Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. The median
B. The interquartile range
C. The arithmetic mean
D. The mode
Correct Answer: C. The arithmetic mean
Explanation:
The mean uses the magnitude of every observation.
A few extreme values can shift it substantially.
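The effect described in this explanation can be checked with a small sketch. The sample values below are hypothetical and chosen only for illustration; the point is how much each summary moves when one extreme value is appended.

```python
from statistics import mean, median

# Hypothetical sample values, chosen only for illustration.
values = [10, 12, 11, 13, 12]
with_outlier = values + [500]  # append a single extreme observation

# How far each measure of center moves after adding the outlier.
mean_shift = mean(with_outlier) - mean(values)
median_shift = median(with_outlier) - median(values)

print(f"mean shift:   {mean_shift:.1f}")
print(f"median shift: {median_shift:.1f}")
```

In this toy sample the mean moves by more than 80 units while the median does not move at all.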
Choose an option to check your answer.
A. The support of X minus Y
B. The confidence of Y → X
C. The support of X ∪ Y
D. The number of items in X
Correct Answer: C. The support of X ∪ Y
Explanation:
A rule is supported by transactions containing both antecedent and consequent.
Thus its support is the frequency of their union.
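As a minimal sketch of this definition, the support of a rule X → Y can be computed as the fraction of transactions that contain every item of X ∪ Y. The transactions, item names, and the `support` helper below are hypothetical and exist only for this example.

```python
# Hypothetical toy transactions; item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

X = {"bread"}
Y = {"milk"}

# Support of the rule X -> Y is the support of the combined itemset X ∪ Y.
rule_support = support(X | Y, transactions)
print(rule_support)
```

Three of the five transactions contain both bread and milk, so the rule's support here is 3/5.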
Choose an option to check your answer.
A. It guarantees no itemsets are found
B. It makes confidence undefined for every rule
C. It can produce an enormous number of patterns, many of limited value
D. It removes rare patterns
Correct Answer: C. It can produce an enormous number of patterns, many of limited value
Explanation:
Lower support allows many combinations to qualify as frequent.
This increases computation and the burden of interpreting results.
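The growth in pattern count can be illustrated with a brute-force sketch that counts frequent itemsets at two thresholds. This is not Apriori or any library routine; the `count_frequent` helper and the transactions are hypothetical, and exhaustive enumeration is only practical for tiny item sets like this one.

```python
from itertools import combinations

# Hypothetical toy transactions over four items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]

def count_frequent(transactions, min_support):
    """Count itemsets whose support is at least min_support (brute force)."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    count = 0
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            hits = sum(1 for t in transactions if set(candidate) <= t)
            if hits / n >= min_support:
                count += 1
    return count

high_threshold = count_frequent(transactions, 0.6)
low_threshold = count_frequent(transactions, 0.2)
print(high_threshold, low_threshold)
```

With only four items, lowering the threshold from 0.6 to 0.2 raises the number of frequent itemsets from 6 to all 15 non-empty combinations.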
Choose an option to check your answer.
A. It cannot compute support
B. It works only with continuous data
C. It may generate a huge number of candidate itemsets
D. It requires labeled classes
Correct Answer: C. It may generate a huge number of candidate itemsets
Explanation:
Dense datasets and low support thresholds cause candidate explosion.
Memory use and repeated counting can become prohibitive.
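To give a rough sense of scale, the sketch below counts how many itemsets are possible over a given number of items. These are upper bounds from simple combinatorics, not measurements of any particular Apriori implementation; pruning usually keeps the real candidate count lower. The helper names are hypothetical.

```python
from math import comb

def possible_itemsets(num_items, size):
    """Number of distinct itemsets of a given size over num_items items."""
    return comb(num_items, size)

def all_nonempty_itemsets(num_items):
    """Total number of non-empty itemsets, which equals 2**num_items - 1."""
    return sum(comb(num_items, k) for k in range(1, num_items + 1))

# Candidate pairs that can be formed from n frequent single items.
for n in (10, 100, 1000):
    print(n, "frequent items ->", possible_itemsets(n, 2), "possible pairs")

print("all non-empty itemsets over 20 items:", all_nonempty_itemsets(20))
```

The pair count grows quadratically with the number of frequent items, and the total number of itemsets grows exponentially, which is why dense data and low thresholds strain memory and counting time.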
Choose an option to check your answer.
A. The median always preserves correlations
B. The median creates new observations
C. The median guarantees normality
D. The median is less affected by extreme values
Correct Answer: D. The median is less affected by extreme values
Explanation:
For skewed numeric variables, the median often better represents the center.
However, any single-value imputation still understates uncertainty.
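A minimal sketch of median imputation on a skewed variable, assuming missing entries are marked with `None`. The income values are hypothetical and include one large value to create the skew.

```python
from statistics import mean, median

# Hypothetical skewed values; None marks a missing entry.
incomes = [30, 32, 35, 31, None, 400, 33, None]

observed = [v for v in incomes if v is not None]
fill_median = median(observed)  # stays close to the typical values
fill_mean = mean(observed)      # pulled upward by the single large value

# Replace each missing entry with the median of the observed values.
imputed = [fill_median if v is None else v for v in incomes]

print("median fill:", fill_median)
print("mean fill:  ", fill_mean)
print(imputed)
```

Both missing entries receive the same fill value, which is one way to see why single-value imputation understates uncertainty: the imputed column has less spread than the real data would.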
Choose an option to check your answer.
A. The full range
B. The variance
C. The standard deviation
D. The interquartile range
Correct Answer: D. The interquartile range
Explanation:
The IQR is the difference between the third and first quartiles.
It ignores the lowest 25 percent and the highest 25 percent of observations.
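The robustness of the IQR can be sketched by changing only the largest value in a small sample and comparing it with the full range. The data are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method; other quartile conventions can give slightly different numbers.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

def full_range(values):
    """Difference between the maximum and the minimum."""
    return max(values) - min(values)

# Hypothetical data; the second list only changes the largest value.
data = [1, 2, 3, 4, 5, 6, 7, 8]
extreme = [1, 2, 3, 4, 5, 6, 7, 800]

print("range:", full_range(data), "->", full_range(extreme))
print("IQR:  ", iqr(data), "->", iqr(extreme))
```

The range jumps from 7 to 799, while the IQR is unchanged because neither quartile depends on the largest observation.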
Choose an option to check your answer.
A. support(X) divided by support(Y)
B. support(Y) minus support(X)
C. support(X ∩ Y) divided by all items
D. support(X ∪ Y) divided by support(X)
Correct Answer: D. support(X ∪ Y) divided by support(X)
Explanation:
Confidence estimates the conditional probability of Y given X.
It measures how often the consequent appears among transactions containing the antecedent.
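The formula support(X ∪ Y) / support(X) can be sketched directly from transaction counts. The transactions and the `confidence` helper are hypothetical; counts are used instead of fractions because the number of transactions cancels out of the ratio.

```python
# Hypothetical toy transactions; item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from transaction counts."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_x = [t for t in transactions if antecedent <= t]
    if not with_x:
        raise ValueError("antecedent never occurs; confidence is undefined")
    with_both = [t for t in with_x if consequent <= t]
    return len(with_both) / len(with_x)

conf_bread_milk = confidence({"bread"}, {"milk"}, transactions)
print(conf_bread_milk)
```

Four transactions contain bread and three of those also contain milk, so the confidence of bread → milk in this toy data is 3/4.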
Choose an option to check your answer.
A. It always increases false positives
B. It makes every itemset frequent
C. It prevents transaction scanning
D. It can miss rare but important associations
Correct Answer: D. It can miss rare but important associations
Explanation:
Some valuable patterns, such as rare adverse events, naturally have low support.
An excessive threshold filters them out.
Choose an option to check your answer.
A. When the item universe is tiny and support is high
B. When there is one transaction only
C. When all items are absent
D. When transactions are dense and many long patterns are frequent
Correct Answer: D. When transactions are dense and many long patterns are frequent
Explanation:
Dense data produce many possible combinations and long candidates.
Apriori must enumerate and count a large search space.
Choose an option to check your answer.
A. Use the most frequent category or a separate 'Unknown' category
B. Use the arithmetic mean
C. Use the standard deviation
D. Apply logarithmic transformation
Correct Answer: A. Use the most frequent category or a separate 'Unknown' category
Explanation:
Categorical values require a valid category rather than a numeric summary.
The choice should reflect the cause and meaning of missingness.
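Both options from the correct answer can be sketched in a few lines. The color values are hypothetical and missing entries are assumed to be marked with `None`; which strategy is appropriate depends on why the values are missing.

```python
from collections import Counter

# Hypothetical categorical column; None marks a missing entry.
colors = ["red", "blue", None, "red", "green", None, "red"]

observed = [c for c in colors if c is not None]
mode_value = Counter(observed).most_common(1)[0][0]  # most frequent category

# Strategy 1: fill with the most frequent category.
filled_with_mode = [mode_value if c is None else c for c in colors]

# Strategy 2: keep missingness visible as its own category.
filled_with_unknown = ["Unknown" if c is None else c for c in colors]

print(filled_with_mode)
print(filled_with_unknown)
```

Filling with the mode inflates the most common category, while a separate 'Unknown' category preserves the fact that the value was missing.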
Choose an option to check your answer.
A. As the average squared deviation from the mean
B. As the difference between maximum and minimum
C. As the median of all values
D. As the most frequent value
Correct Answer: A. As the average squared deviation from the mean
Explanation:
Squaring prevents deviations of opposite signs from cancelling.
Variance measures overall dispersion around the mean.
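A short sketch of why the squaring matters, using hypothetical values. It follows the "average squared deviation" wording above, which divides by the number of observations (the population form); the sample variance divides by one less than that.

```python
from statistics import pvariance

# Hypothetical values chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

center = sum(data) / len(data)
deviations = [x - center for x in data]

# Raw deviations cancel out: positives and negatives sum to zero.
raw_sum = sum(deviations)

# Squared deviations cannot cancel, so their average measures spread.
variance = sum(d ** 2 for d in deviations) / len(data)

print("sum of raw deviations:", raw_sum)
print("variance:", variance)
print("statistics.pvariance:", pvariance(data))
```

The raw deviations sum to zero for any data set, so they carry no information about spread until they are squared.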
Choose an option to check your answer.
A. Transactions containing X often also contain Y
B. X necessarily occurs in most transactions
C. Y causes X
D. X and Y are negatively correlated
Correct Answer: A. Transactions containing X often also contain Y
Explanation:
Confidence focuses on the conditional frequency of Y among X transactions.
It does not by itself show whether the association exceeds Y's baseline frequency.
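One common way to compare confidence with Y's baseline frequency is lift, defined as confidence(X → Y) divided by support(Y). Lift is not named in the explanation above; it is used here as an assumption to illustrate the point. The transactions and the `rule_metrics` helper are hypothetical.

```python
# Hypothetical toy transactions; item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def rule_metrics(antecedent, consequent, transactions):
    """Return (confidence, baseline support of consequent, lift) for X -> Y."""
    x, y = set(antecedent), set(consequent)
    n = len(transactions)
    n_x = sum(1 for t in transactions if x <= t)
    n_y = sum(1 for t in transactions if y <= t)
    n_xy = sum(1 for t in transactions if (x | y) <= t)
    confidence = n_xy / n_x
    baseline = n_y / n
    # Lift written with counts: (n_xy / n_x) / (n_y / n).
    lift = (n_xy * n) / (n_x * n_y)
    return confidence, baseline, lift

conf, baseline, lift = rule_metrics({"bread"}, {"milk"}, transactions)
print(conf, baseline, lift)
```

Here the confidence of bread → milk is 0.75, yet milk appears in 80 percent of all transactions, so the lift is below 1: buying bread does not make milk more likely than it already was.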