MCQ Collection
Data Mining MCQs
Practice Data Mining questions with answers and explanations.
Choose an option to check your answer.
A. Discard the query permanently
B. Increase every feature value
C. Return all classes as correct
D. Use distance-weighted voting or a deterministic tie rule
Correct Answer: D. Use distance-weighted voting or a deterministic tie rule
Explanation:
Weighted votes prefer classes represented by closer neighbors.
A predefined deterministic rule ensures reproducibility.
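The tie-handling idea above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the inverse-distance weighting, the `eps` guard, and the "smallest label wins" tie rule are all assumptions chosen for the example.

```python
from collections import defaultdict

def weighted_knn_vote(neighbors, eps=1e-12):
    """neighbors: list of (distance, label) pairs for the k nearest points."""
    scores = defaultdict(float)
    for distance, label in neighbors:
        # Closer neighbors contribute larger weights (inverse distance, assumed).
        scores[label] += 1.0 / (distance + eps)
    best = max(scores.values())
    # Deterministic tie rule (an assumption): smallest label among top scorers.
    tied = [label for label, score in scores.items() if score == best]
    return min(tied)

# A plain majority vote is tied 2-2 here; the weights break the tie.
neighbors = [(0.5, "red"), (2.0, "blue"), (1.0, "blue"), (0.8, "red")]
print(weighted_knn_vote(neighbors))  # prints "red"
```

Because the fallback rule does not depend on input order or random state, repeated runs on the same neighbors give the same label.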
Choose an option to check your answer.
A. The model becomes strictly linear
B. Every point has identical similarity
C. The margin becomes infinite
D. The model may create highly irregular regions around training points
Correct Answer: D. The model may create highly irregular regions around training points
Explanation:
Very localized kernels can memorize training details.
Validation is needed to balance flexibility and generalization.
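To see why a very localized kernel behaves this way, the sketch below assumes the kernel in question is a Gaussian RBF kernel with width parameter `gamma`. The two points and the `gamma` values are made up for illustration.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian RBF similarity: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = (0.0, 0.0), (1.0, 1.0)
for gamma in (0.1, 1.0, 100.0):
    # As gamma grows, similarity to any other point collapses toward zero,
    # so each training point influences only its immediate surroundings.
    print(gamma, rbf_kernel(x, z, gamma))
```

With a large `gamma`, every training point is similar only to itself, which is what allows the decision boundary to wrap tightly around individual points.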
Choose an option to check your answer.
A. Compact, roughly spherical clusters of similar scale
B. Arbitrarily shaped connected clusters
C. Clusters defined only by categories
D. Nested graph communities only
Correct Answer: A. Compact, roughly spherical clusters of similar scale
Explanation:
Nearest-centroid partitions create convex Voronoi regions.
Elongated, unequal-density, or nonconvex clusters can be poorly represented.
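A small sketch of the nearest-centroid rule shows how an elongated cluster can be split. The centroids here are fixed by hand for illustration (they are not the result of running k-means), and the data are hypothetical.

```python
import math

def assign_to_nearest(points, centroids):
    """Label each point with the index of its closest centroid."""
    labels = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels

# An elongated cluster stretched along the x-axis.
elongated = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0), (8.0, 0.0), (10.0, 0.0)]
# Assumed centroids: the elongated cluster's mean and a nearby compact cluster.
centroids = [(5.0, 0.0), (12.0, 0.0)]

print(assign_to_nearest(elongated, centroids))  # prints [0, 0, 0, 0, 0, 1]
```

The last point belongs to the elongated group but sits closer to the other centroid, so the convex Voronoi boundary cuts the group in two.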
Choose an option to check your answer.
A. A supervised decision tree
B. A frequent-pattern tree
C. A probabilistic classifier
D. An unsupervised neural method that maps high-dimensional data onto a low-dimensional grid
Correct Answer: D. An unsupervised neural method that maps high-dimensional data onto a low-dimensional grid
Explanation:
SOM units compete to represent input vectors while preserving neighborhood structure.
The grid supports clustering and visualization.
Choose an option to check your answer.
A. Distances cannot be computed
B. Dense regions contain no anomalies
C. All points have equal distance
D. Normal points in sparse regions may look anomalous
Correct Answer: D. Normal points in sparse regions may look anomalous
Explanation:
One global cutoff assumes comparable density everywhere.
Local methods better distinguish sparse normal clusters from true outliers.
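The contrast between a global cutoff and a local comparison can be sketched on 1-D data. The local score below (a point's nearest-neighbor distance divided by its neighbor's own nearest-neighbor distance) is a simplified stand-in for density-based local methods, not any specific published algorithm. The data and both thresholds are assumptions.

```python
def nearest(points, i):
    """Return (distance, index) of the nearest other point to points[i]."""
    return min((abs(points[i] - q), j) for j, q in enumerate(points) if j != i)

# Dense cluster near 0, one stray point at 3.0, sparse but normal cluster near 100.
points = [0.0, 0.1, 0.2, 0.3, 0.4, 3.0, 100.0, 105.0, 110.0, 115.0]

# Global rule: flag anything whose nearest neighbor is farther than 1.0.
global_flags = [p for i, p in enumerate(points) if nearest(points, i)[0] > 1.0]

# Local rule: compare each point's spacing with its neighbor's spacing.
local_flags = []
for i, p in enumerate(points):
    d, j = nearest(points, i)
    d_neighbor, _ = nearest(points, j)
    if d / d_neighbor > 2.0:
        local_flags.append(p)

print(global_flags)  # prints [3.0, 100.0, 105.0, 110.0, 115.0]
print(local_flags)   # prints [3.0]
```

The global cutoff flags the whole sparse cluster, while the local ratio flags only the point that is unusual relative to its own neighborhood.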
Choose an option to check your answer.
A. Maximizing complexity regardless of use
B. Eliminating all human oversight
C. Using accuracy as the only criterion
D. Developing models that are fair, explainable, private, and robust
Correct Answer: D. Developing models that are fair, explainable, private, and robust
Explanation:
Modern systems must perform well while respecting social and operational constraints.
Research increasingly addresses bias, transparency, privacy, security, and distribution shift.
Choose an option to check your answer.
A. Predicting a numeric value from the responses of nearby training points
B. Grouping points into k clusters
C. Generating k association rules
D. Classifying only binary outcomes
Correct Answer: A. Predicting a numeric value from the responses of nearby training points
Explanation:
The prediction is often the mean or weighted mean of neighbor targets.
It is a local nonparametric regression method.
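Both the plain mean and the weighted mean fit in one short function. This is a minimal sketch for scalar inputs; the inverse-distance weights and the toy training pairs are assumptions for illustration.

```python
def knn_regress(train, query, k=3, weighted=False):
    """train: list of (x, y) pairs with scalar x. Returns the predicted y at query."""
    nearest = sorted(train, key=lambda xy: abs(xy[0] - query))[:k]
    if not weighted:
        # Plain k-NN regression: average the neighbors' targets.
        return sum(y for _, y in nearest) / k
    # Weighted variant (assumed inverse-distance weights).
    weights = [1.0 / (abs(x - query) + 1e-12) for x, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

train = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (10.0, 100.0)]
print(knn_regress(train, 2.0, k=3))                 # prints 20.0
print(knn_regress(train, 2.5, k=3, weighted=True))  # closer neighbors count more
```

No model is fitted ahead of time: each prediction is computed from the stored training pairs near the query.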
Choose an option to check your answer.
A. Feature magnitude affects dot products, distances, and regularization
B. SVM accepts only values from zero to one
C. Scaling creates support vectors
D. Scaling guarantees separability
Correct Answer: A. Feature magnitude affects dot products, distances, and regularization
Explanation:
Large-scale features can dominate the geometry of the optimization.
Consistent scaling also makes C and gamma tuning more meaningful.
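The effect of feature magnitude on geometry is easy to check numerically. The sketch below uses z-score standardization as one common scaling choice. The rows (income in dollars, age in years) are hypothetical.

```python
import math
import statistics

def standardize(rows):
    """Z-score each column: subtract the mean, divide by the population std."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]

# Hypothetical rows: (income, age). Row 1 differs mainly in age, row 2 in income.
rows = [(30000.0, 25.0), (30500.0, 60.0), (60000.0, 26.0)]

raw_01 = math.dist(rows[0], rows[1])
raw_02 = math.dist(rows[0], rows[2])

scaled = standardize(rows)
scaled_01 = math.dist(scaled[0], scaled[1])
scaled_02 = math.dist(scaled[0], scaled[2])

print(raw_01, raw_02)        # income dominates: a 35-year age gap barely registers
print(scaled_01, scaled_02)  # after scaling both gaps are of comparable size
```

In practice the scaling statistics would be computed on the training split only and then reused for validation and test data.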
Choose an option to check your answer.
A. Classification precision
B. Association lift
C. The change in within-cluster error as k increases
D. Tree information gain
Correct Answer: C. The change in within-cluster error as k increases
Explanation:
The optimal within-cluster error never increases as clusters are added, so a low value alone cannot select k.
The elbow marks where additional clusters yield diminishing improvement.
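An elbow curve can be computed exactly on a tiny 1-D example. Instead of running k-means, this sketch brute-forces the best split of sorted values into k contiguous groups, which is valid because optimal 1-D clusters are contiguous. The data are made up, with three visible groups.

```python
from itertools import combinations

def sse(cluster):
    """Sum of squared deviations from the cluster mean."""
    mean = sum(cluster) / len(cluster)
    return sum((v - mean) ** 2 for v in cluster)

def best_wcss(sorted_values, k):
    """Minimum within-cluster SSE over all splits into k contiguous groups."""
    n = len(sorted_values)
    best = float("inf")
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        total = sum(sse(sorted_values[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best

values = sorted([1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8])
curve = [best_wcss(values, k) for k in range(1, 6)]
for k, wcss in enumerate(curve, start=1):
    print(k, round(wcss, 2))
```

The error drops sharply up to k = 3 and only marginally afterwards, which is the bend the elbow heuristic looks for.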
Choose an option to check your answer.
A. The map unit whose weight vector is closest to the input
B. The unit with the largest class prior
C. The most frequent transaction
D. The farthest grid node
Correct Answer: A. The map unit whose weight vector is closest to the input
Explanation:
For each input, distances to prototype vectors are compared.
The nearest unit wins and guides the update.
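One training step of this winner-and-update procedure might look like the sketch below. The 2x2 grid, the learning rate, and the simple "half strength for grid neighbors" rule are assumptions. Real SOM implementations typically use a decaying learning rate and a smooth neighborhood function.

```python
import math

def best_matching_unit(weights, x):
    """weights: dict mapping grid position -> weight vector. Returns the winner."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))

def som_update(weights, x, lr=0.5, radius=1):
    """Move the winner and its grid neighbors toward input x. Returns the winner."""
    bmu = best_matching_unit(weights, x)
    for pos, w in list(weights.items()):
        # Grid distance (not input-space distance) decides who is a neighbor.
        grid_dist = abs(pos[0] - bmu[0]) + abs(pos[1] - bmu[1])
        if grid_dist <= radius:
            influence = lr if grid_dist == 0 else lr / 2  # assumed neighborhood rule
            weights[pos] = tuple(wi + influence * (xi - wi) for wi, xi in zip(w, x))
    return bmu

weights = {(0, 0): (0.0, 0.0), (0, 1): (1.0, 0.0), (1, 0): (0.0, 1.0), (1, 1): (1.0, 1.0)}
bmu = som_update(weights, (0.9, 0.8))
print(bmu)  # prints (1, 1)
```

Updating grid neighbors along with the winner is what lets nearby units end up representing similar inputs.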
Choose an option to check your answer.
A. Anomalies are isolated with fewer random partitioning steps
B. Anomalies have the largest class priors
C. Anomalies form the deepest clusters
D. Anomalies maximize association support
Correct Answer: A. Anomalies are isolated with fewer random partitioning steps
Explanation:
Rare, extreme points tend to separate early in random trees.
Short average path length becomes an anomaly score.
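The path-length intuition can be demonstrated with a stripped-down 1-D version. This sketch omits subsampling, random feature selection, and score normalization, so it illustrates the idea rather than a full isolation forest. The data, tree count, and seed are assumptions.

```python
import random

def isolation_depth(values, target, rng, depth=0):
    """Count random splits until `target` is alone in its partition (1-D)."""
    lo, hi = min(values), max(values)
    if len(values) <= 1 or lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the values that fall on the same side of the split as the target.
    same_side = [v for v in values if (v < split) == (target < split)]
    return isolation_depth(same_side, target, rng, depth + 1)

def average_depth(values, target, trees=200, seed=0):
    """Average isolation depth over many independent random partitionings."""
    rng = random.Random(seed)
    return sum(isolation_depth(values, target, rng) for _ in range(trees)) / trees

values = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 9.0]
print(average_depth(values, 9.0))  # extreme point: usually cut off by the first split
print(average_depth(values, 1.3))  # interior point: needs several splits
```

A point with neighbors on both sides needs at least two cuts to isolate, while the extreme value is usually separated by the first one, so its average depth is lower.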
Choose an option to check your answer.
A. They reduce the number of observations
B. They distort distance without adding predictive information
C. They make labels continuous
D. They increase support counts
Correct Answer: B. They distort distance without adding predictive information
Explanation:
Distance treats included dimensions as part of similarity.
Feature selection or metric learning can improve neighborhoods.
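A two-point example shows how one uninformative dimension can change the nearest neighbor. The data are hypothetical: the first coordinate is assumed informative and the second is assumed to be irrelevant noise.

```python
import math

# Hypothetical training points: (informative feature, irrelevant feature), label.
train = [((1.0, 9.0), "A"), ((5.0, 1.0), "B")]
query = (1.5, 1.0)

def nearest_label(train, query, dims):
    """1-NN label using only the coordinates listed in dims."""
    def dist(point):
        return math.sqrt(sum((point[d] - query[d]) ** 2 for d in dims))
    return min(train, key=lambda item: dist(item[0]))[1]

print(nearest_label(train, query, dims=(0,)))    # prints "A"
print(nearest_label(train, query, dims=(0, 1)))  # prints "B"
```

Dropping the noisy coordinate, which is a simple form of feature selection, restores the neighbor that matches on the informative feature.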