MCQ Collection
Data Visualization MCQs
Practice Data Visualization questions with answers and explanations.
Choose an option to check your answer.
A. Mean versus median
B. Rows versus columns
C. Bias versus variance
D. Color versus shape
Correct Answer: C. Bias versus variance
Explanation:
Small bandwidth lowers smoothing bias but increases random variability.
Large bandwidth reduces variability but may obscure real structure.
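The trade-off described above can be sketched with a small hand-written Gaussian kernel density estimate. The sample values, the bandwidths, and the helper names `gaussian_kde` and `count_modes` are illustrative assumptions, not part of the original question.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function x -> density estimate built from Gaussian kernels."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
    return density

def count_modes(density, lo, hi, steps=400):
    """Count local maxima of the density on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [density(x) for x in xs]
    return sum(1 for i in range(1, steps) if ys[i] > ys[i - 1] and ys[i] > ys[i + 1])

# Hypothetical sample with two visible groups (around 1.5 and around 6.4).
sample = [1.0, 1.2, 1.4, 1.9, 2.1, 5.8, 6.0, 6.3, 6.9, 7.1]

modes_small = count_modes(gaussian_kde(sample, 0.1), -2.0, 10.0)  # wiggly curve
modes_large = count_modes(gaussian_kde(sample, 5.0), -2.0, 10.0)  # oversmoothed curve
print(modes_small, modes_large)
```

With the small bandwidth the curve follows individual observations and shows several bumps (high variability); with the large bandwidth the two groups merge into a single hump (real structure obscured).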
A. When both variables are nominal with no order
B. When there is only one observation
C. When the relationship is monotonic but nonlinear or strongly affected by outliers
D. When exact causal effects are required
Correct Answer: C. When the relationship is monotonic but nonlinear or strongly affected by outliers
Explanation:
Rank-based association does not require a straight-line relationship.
It is also more robust to unusual numeric magnitudes.
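As a rough illustration of the explanation above, the sketch below computes a rank-based (Spearman-style) correlation as the Pearson correlation of ranks. The data and the helper names `pearson`, `ranks`, and `spearman` are assumptions chosen for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(xs, ys):
    """Rank correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [math.exp(v) for v in x]  # monotonic but strongly nonlinear

print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

The rank correlation is essentially 1 because the ordering is perfectly preserved, while the linear correlation is noticeably lower because the largest values dominate.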
A. The probability that the null hypothesis is true
B. The size of the observed effect
C. The probability, assuming the null hypothesis, of obtaining a result at least as extreme as observed
D. The probability that the data were collected correctly
Correct Answer: C. The probability, assuming the null hypothesis, of obtaining a result at least as extreme as observed
Explanation:
A p-value measures compatibility between the data and the null model.
It does not directly give the probability that a hypothesis is true.
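A minimal sketch of the definition above, assuming a hypothetical experiment of 20 coin flips with 16 heads and a null hypothesis of a fair coin. The function name `binomial_upper_tail` is illustrative.

```python
from math import comb

def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): a result at least as extreme as k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# One-sided p-value: computed under the null model, not a probability about the null.
p_value = binomial_upper_tail(20, 16)
print(round(p_value, 4))  # 0.0059
```

The number says how surprising 16 or more heads would be if the coin were fair. It says nothing directly about how likely the coin is to be fair.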
A. The most distant point in the dataset
B. A categorical response label
C. The mean vector of observations assigned to a cluster
D. The first observation in each cluster
Correct Answer: C. The mean vector of observations assigned to a cluster
Explanation:
Each centroid summarizes a cluster's center in feature space.
Assignments are based on distance from these mean vectors.
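The two sentences above correspond to the two steps of a k-means iteration. The sketch below shows one such step on made-up 2-D points; the names `centroid` and `nearest` are hypothetical helpers.

```python
def centroid(points):
    """Mean vector of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(point, centroids[k])),
    )

# Hypothetical current assignment of points to two clusters.
clusters = {
    0: [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    1: [(8.0, 9.0), (10.0, 7.0)],
}
centroids = [centroid(clusters[0]), centroid(clusters[1])]
print(centroids)                       # [(2.0, 2.0), (9.0, 8.0)]
print(nearest((4.0, 4.0), centroids))  # 0
```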
A. A binary response variable
B. A survival censoring indicator only
C. A dissimilarity or distance matrix
D. A confidence level
Correct Answer: C. A dissimilarity or distance matrix
Explanation:
Hierarchical clustering relies on pairwise proximity information.
The choice of distance should match variable types and analytical goals.
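As a small sketch of the pairwise proximity input mentioned above, the code below builds a distance matrix for three made-up points under two common metrics. The helper names are illustrative, and which metric is appropriate depends on the data.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def distance_matrix(points, metric):
    """Square matrix of pairwise distances under the given metric."""
    return [[metric(p, q) for q in points] for p in points]

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]  # hypothetical observations
d_euc = distance_matrix(points, euclidean)
d_man = distance_matrix(points, manhattan)
print(d_euc[0])  # [0.0, 5.0, 10.0]
print(d_man[0])  # [0.0, 7.0, 14.0]
```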
A. The symmetric uniform distribution
B. The Bernoulli distribution only
C. The standard normal distribution
D. The Pareto distribution
Correct Answer: D. The Pareto distribution
Explanation:
Pareto models describe positive values with a slowly decaying right tail.
They are associated with phenomena such as wealth concentration and file sizes.
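To make "slowly decaying right tail" concrete, the sketch below compares the tail probability P(X > x) of a Pareto variable with that of an exponential variable with the same mean. The parameter values (minimum 1, shape 1.5, hence mean 3) are assumptions chosen for illustration.

```python
import math

def pareto_tail(x, x_min=1.0, alpha=1.5):
    """P(X > x) for a Pareto(x_min, alpha) variable; equals 1 for x <= x_min."""
    return 1.0 if x <= x_min else (x_min / x) ** alpha

def exponential_tail(x, mean=3.0):
    """P(X > x) for an exponential variable with the given mean."""
    return math.exp(-x / mean)

for x in (10, 100):
    print(x, pareto_tail(x), exponential_tail(x))
```

At x = 100 the Pareto tail is still about one in a thousand, while the exponential tail is many orders of magnitude smaller.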
A. A method that assumes every dataset is normal
B. A discrete category-count table
C. A clustering algorithm that requires labels
D. A nonparametric method for estimating a smooth probability density
Correct Answer: D. A nonparametric method for estimating a smooth probability density
Explanation:
KDE places a smooth kernel around observations and adds their contributions.
It estimates distribution shape without choosing a specific parametric family.
A. A normal density with estimated mean and standard deviation
B. An exponential model with an estimated rate
C. A fixed Pareto model
D. A kernel density estimate
Correct Answer: D. A kernel density estimate
Explanation:
KDE does not restrict the population to a finite-parameter distribution family.
Its shape is driven flexibly by the observations.
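The contrast with a parametric fit can be sketched on a made-up two-group sample: a fitted normal density is forced to peak at the overall mean, whereas a Gaussian KDE (bandwidth 0.5, an arbitrary choice) stays low in the gap between the groups. All names and values here are illustrative assumptions.

```python
import math
from statistics import mean, stdev

sample = [1.0, 1.2, 1.4, 1.9, 2.1, 5.8, 6.0, 6.3, 6.9, 7.1]
mu, sigma = mean(sample), stdev(sample)

def normal_pdf(x):
    """Parametric estimate: normal density with the sample mean and sd."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kde(x, bandwidth=0.5):
    """Nonparametric estimate: average of Gaussian kernels centred on the data."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample)

# At the overall mean (inside the empty gap) the two estimates disagree sharply.
print(normal_pdf(mu), kde(mu))
# Near the first group the KDE is higher than the normal fit.
print(normal_pdf(1.4), kde(1.4))
```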
A. The variables follow a perfect circle
B. Both variables have identical means
C. The relationship changes direction repeatedly
D. As one variable increases, the other generally does not decrease
Correct Answer: D. As one variable increases, the other generally does not decrease
Explanation:
Monotonicity concerns consistent ordering rather than a constant slope.
The relationship can curve while still generally increasing.
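A minimal check of the idea above, using made-up data: a square-root curve has a changing slope yet never decreases, while a zigzag series fails the check. The helper name `is_nondecreasing` is hypothetical, and it tests the strict version of the property rather than the "generally" in the answer.

```python
import math

def is_nondecreasing(xs, ys):
    """True if y never decreases when the pairs are ordered by x."""
    pairs = sorted(zip(xs, ys))
    return all(b[1] >= a[1] for a, b in zip(pairs, pairs[1:]))

x = [1, 2, 3, 4, 5, 6]
curved = [math.sqrt(v) for v in x]   # increasing, but with a shrinking slope
zigzag = [1, 3, 2, 4, 3, 5]          # changes direction repeatedly

slopes = [curved[i + 1] - curved[i] for i in range(len(x) - 1)]
print(is_nondecreasing(x, curved), is_nondecreasing(x, zigzag))  # True False
```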
A. Prove the null hypothesis
B. Increase the p-value to α
C. Conclude the effect is practically large
D. Reject the null hypothesis
Correct Answer: D. Reject the null hypothesis
Explanation:
The result is considered statistically significant at the selected level.
This decision still depends on assumptions and does not measure practical importance.
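The sketch below illustrates the decision rule and its limits, assuming the question concerns a p-value at or below the chosen level α. It uses a hypothetical one-sample z-test with known standard deviation 1; the numbers and function names are invented for the example.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def decide(p_value, alpha=0.05):
    """Standard decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

mean_difference = 0.01            # hypothetical effect: 1% of a standard deviation
n = 1_000_000                     # very large hypothetical sample
z = mean_difference * math.sqrt(n)
p_value = two_sided_p_from_z(z)
decision = decide(p_value)
print(decision, p_value, mean_difference)
```

The test rejects the null hypothesis decisively, yet the effect itself is tiny, which is why significance and practical importance have to be judged separately.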
A. Standardization creates the true cluster labels
B. K-means accepts only negative values
C. It guarantees spherical clusters
D. Variables with larger scales would otherwise dominate distance calculations
Correct Answer: D. Variables with larger scales would otherwise dominate distance calculations
Explanation:
K-means commonly relies on Euclidean distance.
Scale differences can give some variables excessive influence unrelated to importance.
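A small sketch of the scale problem, using invented income (in currency units) and age (in years) values. It measures how much of the squared Euclidean distance between the first two people comes from age, before and after z-score standardization. The helper names are hypothetical.

```python
from statistics import mean, pstdev

incomes = [30000, 30500, 32000, 45000, 52000]  # hypothetical, large scale
ages = [25, 60, 26, 41, 33]                    # hypothetical, small scale

def zscores(values):
    """Standardize to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def age_share(income_col, age_col, i, j):
    """Fraction of the squared Euclidean distance between rows i and j due to age."""
    d_income = (income_col[i] - income_col[j]) ** 2
    d_age = (age_col[i] - age_col[j]) ** 2
    return d_age / (d_income + d_age)

raw_share = age_share(incomes, ages, 0, 1)
scaled_share = age_share(zscores(incomes), zscores(ages), 0, 1)
print(round(raw_share, 4), round(scaled_share, 4))
```

On the raw scale a 35-year age gap contributes under 1% of the distance because income differences are numerically larger; after standardization the age gap accounts for almost all of it.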
A. It is unaffected by variable scale
B. It always finds objectively true clusters
C. It can use no distance measure
D. It provides a nested structure without requiring one fixed k at the start
Correct Answer: D. It provides a nested structure without requiring one fixed k at the start
Explanation:
The dendrogram allows several levels of grouping to be explored after fitting.
However, distance, linkage, and scaling still strongly affect results.
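The nested structure can be sketched with a tiny agglomerative procedure on made-up 1-D points. Single linkage is used here only as one possible choice, and the function name `single_linkage_levels` is hypothetical. Every merge produces a coarser partition, so any number of clusters can be read off after fitting.

```python
def single_linkage_levels(points):
    """Return all partitions, from n singletons down to one cluster."""
    clusters = [[i] for i in range(len(points))]
    levels = [sorted(sorted(c) for c in clusters)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        levels.append(sorted(sorted(c) for c in clusters))
    return levels

points = [1.0, 1.5, 5.0, 5.4, 9.0]  # hypothetical observations
levels = single_linkage_levels(points)
by_k = {len(partition): partition for partition in levels}
print(by_k[3])  # [[0, 1], [2, 3], [4]]
print(by_k[2])  # [[0, 1, 2, 3], [4]]
```

Changing the linkage rule or the distance in the inner loop can change which clusters merge first, which is the sensitivity the explanation warns about.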