MCQ Collection
Big Data Analytics MCQs
Practice Big Data Analytics questions with answers and explanations.
Correct Answer: C. A record of changes made to the file-system namespace
Explanation:
Namespace operations are written to the edit log for durability.
During checkpointing, the logged edits are merged into the namespace image (fsimage).
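The log-then-merge idea can be sketched with a toy model. This is only an illustration: the names `apply_edit` and `checkpoint` and the tuple-based log format are hypothetical and do not reflect HDFS's actual on-disk format or API.

```python
# Toy model of a namespace image plus an edit log.
# Hypothetical structures for illustration; not HDFS's real format.

def apply_edit(namespace, edit):
    """Apply one logged namespace operation to a set of paths."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(image, edit_log):
    """Replay the edit log onto a copy of the image.

    Returns the merged image and a fresh, empty edit log.
    """
    new_image = set(image)
    for edit in edit_log:
        apply_edit(new_image, edit)
    return new_image, []


image = {"/data/a.txt"}
edit_log = [("create", "/data/b.txt"), ("delete", "/data/a.txt")]
new_image, edit_log = checkpoint(image, edit_log)
print(sorted(new_image))  # ['/data/b.txt']
```

After the checkpoint, the image alone describes the namespace and the log starts empty again.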
Correct Answer: C. Processes all intermediate values associated with a key
Explanation:
The framework groups mapper output by key before reduction.
The reducer aggregates or transforms each key's values.
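A minimal in-memory sketch of grouping by key and then reducing is shown here. The helper names `shuffle` and `sum_reducer` are assumptions chosen for illustration; a real framework performs the grouping across machines.

```python
from collections import defaultdict


def shuffle(mapper_output):
    """Group intermediate (key, value) pairs by key."""
    groups = defaultdict(list)
    for key, value in mapper_output:
        groups[key].append(value)
    return groups


def sum_reducer(key, values):
    """Aggregate all values that share one key."""
    return key, sum(values)


mapper_output = [("hadoop", 1), ("spark", 1), ("hadoop", 1)]
grouped = shuffle(mapper_output)
results = dict(sum_reducer(k, v) for k, v in sorted(grouped.items()))
print(results)  # {'hadoop': 2, 'spark': 1}
```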
Correct Answer: C. A MapReduce job configured with zero reducers
Explanation:
Map-only jobs are suitable when records can be processed independently.
Mapper output is written directly as final output.
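As a rough sketch, a map-only job can be modeled as applying a mapper to each input split with no grouping or reduce step afterwards. The function names and the uppercase transformation are hypothetical examples.

```python
def to_upper_mapper(record):
    """Transform one record without needing any other record."""
    return record.upper()


def run_map_only(splits, mapper):
    """Apply the mapper to every split independently.

    One output list per split mirrors one output file per map task.
    There is no shuffle and no reducer.
    """
    return [[mapper(record) for record in split] for split in splits]


splits = [["alpha", "beta"], ["gamma"]]
output = run_map_only(splits, to_upper_mapper)
print(output)  # [['ALPHA', 'BETA'], ['GAMMA']]
```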
Correct Answer: C. When one dataset is small enough to distribute to every mapper
Explanation:
Each mapper loads the small table and joins it with its input partition.
This avoids reducer-side network transfer for the large dataset.
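The following sketch shows the shape of a map-side (broadcast) join, assuming the small table fits in memory as a dictionary. The field layout and the name `map_side_join` are illustrative assumptions.

```python
def map_side_join(partition, small_table):
    """Inner-join one partition of the large dataset against the small table.

    partition: iterable of (user_id, amount) records from the large dataset.
    small_table: dict mapping user_id to country, held in memory by each mapper.
    """
    joined = []
    for user_id, amount in partition:
        country = small_table.get(user_id)
        if country is not None:  # drop records with no match (inner join)
            joined.append((user_id, amount, country))
    return joined


countries = {1: "IN", 2: "US"}
partition = [(1, 9.5), (3, 4.0), (2, 1.25)]
joined_rows = map_side_join(partition, countries)
print(joined_rows)  # [(1, 9.5, 'IN'), (2, 1.25, 'US')]
```

Because every mapper holds its own copy of the lookup table, no record from the large dataset has to be shuffled to a reducer.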
Correct Answer: C. Finding the N records with the largest or smallest scores
Explanation:
Local top-N lists can be produced by mappers and merged globally.
This reduces the amount of data sent to the final stage.
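One way to sketch the local-then-global merge uses `heapq.nlargest` from the Python standard library. The partitions and function names below are made up for illustration.

```python
import heapq


def local_top_n(partition, n):
    """Each mapper keeps only its own n largest scores."""
    return heapq.nlargest(n, partition)


def global_top_n(partitions, n):
    """Merge the local lists and select the overall n largest."""
    candidates = []
    for partition in partitions:
        candidates.extend(local_top_n(partition, n))
    return heapq.nlargest(n, candidates)


partitions = [[5, 42, 7], [99, 3], [18, 64, 1]]
top = global_top_n(partitions, 2)
print(top)  # [99, 64]
```

Each partition forwards at most n candidates, so the final stage sees at most n times the number of partitions instead of every record.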
Correct Answer: D. Stream processing
Explanation:
Stream processing evaluates events as they arrive or in short windows.
It supports monitoring, alerts, and near-real-time analytics.
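Short windows can be illustrated with a tumbling-window count. This sketch assumes events arrive as `(timestamp_seconds, payload)` pairs; the function name and the sample events are hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per fixed, non-overlapping time window.

    Events are consumed one at a time, as a stream would deliver them.
    """
    counts = Counter()
    for timestamp, _payload in events:
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)


events = [(0, "a"), (3, "b"), (10, "c"), (12, "d"), (19, "e"), (25, "f")]
counts = tumbling_window_counts(events, 10)
print(counts)  # {0: 2, 10: 3, 20: 1}
```

A monitoring rule could then compare each window's count against a threshold to raise an alert.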
Correct Answer: A. The data volume, velocity, or complexity can exceed one machine's storage or processing capacity
Explanation:
Big Data systems distribute storage and computation across multiple machines.
This allows workloads to scale beyond the limits of one server.
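Hash partitioning is one common way to spread records across machines. The sketch below assumes string keys and uses `zlib.crc32` as a stable hash; the names `assign_node` and `partition` are illustrative, not taken from any particular system.

```python
import zlib


def assign_node(key, num_nodes):
    """Map a key to a node index using a hash that is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition(keys, num_nodes):
    """Place every key in the bucket of the node responsible for it."""
    nodes = [[] for _ in range(num_nodes)]
    for key in keys:
        nodes[assign_node(key, num_nodes)].append(key)
    return nodes


keys = [f"user-{i}" for i in range(10)]
nodes = partition(keys, 3)
for index, bucket in enumerate(nodes):
    print(index, bucket)
```

The same key always lands on the same node, so later lookups know where to go.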
Correct Answer: A. The time between submitting or receiving data and obtaining a result
Explanation:
Latency measures how quickly a system responds to a single request or event.
Interactive and streaming applications usually require lower latency than batch jobs.
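As a small worked example, latency can be computed per request from paired timestamps. The timestamps below are invented for illustration, and `latencies` is a hypothetical helper.

```python
def latencies(submit_times, result_times):
    """Per-request latency: result time minus submission time, in seconds."""
    return [done - start for start, done in zip(submit_times, result_times)]


submit_times = [0.0, 1.0, 2.0]
result_times = [0.5, 1.25, 4.0]
observed = latencies(submit_times, result_times)
print(observed)       # [0.5, 0.25, 2.0]
print(max(observed))  # 2.0
```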
Correct Answer: B. Volume
Explanation:
Volume describes the scale of a dataset.
Large volume motivates distributed storage and parallel processing.
Correct Answer: B. The amount of data or work processed per unit time
Explanation:
Throughput measures processing capacity over time.
Batch platforms often optimize for high throughput.
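Throughput is a simple ratio, sketched here with made-up numbers. The function name `throughput` and the job figures are assumptions for illustration.

```python
def throughput(records_processed, elapsed_seconds):
    """Records processed per second over the whole run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_processed / elapsed_seconds


# A hypothetical batch job that processes one million records in 50 seconds.
rate = throughput(1_000_000, 50)
print(rate)  # 20000.0
```

A job like this has high throughput even though no single record's result is available until the run finishes.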
Correct Answer: C. Velocity
Explanation:
Velocity concerns data generation, ingestion, and response speed.
Streaming systems are often used when low-latency processing is required.
Correct Answer: D. Variety
Explanation:
Variety captures differences in format, schema, and media type.
Examples include tables, logs, JSON, images, and text.
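To make variety concrete, the sketch below normalizes a JSON record and a CSV record into one common shape. The field names `user` and `action` and both parser functions are hypothetical.

```python
import csv
import io
import json


def from_json(line):
    """Parse a JSON event into the common record shape."""
    obj = json.loads(line)
    return {"user": obj["user"], "action": obj["action"]}


def from_csv(line):
    """Parse a two-column CSV row (user, action) into the same shape."""
    user, action = next(csv.reader(io.StringIO(line)))
    return {"user": user, "action": action}


records = [
    from_json('{"user": "ana", "action": "login", "device": "mobile"}'),
    from_csv("raj,logout"),
]
print(records)
# [{'user': 'ana', 'action': 'login'}, {'user': 'raj', 'action': 'logout'}]
```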