MCQ Collection
Big Data Analytics MCQs
Practice Big Data Analytics questions with answers and explanations.
Correct Answer: C. Large sequential reads and writes
Explanation:
HDFS targets high-throughput streaming access to large files.
It is not designed as a general-purpose low-latency file system.
Correct Answer: C. Its design favors large immutable or append-oriented files rather than in-place mutation
Explanation:
HDFS prioritizes throughput and fault-tolerant block storage.
Systems such as HBase are better suited to random record access.
Correct Answer: C. To define how final key-value pairs are written
Explanation:
OutputFormat controls destination structure and serialization.
It also creates the RecordWriter used by tasks.
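The relationship between an output format and its record writer can be sketched in a few lines. This is a plain-Python illustration, not the Hadoop Java API: the class names `TabSeparatedOutputFormat` and `TabSeparatedRecordWriter`, the `get_record_writer` method, and the tab-separated layout are all assumptions made for the example.

```python
import io


class TabSeparatedRecordWriter:
    """Hypothetical writer: serializes each pair as 'key<TAB>value' on one line."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, key, value):
        self.stream.write(f"{key}\t{value}\n")

    def close(self):
        self.stream.flush()


class TabSeparatedOutputFormat:
    """Hypothetical output format: fixes the serialization and hands out writers."""

    def get_record_writer(self, stream):
        return TabSeparatedRecordWriter(stream)


def run_task(pairs, output_format, stream):
    # The task never formats records itself; it asks the output format for a writer.
    writer = output_format.get_record_writer(stream)
    for key, value in pairs:
        writer.write(key, value)
    writer.close()


buffer = io.StringIO()
run_task([("apple", 3), ("pear", 1)], TabSeparatedOutputFormat(), buffer)
output_text = buffer.getvalue()
```

Swapping in a different output format changes how the final pairs are laid out without touching the task logic, which is the separation the explanation describes.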
Correct Answer: C. A distributed metric aggregated across task attempts
Explanation:
Counters track events such as malformed records or processed rows.
They are useful for monitoring and data-quality checks.
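A minimal sketch of how such counters might behave, simulated in plain Python rather than with the Hadoop counter API. The counter names `RECORDS_READ` and `MALFORMED_RECORDS`, the three-field record layout, and the assumption of one successful attempt per input split are all illustrative.

```python
from collections import Counter


def run_map_task(lines):
    """Process one input split and return that task's local counters."""
    counters = Counter()
    for line in lines:
        counters["RECORDS_READ"] += 1
        # Assumed record layout: exactly three comma-separated fields.
        if len(line.split(",")) != 3:
            counters["MALFORMED_RECORDS"] += 1
    return counters


def aggregate(task_counters):
    """Sum the per-task counters into job-level totals, as the framework would."""
    total = Counter()
    for counters in task_counters:
        total.update(counters)
    return total


splits = [["a,b,c", "bad"], ["d,e,f", "g,h", "i,j,k"]]
job_counters = aggregate(run_map_task(split) for split in splits)
```

A data-quality check could then compare `job_counters["MALFORMED_RECORDS"]` with `job_counters["RECORDS_READ"]` and fail the pipeline when the ratio exceeds a chosen threshold.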
Correct Answer: C. Emitting only records that satisfy a condition
Explanation:
Filters are often implemented as map-only jobs.
Records that fail the predicate produce no output.
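The filtering pattern can be illustrated with a generator standing in for the mapper. This is a sketch only: `filter_mapper`, the record fields, and the status-code predicate are hypothetical, and no reduce phase is modeled because a filter needs none.

```python
def filter_mapper(records, predicate):
    """Map-only filter: emit a record unchanged if it passes, otherwise emit nothing."""
    for record in records:
        if predicate(record):
            yield record


records = [
    {"user": "a", "status": 200},
    {"user": "b", "status": 500},
    {"user": "c", "status": 200},
]

# Keep only server errors (assumed predicate for this example).
kept = list(filter_mapper(records, lambda r: r["status"] >= 500))
```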
Correct Answer: D. Availability
Explanation:
Availability requires the system to respond even when some nodes are unavailable.
The response may not contain the most recent data in some designs.
Correct Answer: D. Files are generally appended or read rather than modified at arbitrary positions
Explanation:
Avoiding random in-place updates simplifies replication and consistency.
HDFS supports appends in many deployments but not arbitrary block edits.
Correct Answer: D. The scheduler can exploit data locality when assigning tasks
Explanation:
Tasks can run on or near nodes holding the required blocks.
This reduces expensive network transfer.
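A simplified sketch of locality-aware task assignment follows. It is not how any particular Hadoop scheduler is implemented: the `assign_task` function, the block and node names, and the two-level choice (node-local, else any free node) are assumptions, and rack-level locality is deliberately left out.

```python
def assign_task(block_id, block_locations, free_nodes):
    """Prefer a free node that holds a replica of the block; otherwise use any free node.

    Returns (node, is_local). Assumes free_nodes is non-empty.
    """
    local = [node for node in block_locations.get(block_id, []) if node in free_nodes]
    if local:
        return local[0], True
    # No replica holder is free, so the block must be read over the network.
    return sorted(free_nodes)[0], False


block_locations = {
    "blk_1": ["node1", "node2", "node3"],
    "blk_2": ["node4", "node5", "node6"],
}
free_nodes = {"node2", "node7"}

local_choice = assign_task("blk_1", block_locations, free_nodes)
remote_choice = assign_task("blk_2", block_locations, free_nodes)
```

Here the task for `blk_1` lands on a node that already stores the block, while the task for `blk_2` has to run remotely and pull its input across the network.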
Correct Answer: D. Only successful task output becomes visible as final job output
Explanation:
Task attempts may fail, and the framework may launch speculative duplicates of slow attempts.
Commit protocols prevent partial or duplicate files from appearing as final results.
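The idea behind such a commit protocol can be sketched with a local directory: each attempt writes into its own temporary location, and only a successful attempt is promoted by a rename. The directory layout (`_temporary/<task>_<attempt>`), the `part-` file naming, and both function names are assumptions for illustration, not the exact behavior of Hadoop's output committers.

```python
import os
import tempfile


def run_attempt(job_dir, task_id, attempt_id, lines, fail=False):
    """Write task output into a private attempt directory under _temporary."""
    attempt_dir = os.path.join(job_dir, "_temporary", f"{task_id}_{attempt_id}")
    os.makedirs(attempt_dir)
    path = os.path.join(attempt_dir, f"part-{task_id}")
    with open(path, "w") as handle:
        for i, line in enumerate(lines):
            if fail and i == 1:
                # Simulated crash after a partial write.
                raise RuntimeError("simulated task failure")
            handle.write(line + "\n")
    return path


def commit_attempt(job_dir, attempt_path):
    """Promote a finished attempt's file into the final output directory."""
    final_path = os.path.join(job_dir, os.path.basename(attempt_path))
    os.rename(attempt_path, final_path)
    return final_path


job_dir = tempfile.mkdtemp()
try:
    run_attempt(job_dir, "00000", 0, ["a", "b"], fail=True)
except RuntimeError:
    pass  # The partial file stays hidden under _temporary.

good_path = run_attempt(job_dir, "00000", 1, ["a", "b"])
final_path = commit_attempt(job_dir, good_path)

# Names starting with "_" are treated as hidden in this sketch.
visible = sorted(name for name in os.listdir(job_dir) if not name.startswith("_"))
```

The failed attempt leaves a partial file behind, but it never appears among the visible outputs because it was never committed.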
Correct Answer: D. Too many distinct counters create coordination and memory overhead
Explanation:
Counters are aggregated through the framework and are not intended as arbitrary per-key storage.
A small, meaningful set is most effective.
Correct Answer: D. A mapping from terms to the documents or records containing them
Explanation:
Mappers emit terms with document identifiers.
Reducers aggregate the document lists for each term.
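The map and reduce steps for an inverted index can be sketched in plain Python. The shuffle is simulated with a dictionary, and the function names, whitespace tokenization, and sample documents are assumptions made for the example.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Mapper: emit (term, doc_id) once for each distinct term in the document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def reduce_postings(term, doc_ids):
    """Reducer: collapse the document ids for one term into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(documents):
    # Simulated shuffle: group the mapper output by term.
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for term, emitted_id in map_document(doc_id, text):
            grouped[term].append(emitted_id)
    return dict(reduce_postings(term, ids) for term, ids in grouped.items())


documents = {"d1": "big data tools", "d2": "big clusters", "d3": "data lakes"}
index = build_inverted_index(documents)
```

Looking up `index["data"]` then returns the identifiers of every document containing that term.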
Correct Answer: A. The system continues operating despite lost or delayed network messages between nodes
Explanation:
Distributed networks can experience communication failures.
Partition-tolerant systems continue functioning while groups of nodes cannot communicate.