MCQ Collection
Big Data Analytics MCQs
Practice Big Data Analytics questions with answers and explanations.
Correct Answer: B. Keeps elements satisfying a Boolean predicate
Explanation:
The predicate is evaluated for each element.
Only elements for which it returns true remain.
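As a small illustration, the following Scala sketch applies filter to a list. The sample data and the object name FilterSketch are assumptions made for this example only.

```scala
object FilterSketch {
  // Hypothetical sample data, chosen only for illustration.
  val numbers: List[Int] = List(1, 2, 3, 4, 5, 6)

  // The predicate n % 2 == 0 is evaluated once per element;
  // only elements for which it returns true are kept.
  val evens: List[Int] = numbers.filter(n => n % 2 == 0)

  def main(args: Array[String]): Unit =
    println(evens) // List(2, 4, 6)
}
```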
Correct Answer: B. Embedding expressions directly inside a string
Explanation:
In an s-interpolated Scala string, identifiers written as $name and expressions written as ${expr} are evaluated and their values inserted.
This improves readability over repeated concatenation.
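A minimal sketch of the difference, assuming Scala's s interpolator. The values user and count are hypothetical and exist only for this example.

```scala
object InterpolationSketch {
  // Hypothetical values used only for illustration.
  val user: String = "Ada"
  val count: Int = 3

  // $user inserts a value; ${...} inserts the result of an expression.
  val interpolated: String = s"$user has ${count * 2} files"

  // The same string built by repeated concatenation, for comparison.
  val concatenated: String = user + " has " + (count * 2) + " files"

  def main(args: Array[String]): Unit =
    println(interpolated) // Ada has 6 files
}
```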
Correct Answer: B. Applies a partial function to matching elements and returns the results
Explanation:
collect combines filtering and transformation.
Inputs for which the partial function is undefined are skipped.
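The sketch below shows collect with a partial function that is defined only for Int values, next to an equivalent filter-then-map version. The mixed list is a hypothetical input chosen for illustration.

```scala
object CollectSketch {
  // Hypothetical input containing values of several types.
  val mixed: List[Any] = List(1, "two", 3, "four", 5.0)

  // The partial function is defined only for Int inputs;
  // elements of any other type are skipped.
  val doubledInts: List[Int] = mixed.collect { case n: Int => n * 2 }

  // The same result written as a separate filter and map.
  val viaFilterMap: List[Int] =
    mixed.filter(x => x.isInstanceOf[Int]).map(x => x.asInstanceOf[Int] * 2)

  def main(args: Array[String]): Unit =
    println(doubledInts) // List(2, 6)
}
```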
Correct Answer: C. The ResourceManager allocates a container to start its ApplicationMaster
Explanation:
The ApplicationMaster is the first application-specific process launched.
It then requests additional containers for the workload.
Correct Answer: C. Copying application files and resources to the node where a container will run
Explanation:
Executables, libraries, and cached files must be available locally.
The NodeManager prepares these resources before launching the process.
Correct Answer: C. Maps each element to a collection and flattens the results
Explanation:
flatMap is useful when one input can produce zero or many outputs.
Tokenizing lines into words is a common example.
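The tokenizing case can be sketched as follows. The input lines are hypothetical; the empty line shows an input that produces zero outputs.

```scala
object FlatMapSketch {
  // Hypothetical input lines; the second one is empty on purpose.
  val lines: List[String] = List("big data", "", "map reduce shuffle")

  // Each line maps to a list of words, and flatMap flattens those lists.
  // "".split(" ") yields one empty token, so empty tokens are filtered out.
  val words: List[String] =
    lines.flatMap(line => line.split(" ").filter(_.nonEmpty).toList)

  def main(args: Array[String]): Unit =
    println(words) // List(big, data, map, reduce, shuffle)
}
```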
Correct Answer: C. Expressing iteration, filtering, and mapping over collections
Explanation:
For-comprehensions translate into methods such as map, flatMap, and withFilter.
They can produce values using yield.
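As a sketch of that translation, the example below writes the same computation twice: once as a for-comprehension with yield, and once with the method calls it roughly corresponds to. The lists xs and ys are hypothetical.

```scala
object ForComprehensionSketch {
  // Hypothetical inputs used only for illustration.
  val xs: List[Int] = List(1, 2, 3)
  val ys: List[Int] = List(10, 20)

  // Iterate over xs, keep odd values, pair each with every y, and yield the product.
  val viaFor: List[Int] =
    for {
      x <- xs
      if x % 2 == 1
      y <- ys
    } yield x * y

  // Roughly the method chain the comprehension above translates into.
  val viaMethods: List[Int] =
    xs.withFilter(x => x % 2 == 1).flatMap(x => ys.map(y => x * y))

  def main(args: Array[String]): Unit =
    println(viaFor) // List(10, 20, 30, 60)
}
```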
Correct Answer: D. To store very large files reliably across a cluster
Explanation:
HDFS divides files into blocks and distributes them across DataNodes.
Replication provides resilience to node failure.
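To make the block arithmetic concrete, here is a small sketch. It assumes a 128 MiB block size and a replication factor of 3; both are configurable in a real cluster and are used here only as assumed values. The helper names are hypothetical and are not part of any Hadoop API.

```scala
object HdfsBlockSketch {
  // Assumed settings for illustration; real clusters may configure these differently.
  val blockSizeBytes: Long = 128L * 1024 * 1024
  val replication: Int = 3

  // Number of blocks needed for a file (ceiling division; the last block may be partial).
  def blockCount(fileSizeBytes: Long): Long =
    (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes

  // Raw bytes stored across the cluster once every block is replicated.
  def rawStorageBytes(fileSizeBytes: Long): Long =
    fileSizeBytes * replication

  def main(args: Array[String]): Unit = {
    val oneGiB = 1024L * 1024 * 1024
    println(blockCount(oneGiB))      // 8
    println(rawStorageBytes(oneGiB)) // 3221225472
  }
}
```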
Correct Answer: D. A persistent snapshot of the file-system namespace
Explanation:
The fsimage stores namespace metadata at a point in time.
During NameNode startup, it is loaded and the edit-log entries recorded since that snapshot are replayed on top of it to rebuild the current namespace.
Correct Answer: D. Transferring mapper output partitions to the appropriate reducers
Explanation:
Shuffle moves intermediate records across the cluster by partition.
It can be one of the most network-intensive stages.
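The routing step can be sketched in plain Scala. The partitionFor function mirrors the formula used by Hadoop's default hash partitioner (hash code made non-negative, then taken modulo the number of reducers), but it runs on ordinary String keys here, so the partition numbers are illustrative only. The sample records and all names are hypothetical.

```scala
object ShufflePartitionSketch {
  // Non-negative hash modulo the reducer count decides the target partition.
  def partitionFor(key: String, numReducers: Int): Int =
    (key.hashCode & Int.MaxValue) % numReducers

  // Hypothetical mapper output and reducer count.
  val mapperOutput: List[(String, Int)] =
    List(("a", 1), ("b", 1), ("a", 1), ("c", 1))
  val numReducers: Int = 2

  // Simulated shuffle: every record is routed to the reducer that owns its partition,
  // so all records sharing a key end up at the same reducer.
  val byReducer: Map[Int, List[(String, Int)]] =
    mapperOutput.groupBy { case (key, _) => partitionFor(key, numReducers) }

  def main(args: Array[String]): Unit =
    byReducer.toList.sortBy(_._1).foreach(println)
}
```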
Correct Answer: D. When each record can be transformed without grouping across records
Explanation:
Record-by-record work such as filtering or field conversion can be completed entirely in the mappers.
Avoiding reducers eliminates shuffle and sort overhead.
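A plain-Scala sketch of the kind of work that suits a map-only job: each record is parsed, filtered, and converted on its own, with no grouping across records. The record layout ("id,temperatureCelsius") and the names are assumptions made for this example; a real Hadoop job would be configured with zero reduce tasks rather than simulated like this.

```scala
object MapOnlySketch {
  // Hypothetical "id,temperatureCelsius" records; the second one is malformed.
  val records: List[String] = List("s1,20.0", "s2,not-a-number", "s3,35.0")

  // Per-record mapper logic: drop malformed lines, convert Celsius to Fahrenheit.
  // It never needs to see any other record, so no shuffle or reduce step is required.
  def mapRecord(line: String): Option[(String, Double)] =
    line.split(",") match {
      case Array(id, temp) => temp.toDoubleOption.map(c => (id, c * 9 / 5 + 32))
      case _               => None
    }

  val output: List[(String, Double)] = records.flatMap(line => mapRecord(line))

  def main(args: Array[String]): Unit =
    println(output) // List((s1,68.0), (s3,95.0))
}
```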
Correct Answer: D. A join that groups records from multiple datasets by key at reducers
Explanation:
Mappers tag and emit records under the join key.
Reducers receive all matching records and construct joined outputs.
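The tag, group, and join steps can be simulated with ordinary Scala collections. This is a sketch of the idea rather than Hadoop code: the users and orders datasets, the tag classes, and the choice of an inner join are all assumptions made for illustration.

```scala
object ReduceSideJoinSketch {
  // Tags that record which dataset a value came from.
  sealed trait Tagged
  final case class UserRec(name: String) extends Tagged
  final case class OrderRec(item: String) extends Tagged

  // Hypothetical datasets keyed by user id.
  val users: List[(Int, String)] = List((1, "Ada"), (2, "Lin"))
  val orders: List[(Int, String)] = List((1, "disk"), (1, "cable"), (3, "fan"))

  // Map phase: emit every record under the join key, tagged with its source.
  val taggedUsers: List[(Int, Tagged)] =
    users.map { case (id, name) => (id, UserRec(name)) }
  val taggedOrders: List[(Int, Tagged)] =
    orders.map { case (id, item) => (id, OrderRec(item)) }

  // Simulated shuffle: all records with the same key are grouped together.
  val grouped: Map[Int, List[Tagged]] =
    (taggedUsers ++ taggedOrders).groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }

  // Reduce phase: separate the records by tag and pair them up (inner join).
  val joined: List[(Int, String, String)] =
    grouped.toList.sortBy(_._1).flatMap { case (id, recs) =>
      val names = recs.collect { case UserRec(n) => n }
      val items = recs.collect { case OrderRec(i) => i }
      for { n <- names; i <- items } yield (id, n, i)
    }

  def main(args: Array[String]): Unit =
    println(joined) // List((1,Ada,disk), (1,Ada,cable))
}
```

Keys 2 and 3 produce no output here because each appears in only one dataset; an outer join would instead emit those records with a placeholder for the missing side.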