Question
Why is reduceByKey generally preferable to groupByKey for summing values?
Correct Answer: B. It performs map-side aggregation before the shuffle
Explanation:
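reduceByKey combines the values for each key locally within each map-side partition before the shuffle, so at most one partial sum per key leaves each partition.
groupByKey sends every individual value across the network and sums only afterwards, so for aggregations such as sums reduceByKey usually shuffles far less data.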
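
The difference can be illustrated with a small plain-Python simulation. This is only a sketch of the idea, not Spark's implementation: the input data and the helper names (shuffle_group_by_key, shuffle_reduce_by_key, final_sums) are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical input: two map-side partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5), ("b", 6)],
]

def shuffle_group_by_key(parts):
    """groupByKey-style: every record crosses the shuffle unchanged."""
    return [record for part in parts for record in part]

def shuffle_reduce_by_key(parts):
    """reduceByKey-style: sum per key inside each partition first."""
    shuffled = []
    for part in parts:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value
        # Only one partial sum per key leaves this partition.
        shuffled.extend(local.items())
    return shuffled

def final_sums(shuffled):
    """Reduce side: merge whatever arrived over the shuffle."""
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals)

grouped = shuffle_group_by_key(partitions)
reduced = shuffle_reduce_by_key(partitions)

print(len(grouped), len(reduced))               # records shuffled by each approach
print(final_sums(grouped), final_sums(reduced))  # both give the same totals
```

Both approaches yield the same sums, but the pre-aggregated version moves fewer records, and the gap widens as keys repeat more often within a partition. In PySpark the corresponding calls would be rdd.reduceByKey(lambda x, y: x + y) versus rdd.groupByKey().mapValues(sum).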