Question
Why can a Python or Scala UDF be slower than built-in Spark SQL functions?
Correct Answer: A. The optimizer has less visibility into the UDF, and calling it may incur serialization overhead
Explanation:
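Built-in expressions participate fully in Catalyst optimization and whole-stage code generation.
A UDF is opaque to Catalyst, so the optimizer cannot inspect or rewrite its logic, and each call adds data-conversion costs: a Scala UDF converts values between Spark's internal row format and JVM objects, while a Python UDF also serializes data between the JVM and a Python worker process.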
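One way to see the difference is to compare the physical plans of the two approaches. The following is a minimal PySpark sketch, not a benchmark: the names shout and compare_plans and the sample data are hypothetical, and it assumes pyspark is installed and an active SparkSession is passed in.

```python
def shout(s):
    """Plain Python logic that will be wrapped as a UDF (hypothetical example)."""
    return None if s is None else s.upper()


def compare_plans(spark):
    """Print the physical plans for a built-in function and an equivalent UDF.

    Assumes `spark` is an active SparkSession. pyspark is imported here so the
    plain function above stays usable without Spark installed.
    """
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    df = spark.createDataFrame([("alice",), ("bob",), (None,)], ["name"])

    # Built-in: Catalyst knows what upper() does and can optimize around it.
    builtin = df.select(F.upper(F.col("name")).alias("name_upper"))

    # UDF: Catalyst only sees an opaque call that returns a string.
    shout_udf = F.udf(shout, StringType())
    with_udf = df.select(shout_udf(F.col("name")).alias("name_upper"))

    builtin.explain()   # plan contains the built-in upper expression
    with_udf.explain()  # plan typically contains a separate Python evaluation step
    return builtin, with_udf
```

With a SparkSession named spark, calling compare_plans(spark) prints both plans. The UDF plan typically shows an extra Python evaluation node (for example BatchEvalPython), which is where rows leave the JVM; the exact node name can vary by Spark version and configuration.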