Question

Why can a Python or Scala UDF be slower than built-in Spark SQL functions?

Accepted Answer

A. The optimizer has less visibility into the UDF, and calling it may incur serialization overhead.

Explanation: Built-in expressions participate fully in Catalyst optimization and code generation. A UDF is opaque to the optimizer, so optimization is limited at the UDF boundary, and moving data into and out of the UDF adds data-conversion costs.
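
One way to see the difference is to compare the physical plans Spark produces for a built-in function and for an equivalent Python UDF. The following is a minimal sketch, assuming a local PySpark installation; the names `to_upper_py` and `compare_plans`, the sample rows, and the column names are illustrative and not part of the original question.

```python
def to_upper_py(s):
    """Plain Python logic; once wrapped in a UDF, Spark treats it as a black box."""
    return None if s is None else s.upper()


def compare_plans():
    """Print the physical plan for a built-in expression and for an equivalent Python UDF."""
    # Imported here so to_upper_py stays usable without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("udf-vs-builtin")  # hypothetical app name
        .getOrCreate()
    )
    df = spark.createDataFrame([("alice",), ("bob",), (None,)], ["name"])

    # Built-in: Catalyst sees the whole expression and can optimize and code-generate it.
    df.select(F.upper(F.col("name")).alias("name_upper")).explain()

    # Python UDF: the function body is opaque to Catalyst, and rows are
    # serialized to a Python worker process and back.
    upper_udf = F.udf(to_upper_py, StringType())
    df.select(upper_udf(F.col("name")).alias("name_upper")).explain()

    spark.stop()
```

Calling `compare_plans()` prints both plans. In typical Spark versions the UDF plan contains an extra Python evaluation step (often shown as `BatchEvalPython`), while the built-in version stays inside the generated JVM code; the exact plan text varies by version.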

