Question
Why should preprocessing parameters be learned from the training set only?
Correct Answer: D. To prevent information from the test set leaking into model development
Explanation:
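Means, scaling factors, and imputation values estimated from the full dataset carry information about the test set into the model, so the resulting evaluation scores tend to be optimistically biased.
A proper pipeline fits these preprocessing steps on the training data only, then applies the fitted parameters unchanged to the validation and test data.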
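
As an illustration, here is a minimal sketch of the fit-on-train, apply-to-test pattern, using standardization and only the Python standard library. The helper names `fit_scaler` and `transform` and the toy numbers are hypothetical, chosen for this example only; they are not part of the original question.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Estimate standardization parameters from the given values only."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # avoid division by zero for a constant feature
    return mu, sigma

def transform(values, params):
    """Standardize values using previously fitted parameters."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Toy data (assumed for illustration).
train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

# Correct: parameters come from the training set only...
params = fit_scaler(train)
train_scaled = transform(train, params)
# ...and are reused unchanged on the test set.
test_scaled = transform(test, params)

# Leaky (what to avoid): the test values shift the estimated mean and scale.
leaky_params = fit_scaler(train + test)
```

In this sketch the leaky parameters differ from the training-only ones because the test values have influenced them, which is the kind of leakage the answer describes. The same pattern applies to imputation values and any other learned preprocessing step.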