Question
Why must feature selection be performed using training data rather than the entire dataset before splitting?
Correct Answer: A. Using test information to select features would leak information into model development
Explanation:
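Test data must stand in for unseen cases, so it must not influence which variables the model receives. If features are selected using the entire dataset, the selection step has already seen the test rows and their labels: features can be chosen partly because they happen to line up with the test labels by chance. The final evaluation is then optimistically biased, even if the model itself is fit only on the training split.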
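The effect can be illustrated with a small simulation. The sketch below is not part of the original question; it assumes pure-noise features, a simple class-mean-gap filter score, and a nearest-centroid classifier, and all function names (`mean_gap`, `select_top_k`, `centroid_accuracy`, `one_trial`, `compare`) are hypothetical helpers written for this example. Because no feature carries real signal, an honest evaluation should hover around chance, while selecting features on all the data would be expected to push test accuracy above it.

```python
import random

def mean_gap(values, labels):
    """Filter score: absolute gap between the two class means of one feature."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def select_top_k(rows, labels, k):
    """Return indices of the k features with the largest class-mean gap."""
    n_features = len(rows[0])
    scores = [mean_gap([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

def centroid_accuracy(train_rows, train_labels, test_rows, test_labels, features):
    """Fit a nearest-centroid classifier on the training rows, score it on the test rows."""
    centroids = {}
    for c in (0, 1):
        members = [r for r, y in zip(train_rows, train_labels) if y == c]
        centroids[c] = [sum(r[j] for r in members) / len(members) for j in features]
    correct = 0
    for r, y in zip(test_rows, test_labels):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(features, cen))
                for c, cen in centroids.items()}
        correct += min(dist, key=dist.get) == y
    return correct / len(test_rows)

def one_trial(rng, n_samples=60, n_features=500, n_train=40, k=10):
    """One experiment on pure-noise data (assumption: no feature is informative)."""
    rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    labels = [i % 2 for i in range(n_samples)]  # balanced, independent of the features
    train_rows, test_rows = rows[:n_train], rows[n_train:]
    train_labels, test_labels = labels[:n_train], labels[n_train:]

    leaky_features = select_top_k(rows, labels, k)               # sees test rows and labels
    honest_features = select_top_k(train_rows, train_labels, k)  # training data only

    # Both models are fit on the training split; only the selection step differs.
    leaky = centroid_accuracy(train_rows, train_labels, test_rows, test_labels, leaky_features)
    honest = centroid_accuracy(train_rows, train_labels, test_rows, test_labels, honest_features)
    return leaky, honest

def compare(n_trials=30, seed=0):
    """Average test accuracy over several trials for both selection strategies."""
    rng = random.Random(seed)
    results = [one_trial(rng) for _ in range(n_trials)]
    leaky_mean = sum(l for l, _ in results) / n_trials
    honest_mean = sum(h for _, h in results) / n_trials
    return leaky_mean, honest_mean

leaky_mean, honest_mean = compare()
print(f"selection on all data:      mean test accuracy {leaky_mean:.2f}")
print(f"selection on training only: mean test accuracy {honest_mean:.2f}")
```

In practice the same discipline is usually enforced by putting the selection step inside whatever is fit on the training split (or on each cross-validation fold), so that it is never computed from held-out rows.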