A. Gradient descent updates weights randomly without gradients
B. Gradient descent is a file compression algorithm
C. Gradient descent only applies to databases
D. Gradient descent updates parameters in the direction that reduces loss
Correct Answer: D. Gradient descent updates parameters in the direction that reduces loss
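Explanation: the update rule behind answer D can be illustrated with a minimal sketch. The quadratic loss, the starting point, and the function names below are assumptions chosen for illustration and do not come from the quiz.

```python
# Hypothetical one-parameter example: minimize loss(w) = (w - 3)^2.

def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)

def gradient_descent(w0, learning_rate=0.1, steps=50):
    """Repeatedly step against the gradient, recording the loss after each step."""
    w = w0
    history = [loss(w)]
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # move in the direction that reduces loss
        history.append(loss(w))
    return w, history

w_final, losses = gradient_descent(w0=0.0)
```

With these assumed settings the recorded loss shrinks at every step and `w_final` ends close to the minimizer at 3.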
A. Data augmentation means deleting all training images
B. Data augmentation is used only after deployment
C. Data augmentation makes validation unnecessary
D. Data augmentation creates modified training examples to improve robustness
Correct Answer: D. Data augmentation creates modified training examples to improve robustness
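Explanation: one simple form of augmentation is a horizontal flip. The sketch below assumes images are stored as nested lists of pixel values and that flipping does not change the label; both are illustrative assumptions, and `horizontal_flip` and `augment` are hypothetical helper names.

```python
def horizontal_flip(image):
    """Return a mirrored copy of an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

def augment(dataset):
    """Keep each (image, label) pair and add a flipped copy with the same label."""
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((horizontal_flip(image), label))
    return augmented

training_set = [([[1, 2], [3, 4]], "cat")]
augmented_set = augment(training_set)
```

The augmented set is twice as large as the original, and only the training split would normally be treated this way.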
A. A test set is used to update weights every epoch
B. A test set should be identical to the training set
C. A test set is used for data normalization only
D. A test set estimates performance on unseen data
Correct Answer: D. A test set estimates performance on unseen data
A. The learning rate controls the step size of parameter updates
B. The learning rate is the number of output classes
C. The learning rate is the dataset size
D. The learning rate disables optimization
Correct Answer: A. The learning rate controls the step size of parameter updates
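Explanation: the effect of the step size can be seen by running the same update with different learning rates. The loss (w - 3)^2 and the three rates below are assumptions picked for illustration.

```python
def run(learning_rate, steps=20, w0=0.0):
    """Gradient descent on the assumed loss (w - 3)^2, whose gradient is 2 * (w - 3)."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2.0 * (w - 3.0)
    return w

small = run(0.01)      # tiny steps: moves toward 3, but slowly
moderate = run(0.1)    # larger steps: ends much closer to 3
too_large = run(1.1)   # each step overshoots the minimum, so the error grows
```

In this sketch a rate that is too small wastes steps, while a rate that is too large makes the iterates diverge.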
A. Data normalization scales or centers data to support stable training
B. Data normalization changes labels into random values
C. Data normalization always increases image size
D. Data normalization is the same as model pruning
Correct Answer: A. Data normalization scales or centers data to support stable training
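Explanation: two common ways to scale or center a feature are standardization and min-max scaling. The sketch below uses only the Python standard library; the helper names and sample values are illustrative assumptions.

```python
import statistics

def standardize(values):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mean) / std for v in values]

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

raw = [10.0, 20.0, 30.0, 40.0]
z_scores = standardize(raw)
scaled = min_max_scale(raw)
```

In practice the mean, standard deviation, minimum, and maximum are computed on the training split and then reused for validation and test data.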
A. A validation set supports model selection and hyperparameter tuning
B. A validation set is always larger than the training set
C. A validation set replaces the loss function
D. A validation set stores GPU kernels
Correct Answer: A. A validation set supports model selection and hyperparameter tuning
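Explanation: a validation set is usually carved out alongside the training and test sets. The sketch below shows one possible shuffled split; the 70/15/15 proportions, the seed, and the `split_dataset` name are assumptions, not something the quiz prescribes.

```python
import random

def split_dataset(examples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut the data into disjoint train / validation / test parts."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_fraction)
    n_val = round(len(items) * val_fraction)
    test = items[:n_test]                    # held out for the final estimate
    val = items[n_test:n_test + n_val]       # used to compare models and hyperparameters
    train = items[n_test + n_val:]           # used to fit parameters
    return train, val, test

train, val, test = split_dataset(range(100))
```

Hyperparameters are chosen by comparing scores on `val`; `test` is touched only once, after those choices are fixed.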
A. Convolution removes all spatial structure
B. Convolution applies learnable filters to local regions of input data
C. Convolution always sorts neurons by value
D. Convolution is used only for text tokenization
Correct Answer: B. Convolution applies learnable filters to local regions of input data
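Explanation: the sliding-filter idea in answer B can be sketched in pure Python. The kernel here is fixed by hand for illustration, whereas in a network its values would be learned. As in most deep learning libraries, the kernel is not flipped, so this is technically cross-correlation. The image, the kernel, and the `conv2d_valid` name are assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over every position where it fits fully inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Weighted sum over the local region under the kernel.
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# A dark-to-bright vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1.0, 1.0]]  # responds where brightness increases from left to right
feature_map = conv2d_valid(image, kernel)
```

The resulting feature map is nonzero only at the edge, which shows how one small filter reused at every position preserves spatial structure.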
A. Dropout is a file handling operation
B. Dropout randomly disables units during training to reduce co-adaptation
C. Dropout permanently deletes neurons during inference
D. Dropout always increases overfitting
Correct Answer: B. Dropout randomly disables units during training to reduce co-adaptation
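Explanation: a minimal sketch of "inverted" dropout, one common formulation in which surviving activations are rescaled during training so that nothing needs to change at inference. The function name, the default drop probability, and the seeded generator are illustrative assumptions.

```python
import random

def dropout(activations, drop_prob=0.5, training=True, rng=None):
    """Zero each activation with probability drop_prob while training; pass through otherwise."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)  # inference: every unit stays active
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    # Survivors are divided by keep_prob so the expected activation is unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

units = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(units, drop_prob=0.5, training=True, rng=random.Random(0))
eval_out = dropout(units, drop_prob=0.5, training=False)
```

Units are only masked temporarily during training, which is why option C (permanent deletion at inference) is wrong.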
A. Generalization can only be measured on training loss
B. Generalization is the ability to perform well on unseen examples
C. Generalization means memorizing every training sample
D. Generalization is unrelated to overfitting
Correct Answer: B. Generalization is the ability to perform well on unseen examples
A. Pooling is the same as dropout
B. Pooling changes class labels
C. Pooling reduces spatial dimensions while keeping important features
D. Pooling increases image resolution in every layer
Correct Answer: C. Pooling reduces spatial dimensions while keeping important features
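Explanation: max pooling is one common pooling variant. The sketch below assumes non-overlapping 2x2 windows and a feature map stored as nested lists; the values and the `max_pool2d` name are made up for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(feature_map)  # 4x4 input becomes 2x2: [[4, 2], [2, 8]]
```

Each output value summarizes one window by its strongest response, so height and width are halved while the dominant features remain.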
A. Batch normalization replaces the optimizer entirely
B. Batch normalization is only used for text spelling correction
C. Batch normalization normalizes intermediate activations to stabilize and speed up training
D. Batch normalization stores batches in a database
Correct Answer: C. Batch normalization normalizes intermediate activations to stabilize and speed up training
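Explanation: a minimal sketch of the training-time computation for a single feature across one mini-batch. The parameter names `gamma`, `beta`, and `eps` follow common convention but are assumptions here, as are the sample activations. At inference, running averages of the batch statistics are typically used instead.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then apply a scale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    # eps keeps the division stable when the batch variance is near zero.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)
```

With the default `gamma` and `beta`, the output has roughly zero mean and unit variance; in a real layer both parameters would be learned.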
A. Overfitting always improves test accuracy
B. Overfitting occurs only in linear regression
C. Overfitting occurs when a model learns noise or specific training patterns too strongly
D. Overfitting means the model is too simple for the data
Correct Answer: C. Overfitting occurs when a model learns noise or specific training patterns too strongly