PS#5

School: California Lutheran University

Course: IDS575

Subject: Economics

Date: Apr 3, 2024

Type: pdf

Pages: 11

Q1 Model Selection (30 Points)

Q1.1 (4 Points)
You will perform cross-validation on a dataset with 100 examples. Your goal is to robustly measure the validation error to approximate the out-of-sample performance of your model. Using 10-fold cross-validation, you need to compute the validation error N1 times. To compute each individual error, you train your model on a training set of size N2 and test the model on a validation set of size N3. Your final validation error is the average of the N1 individual validation errors. What are the appropriate numbers for N1, N2, and N3?

- N1 = 10, N2 = 90, N3 = 10
- N1 = 1, N2 = 90, N3 = 10
- N1 = 10, N2 = 10, N3 = 90
- N1 = 1, N2 = 10, N3 = 90
- N1 = 10, N2 = 100, N3 = 100
- N1 = 1, N2 = 100, N3 = 100

Q1.2 (4 Points)
Which of the following cross-validation methods may not be suitable for a very large dataset with hundreds of thousands of examples?

- k-fold cross-validation
- Leave-one-out cross-validation
- Holdout validation
- All of the above

Q1.3 (4 Points)
To use holdout validation for your classification problem, you are going to randomly split your supervised dataset into training, validation, and test partitions. Assume your dataset is sufficiently large. Select all correct statements:

- Some partitions may consist of substantially more difficult- or easier-to-predict cases.
- Some partitions may contain a larger or smaller proportion among the different label classes.
- Training performance could decrease because subsets of the data are held out for validation and test.
- Measuring out-of-sample performance could become less accurate due to the random split.

Q1.4 (4 Points)
Now you will run 10-fold cross-validation to train k-NN. For each candidate value of k, you train k-NN on all of the dataset except one of the 10 folds, then measure an approximate validation error on the examples in that held-out fold. When you have 5 different candidate k values from which to choose the best model, you will train k-NN a total of N1 times. The performance of each individual model (with a specific k value) will be evaluated by the mean of N2 validation errors.

- N1 = 10, N2 = 5
- N1 = 5, N2 = 10
- N1 = 45, N2 = 10
- N1 = 50, N2 = 45
- N1 = 50, N2 = 10
- N1 = 10, N2 = 50

Q1.5 (4 Points)
To report and launch your prediction system, you now choose the final model (with the best-performing k) given the results of the 10-fold cross-validation in Q1.4. Choose the best convention for producing the final validation error and the final decision boundary.

- Pick the k with the lowest mean validation error as the best model; report that lowest mean validation error as the final validation error; launch the decision boundary trained for k-NN as it is.
- Pick the k closest to the weighted average of the 5 candidate k values (where the weights are given by each k-NN's accuracy); report the weighted average of the mean validation errors as the final validation error; launch the decision boundary trained for k-NN as it is.
- Pick the k with the lowest mean validation error as the best model; report that lowest mean validation error as the final validation error; launch the decision boundary obtained by retraining k-NN on the entire dataset (including all 10 folds).

Q1.6 (5 Points)
For spam classification, you have 100 emails in the validation data. When your trained hypothesis ĥ classifies 90 emails with their correct labels, what interval does the true error Err_p(ĥ) lie in with 95% confidence?

- (0.0412, 0.1588)

Q1.7 (5 Points)
Assume you evenly split the data D into 3 disjoint subsets D1, D2, and D3. You run cross-validation to determine the better model between M1 and M2, and you obtain the following test errors: Err_D1(ĥ_M1) = 0.32, Err_D2(ĥ_M1) = 0.41, Err_D3(ĥ_M1) = 0.15, Err_D1(ĥ_M2) = 0.24, Err_D2(ĥ_M2) = 0.38, Err_D3(ĥ_M2) = 0.17. Which model should we pick?

- M1
- M2
- Indifferent
- M2 could be better, but may not be significantly so.

Q2 Model Assessment (70 Points)

Q2.1 (5 Points)
Choose all correct ones:
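The numbers in Q1.6 and Q1.7 can be checked directly. For Q1.6, the standard normal-approximation confidence interval for a classification error is p̂ ± z·sqrt(p̂(1−p̂)/n); for Q1.7, a common convention is to compare the mean of the three per-fold errors. A minimal sketch (the helper name is ours, not from the problem set):

```python
import math

def error_confidence_interval(n_wrong, n_total, z=1.96):
    # Normal-approximation interval for the true error rate:
    # p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
    p_hat = n_wrong / n_total
    se = math.sqrt(p_hat * (1 - p_hat) / n_total)
    return (p_hat - z * se, p_hat + z * se)

# Q1.6: 90 of 100 validation emails correct -> 10 errors.
lo, hi = error_confidence_interval(10, 100)
print(round(lo, 4), round(hi, 4))  # matches the listed interval (0.0412, 0.1588)

# Q1.7: mean test error of each model across the three disjoint subsets.
errs_m1 = [0.32, 0.41, 0.15]
errs_m2 = [0.24, 0.38, 0.17]
mean_m1 = sum(errs_m1) / len(errs_m1)  # about 0.293
mean_m2 = sum(errs_m2) / len(errs_m2)  # about 0.263
# M2 has the lower mean error, but with only 3 folds the gap is small,
# so the difference may not be statistically significant.
```

Note that the interval uses the error rate p̂ = 0.1 (not the accuracy 0.9), with z = 1.96 for 95% confidence.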
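As a footnote on the fold arithmetic in Q1.1 and Q1.4, the counts follow mechanically from the k-fold convention: each of the 10 folds serves once as validation data while the remaining 9 folds are used for training, and the whole procedure repeats once per candidate model. A minimal sketch (the candidate k values are illustrative, not from the problem set):

```python
n_examples = 100   # dataset size from Q1.1
n_folds = 10       # 10-fold cross-validation
candidate_ks = [1, 3, 5, 7, 9]  # any 5 candidate k values for k-NN (Q1.4)

fold_size = n_examples // n_folds        # validation size per split
train_size = n_examples - fold_size      # training size per split
errors_per_model = n_folds               # one validation error per held-out fold
total_trainings = len(candidate_ks) * n_folds  # one training per (candidate, fold) pair

print(fold_size, train_size, errors_per_model, total_trainings)  # 10 90 10 50
```

Under this convention each individual model is scored by the mean of 10 per-fold validation errors, and 5 candidates times 10 folds gives 50 trainings overall.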