Regularization 2

Cross-Validation (CV)
How can we evaluate the model with new data? We can mimic an out-of-sample (OOS) experiment to select the best model using cross-validation. Split the dataset into K evenly sized folds. For each fold, repeat the following steps:
a. Use the other K-1 folds as the training dataset and fit the model.
b. Hold out the remaining fold as the out-of-sample (OOS) set and evaluate the fitted model on it.
Repeat this K times, once per fold.
[Figure: the dataset split into 5 equal bins, each bin held out once as the OOS fold.]
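As a sketch of this fold structure, the snippet below (Python with NumPy; the names n_obs and K and the sizes are illustrative, not from the slides) assigns observations at random to K folds and loops over them, holding each fold out once:

```python
# Minimal sketch of the K-fold split described above, using only NumPy.
import numpy as np

n_obs, K = 100, 5                       # 100 observations, 5 folds (as in the figure)
rng = np.random.default_rng(0)

# Randomly assign each observation to one of K equal-sized folds.
fold_ids = rng.permutation(np.repeat(np.arange(K), n_obs // K))

for k in range(K):
    oos_idx = np.where(fold_ids == k)[0]     # the held-out (OOS) fold
    train_idx = np.where(fold_ids != k)[0]   # the remaining K-1 folds used for fitting
    # ... fit the model on train_idx, evaluate its predictions on oos_idx ...
    print(f"fold {k}: {len(train_idx)} training obs, {len(oos_idx)} OOS obs")
```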
Cross-validation
ALGORITHM: K-fold Cross-Validation
Given a dataset of n observations and M candidate models (or algorithms):
1. Split the data into K roughly evenly sized, nonoverlapping random subsets (folds).
2. For k = 1 ... K: fit the parameters for each candidate model/algorithm using all but the kth fold of data, and record deviance (or, equivalently, R²) on the left-out kth fold based on predictions from each model.
This yields a set of K OOS deviances for each of your candidate models. This sample is an estimate of the distribution of each model's predictive performance on new data, and you can select the model with the best OOS performance.
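A minimal sketch of this algorithm, assuming scikit-learn is available; the three candidate models, the synthetic data, and the use of OOS R² as the recorded metric are illustrative choices, not specified by the slides:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.metrics import r2_score

# Synthetic data, only for the sketch.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)

# M = 3 candidate models (illustrative).
candidates = {"ols": LinearRegression(), "lasso": Lasso(alpha=0.1), "ridge": Ridge(alpha=1.0)}
kf = KFold(n_splits=5, shuffle=True, random_state=0)

oos_r2 = {name: [] for name in candidates}
for train_idx, test_idx in kf.split(X):
    for name, model in candidates.items():
        model.fit(X[train_idx], y[train_idx])             # fit on the K-1 training folds
        pred = model.predict(X[test_idx])                  # predict on the left-out fold
        oos_r2[name].append(r2_score(y[test_idx], pred))   # record OOS R²

# Select the candidate with the best average OOS performance.
best = max(oos_r2, key=lambda name: np.mean(oos_r2[name]))
print({name: round(np.mean(v), 3) for name, v in oos_r2.items()}, "->", best)
```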
CV for Lasso
Rather than relying on information criteria (IC), we can actually run an OOS experiment. For Lasso paths, you want to design a CV experiment to evaluate the OOS predictive performance of different λ penalty values. To run the CV algorithm:
1. Fit the Lasso path on the full dataset.
2. Run a CV experiment: split your data into K folds and, for each fold, apply the λ_t penalties in Lasso estimation on the training data that excludes that fold.
3. Record OOS deviances for prediction on each left-out fold.
4. Select the λ_t with the "best" OOS performance.
Your selected model is defined by the corresponding coefficients obtained through Lasso estimation on the full dataset with penalty λ_t.
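One way to sketch this procedure is with scikit-learn's LassoCV, which builds the penalty grid (called alpha rather than λ), runs the K-fold OOS experiment over that grid, and refits on the full dataset at the selected penalty; the synthetic sparse-signal data below are purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic data with a sparse true signal, only for illustration.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 20))
beta = np.zeros(20)
beta[:3] = [1.5, -2.0, 0.7]
y = X @ beta + rng.normal(size=200)

# cv=5 runs a 5-fold CV experiment over the automatically chosen alpha (λ) path,
# then the final coefficients come from a full-data fit at the selected penalty.
model = LassoCV(cv=5, random_state=0).fit(X, y)

print("selected penalty (alpha):", model.alpha_)
print("nonzero coefficients:", np.flatnonzero(model.coef_))
```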
CV for Lasso: How many folds?
A common question around CV is "How do I choose K?" More folds reduce Monte Carlo variation, which we want. However, using too many folds gets computationally very expensive, and anything approaching K = n gives bad results if there is even a tiny amount of dependence between your observations. Smaller values of K lead to CV that is more robust to this type of misspecification.
CV for Lasso: How many folds?
If you run your CV experiment and the uncertainty around the average OOS deviance is larger than you want, you can re-run the experiment with more folds. However, if adding a small number of folds doesn't significantly reduce the uncertainty, then you are probably better off using the AICc for model selection.
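As a sketch of this check, the snippet below computes the per-fold OOS mean squared error (proportional to the Gaussian deviance) and the standard error of its mean for two values of K; the Lasso penalty of 0.05 and the synthetic data are illustrative assumptions, not values from the slides:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Lasso

# Synthetic data, only for illustration.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 15))
y = X[:, 0] - X[:, 2] + rng.normal(size=300)

def oos_mse(K):
    """Per-fold OOS mean squared error for a fixed (illustrative) Lasso penalty."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=K, shuffle=True, random_state=0).split(X):
        fit = Lasso(alpha=0.05).fit(X[train_idx], y[train_idx])
        resid = y[test_idx] - fit.predict(X[test_idx])
        scores.append(np.mean(resid ** 2))
    return np.array(scores)

for K in (5, 10):
    s = oos_mse(K)
    # Standard error of the mean OOS error across the K folds: if this stays
    # large as K grows, the slide suggests falling back on AICc.
    print(f"K={K}: mean OOS MSE={s.mean():.3f}, std err={s.std(ddof=1) / np.sqrt(K):.3f}")
```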