Weekly Quiz - ML Pipeline and Hyperparameter Tuning

School

University of Texas

Course

DSBA

Subject

Industrial Engineering

Date

Feb 20, 2024

Uploaded by BrigadierRainCat57

Q No: 1 (Correct Answer) Marks: 1/1
Which of the following is the correct way to define a pipeline object using the sklearn Pipeline() function if the steps are defined as
steps = [('scaler', MinMaxScaler()), ('model', LogisticRegression())]

Pipeline(steps) (You Selected)

Pipeline() is called like a function, so the steps must be passed to it as an argument. It takes a list of (name, estimator) tuples. Hence, Pipeline(steps) is the correct answer.

Q No: 2 (Correct Answer) Marks: 1/1
While tuning hyperparameters, the data should be split into three parts (train, validation, and test) to avoid data leakage.

Data leakage happens when part of the data meant for evaluation has already been seen during training. That is why it is always advised to keep the test dataset aside and use it only for the final evaluation. For example, if we impute the missing values using the entire dataset and only then split it into train and test sets, information from the test rows leaks into the training process. Regularization is used to deal with overfitting, not leakage. Hence, the best measure to avoid data leakage is to split the data into three sets.
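The two answers above can be sketched together in a few lines. This is only an illustration under stated assumptions: the dataset is synthetic (make_classification), and the split sizes, random_state values, and variable names are choices made for this example, not part of the quiz.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic data, used only for illustration (an assumption, not quiz data).
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Hold out the test set first, then carve a validation set out of the rest.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)

# The list of (name, estimator) tuples is passed straight to Pipeline().
steps = [('scaler', MinMaxScaler()), ('model', LogisticRegression())]
pipe = Pipeline(steps)

# fit() learns the scaler's min/max from the training rows only, so nothing
# from the validation or test rows reaches the preprocessing step.
pipe.fit(X_train, y_train)

val_accuracy = pipe.score(X_val, y_val)    # used while tuning
test_accuracy = pipe.score(X_test, y_test)  # used once, for the final check
```

Keeping the scaler inside the pipeline is what prevents the preprocessing step from seeing the held-out rows: the scaler is refit on whatever data is passed to fit().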

Q No: 3 (Correct Answer) Marks: 2/2
Which of the following statements are true about Randomized search CV?
1. It evaluates all the possible combinations available in the grid
2. The number of parameter settings that are tried is given by n_iter
3. Only a fixed number of hyperparameter values are tried out from the provided parameter grid

2 and 3 (You Selected)

Randomized search CV tries random combinations, so not all the possible combinations are tried out. It tries a number of random combinations given by the n_iter value, and hence the execution is comparatively faster than grid search. Hence, options 2 and 3 are correct.
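As a rough check of the role of n_iter, the sketch below runs RandomizedSearchCV over a small grid and counts how many settings were actually evaluated. The estimator (DecisionTreeClassifier), the grid values, and the synthetic data are assumptions chosen for this example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# 4 * 3 = 12 possible combinations in this hypothetical grid.
param_grid = {
    'max_depth': [2, 3, 4, 5],
    'min_samples_leaf': [1, 2, 4],
}

search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions=param_grid,
    n_iter=5,          # only 5 of the 12 combinations are tried
    cv=3,
    random_state=0,
)
search.fit(X, y)

# One entry per parameter setting that was evaluated.
n_tried = len(search.cv_results_['params'])
```

A grid search over the same dictionary would evaluate all 12 combinations, which is where the speed difference comes from.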

Q No: 4 (Incorrect Answer) Marks: 0/1
Which of the following is NOT a hyperparameter for their corresponding model?

... a Linear Regression model (Correct Option)

A hyperparameter is a value that is set before training and controls the learning process. The number of estimators, the depth of the tree, and the shrinkage factor can all be controlled while tuning the model, but weights are values that are optimized (learned) during training. We tune the hyperparameters to get the optimal weights.

Q No: 5 (Correct Answer) Marks: 1/1
RandomizedSearchCV always uses sampling with replacement to pick the parameters from the parameter list.

If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used. Hence, the statement is false.
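Both explanations above can be illustrated with a short sketch. The first part fits a LinearRegression on made-up data to show that the weight is learned and not set by hand. The second part uses sklearn's ParameterSampler to contrast list-based and distribution-based sampling. The data, parameter names, and n_iter values are assumptions for this example.

```python
import numpy as np
from scipy.stats import randint
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ParameterSampler

# Part 1: weights are learned during fit(), not chosen by the user.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0            # made-up data with slope 2, intercept 1
lin = LinearRegression().fit(X, y)
learned_weight = lin.coef_[0]        # recovered from the data

# Part 2a: every parameter is a list, so sampling is without replacement.
list_grid = {'max_depth': [2, 3, 4], 'criterion': ['gini', 'entropy']}
list_samples = list(ParameterSampler(list_grid, n_iter=6, random_state=0))
distinct_list = {tuple(sorted(p.items())) for p in list_samples}

# Part 2b: a distribution is present, so sampling is with replacement.
# randint(1, 3) can only yield 1 or 2, so 10 draws must contain repeats.
dist_grid = {'max_depth': randint(1, 3)}
dist_samples = list(ParameterSampler(dist_grid, n_iter=10, random_state=0))
distinct_dist = {int(p['max_depth']) for p in dist_samples}
```

With the list-only grid, the 6 draws cover all 6 combinations exactly once. With the distribution, 10 draws are returned even though only two values are possible.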
