CS484_IML_Assignment_4_Answer_Key.docx

School: Illinois Institute of Technology
Course: 584
Subject: Computer Science
Date: Dec 6, 2023

CS 484: Introduction to Machine Learning
Fall Semester 2023
Assignment 4 Answer Key

Question 1 (50 points)

The Homeowner_Claim_History.xlsx contains the claim history of 27,513 homeowner policies. The following table describes the eleven columns in the HOCLAIMDATA sheet.

Name                  Description                                                         Categories
policy                Policy Identifier
exposure              Duration a Policy is Exposed to Risk Measured in Portion of a Year
num_claims            Number of Claims in a Year
amt_claims            Total Claim Amount in a Year
f_primary_age_tier    Age Tier of Primary Insured                                         < 21, 21 - 27, 28 - 37, 38 - 60, > 60
f_primary_gender      Gender of Primary Insured                                           Female, Male
f_marital             Marital Status of Primary Insured                                   Not Married, Married, Un-Married
f_residence_location  Location of Residence Property                                      Urban, Suburban, Rural
f_fire_alarm_type     Fire Alarm Type                                                     None, Standalone, Alarm Service
f_mile_fire_station   Distance to Nearest Fire Station                                    < 1 mile, 1 - 5 miles, 6 - 10 miles, > 10 miles
f_aoi_tier            Amount of Insurance Tier                                            < 100K, 100K - 350K, 351K - 600K, 601K - 1M, > 1M

We want to predict the Frequency, which is the number of claims per unit of exposure, using the above features. We first divide the reported number of claims by the exposure. This gives the Frequency. Next, we put the policies into five groups according to their Frequency values.

Frequency Group  Frequency Value
0                Frequency = 0
1                0 < Frequency <= 1
2                1 < Frequency <= 2
3                2 < Frequency <= 3
4                3 < Frequency

We will use the above Frequency Group as our target variable, which has five levels. After dropping the missing target values, we will divide the observations into the training and the testing partitions. Observations whose Policy Identifier starts with the letter A, G, or P will go to the training partition. The remaining observations go to the testing partition.
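The grouping and partitioning rules above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the answer key's own code: the function names `frequency_group` and `partition` are hypothetical, a missing or non-positive exposure is assumed to give a missing target, and the first-letter test is assumed to be case-sensitive.

```python
import math

TRAINING_LETTERS = {"A", "G", "P"}  # first letters named in the assignment text


def frequency_group(num_claims, exposure):
    """Map a policy to Frequency Group 0-4, or None when the target is missing.

    Assumption: a missing or non-positive exposure makes Frequency undefined.
    """
    if num_claims is None or exposure is None or exposure <= 0:
        return None
    frequency = num_claims / exposure
    if math.isnan(frequency):
        return None
    if frequency == 0:
        return 0
    # For k = 1, 2, 3 the interval k-1 < Frequency <= k has ceiling k;
    # anything above 3 falls into the open-ended group 4.
    return min(math.ceil(frequency), 4)


def partition(policy_id):
    """Return 'Training' when the identifier starts with A, G, or P, else 'Testing'."""
    return "Training" if policy_id[:1] in TRAINING_LETTERS else "Testing"
```

For example, one claim over half a year of exposure gives Frequency = 2, which belongs to group 2 because that group's interval is closed on the right.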

Since we have sufficient computing resources, we will train multinomial logistic models for all possible subsets of the seven categorical predictors, namely, f_aoi_tier, f_fire_alarm_type, f_marital, f_mile_fire_station, f_primary_age_tier, f_primary_gender, and f_residence_location. All models must include the Intercept term. To help us select our “optimal” model, we will calculate the AIC and the BIC criteria on the Training partition, the Accuracy on the Testing partition, and the Root Average Squared Error on the Testing partition.

The string predictor f_fire_alarm_type contains the word “None”. Unfortunately, pandas reads it as NaN. Therefore, we need to call the fillna() function to replace the NaN values with the word “None”.

(a) (10 points) How many policies are in each of the five groups in the Training partition? Also, in the Testing partition?

Partition           Training  Testing
Number of Policies  20,661    6,852

(b) (10 points) What is the lowest AIC value on the Training partition? Also, which model produces that AIC value?

The lowest AIC value on the Training partition is 47836.1176. The model that produces this AIC value is: Intercept + f_aoi_tier + f_fire_alarm_type + f_mile_fire_station + f_primary_age_tier + f_residence_location.

(c) (10 points) What is the lowest BIC value on the Training partition? Also, which model produces that BIC value?

The lowest BIC value on the Training partition is 48344.0218. The model that produces this BIC value is: Intercept + f_aoi_tier + f_fire_alarm_type + f_mile_fire_station + f_primary_age_tier + f_residence_location.

(d) (10 points) What is the highest Accuracy value on the Testing partition? Also, which model produces that Accuracy value?

The highest Accuracy value on the Testing partition is 0.5641. The model that produces this Accuracy value is: Intercept + f_aoi_tier + f_fire_alarm_type + f_marital + f_mile_fire_station + f_primary_age_tier.

(e) (10 points) What is the lowest Root Average Squared Error value on the Testing partition? Also, which model produces that RASE value?

The lowest Root Average Squared Error value on the Testing partition is 0.3430. The model that produces this RASE value is: Intercept + f_aoi_tier + f_fire_alarm_type + f_marital + f_mile_fire_station + f_primary_age_tier + f_residence_location.
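The four selection criteria used in Question 1 can be sketched as follows. This is an illustrative sketch, not the answer key's code. All function names are hypothetical. The category counts are read from the column table in Question 1 and assume that every listed category occurs in the training partition. The parameter count assumes reference-level dummy coding. The source does not state its RASE formula, so the version here, which averages squared errors over every observation-by-category cell, is one common definition and may differ from the one used to obtain 0.3430.

```python
import math
from itertools import combinations

# Number of categories per predictor, taken from the Question 1 column table.
PREDICTOR_LEVELS = {
    "f_aoi_tier": 5,
    "f_fire_alarm_type": 3,
    "f_marital": 3,
    "f_mile_fire_station": 4,
    "f_primary_age_tier": 5,
    "f_primary_gender": 2,
    "f_residence_location": 3,
}


def all_subsets(predictors):
    """Yield every subset of the predictors, including the empty (Intercept-only) one."""
    names = list(predictors)
    for size in range(len(names) + 1):
        yield from combinations(names, size)


def n_free_parameters(subset, n_target_levels):
    """Free parameters of a multinomial logit with reference-level coding (assumed)."""
    per_equation = 1 + sum(PREDICTOR_LEVELS[name] - 1 for name in subset)
    return (n_target_levels - 1) * per_equation


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 log L + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def accuracy(y_true, prob_rows):
    """Share of observations whose highest-probability category is the observed one."""
    hits = 0
    for y, probs in zip(y_true, prob_rows):
        predicted = max(range(len(probs)), key=lambda j: probs[j])
        hits += predicted == y
    return hits / len(y_true)


def rase(y_true, prob_rows):
    """Root average squared error over all observation-by-category cells (assumed form)."""
    total, cells = 0.0, 0
    for y, probs in zip(y_true, prob_rows):
        for j, p in enumerate(probs):
            indicator = 1.0 if j == y else 0.0
            total += (indicator - p) ** 2
            cells += 1
    return math.sqrt(total / cells)
```

Seven predictors give 2^7 = 128 candidate models. As a consistency check on these assumptions, the five-predictor model reported in (b) and (c) has 4 x (1 + 4 + 2 + 3 + 4 + 2) = 64 free parameters when the target has the five Frequency Groups 0 through 4. Since BIC - AIC = k (ln n - 2), this gives 64 x (ln 20661 - 2), about 507.90, which agrees with the reported difference 48344.0218 - 47836.1176 = 507.9042.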

Question 2 (50 points)

The Center for Machine Learning and Intelligent Systems at the University of California, Irvine manages the Machine Learning Repository ( https://archive.ics.uci.edu/ml/index.php ). We will use two of the datasets in the repository for analyses, namely, the WineQuality_Train.csv for training and the WineQuality_Test.csv for testing.

The categorical target variable is quality_grp. It has two categories, namely, 0 and 1. The input features are alcohol, citric_acid, free_sulfur_dioxide, residual_sugar, and sulphates. These five input features are considered interval variables.

We will train a Multi-Layer Perceptron neural network with the following specifications.

1. Perform a grid search to select the most desired network structure.
2. The maximum number of iterations is 10000.
3. The random seed is 2023484.
4. Try each of the Hyperbolic Tangent, the Identity, and the Linear Rectifier activation functions.
5. Try the number of layers from 1 to 10 inclusively with an increment of 1.
6. Try the common number of neurons per layer from 2 to 10 inclusively with an increment of 2.

We will predict an observation with quality_grp of 1 if Prob(quality_grp = 1) ≥ 1.5c, where c is the proportion of observations where quality_grp = 1 in the training partition. Otherwise, the predicted quality_grp is 0.

(a) (10 points) What is the proportion of observations where quality_grp = 1 in the training partition?

The proportion of observations where quality_grp = 1 in the training partition is 0.1962.

(b) (10 points) What is the proportion of observations where quality_grp = 1 in the testing partition?

The proportion of observations where quality_grp = 1 in the testing partition is 0.1974.

(c) (10 points) Show your grid search results in a table.

The table should contain (1) the activation function type, (2) the number of layers, (3) the common number of neurons per layer, (4) the number of iterations performed (n_iter_ attribute), (5) the best loss value (best_loss_ attribute), (6) the root average squared error of the testing partition, (7) the misclassification rate of the testing partition, and (8) the elapsed time in seconds.
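The grid and the prediction rule described above can be sketched without fitting any network. This is a hedged sketch, not the answer key's code: the helper names `grid`, `predict_group`, and `misclassification_rate` are hypothetical. The keys `activation`, `hidden_layer_sizes`, `max_iter`, and `random_state`, and the values `tanh`, `identity`, and `relu`, are chosen on the assumption that the network is scikit-learn's `MLPClassifier`, which the `n_iter_` and `best_loss_` attributes suggest but the source does not state outright.

```python
from itertools import product

# Assignment wording mapped to the assumed scikit-learn activation names.
ACTIVATIONS = {
    "Hyperbolic Tangent": "tanh",
    "Identity": "identity",
    "Linear Rectifier": "relu",
}
LAYER_COUNTS = range(1, 11)      # 1 to 10 layers, step 1
NEURON_COUNTS = range(2, 11, 2)  # 2 to 10 neurons per layer, step 2


def grid():
    """Yield one settings dictionary per network structure in the grid."""
    for (label, name), n_layers, n_neurons in product(
        ACTIVATIONS.items(), LAYER_COUNTS, NEURON_COUNTS
    ):
        yield {
            "activation_label": label,
            "activation": name,
            # The same number of neurons is repeated for every hidden layer.
            "hidden_layer_sizes": (n_neurons,) * n_layers,
            "max_iter": 10000,
            "random_state": 2023484,
        }


def predict_group(prob_event, c, multiplier=1.5):
    """Predict 1 when Prob(quality_grp = 1) >= multiplier * c, else 0."""
    threshold = multiplier * c
    return [1 if p >= threshold else 0 for p in prob_event]


def misclassification_rate(y_true, y_pred):
    """Share of observations whose predicted group differs from the observed one."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)
```

The grid contains 3 x 10 x 5 = 150 structures. With the training proportion c = 0.1962 from part (a), the threshold is 1.5 x 0.1962 = 0.2943, so a predicted probability of 0.29 maps to group 0 and 0.30 maps to group 1.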
