
Predictive Analysis Method


Dividing it by home price was an attempt to account for the cost of living in each state. Other than tedious formatting, I did not have any other challenges.

Analysis Method

The method I chose to analyze my data with was SAS Enterprise Guide’s Rapid Predictive Modeler. This method produces predictive models quickly and accurately and provides easy-to-understand graphs, charts, and reports. The Modeler looks at the data, tries different predictive models, and makes a final choice on which model is best. It also automatically handles outliers, missing values, rare target events, and skewness, and it selects the variables that are most important for the model it chooses in order to provide the best results. This …

The model chosen was a Decision Tree. The ROC plot looks great, as the curve is far from the baseline and close to the top edge of the chart. This means that the model performed well at classifying my variable. An important statistic in this case is the misclassification rate. For this week, the training misclassification rate was 0.2679 and the validation misclassification rate was 0.3182. There is not much difference between the training and validation rates, which suggests that the model generalizes and is useful. The ROC index (area under the ROC curve) for validation is 0.872, which is good. It would be nice to see this number above 0.9, but this is acceptable. I have also provided the output of this model in the Excel sheet, which shows the model's predictions for each of the states.
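As a quick illustration of the statistic discussed above, the misclassification rate is the share of records whose predicted class differs from the actual class. The sketch below is only a minimal example: the function name `misclassification_rate` and the eight labels are hypothetical and are not taken from the essay's data or from SAS output.

```python
def misclassification_rate(actual, predicted):
    """Return the fraction of records whose predicted class is wrong."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Hypothetical labels for eight records (assumption, not the essay's data).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

rate = misclassification_rate(actual, predicted)
print(f"misclassification rate: {rate:.4f}")
```

A rate of 0.2679 on the training data would therefore mean that roughly 27 out of every 100 training records were placed in the wrong class.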
For the second week, December 10 – 16, the important variables were CY48, VaccinationRate, AverIncome, and WinterAfterHumidity. The best model was also a decision tree. In this week, the training misclassification rate was 0.3158 and the validation misclassification rate was 0.4651. The difference between the rates is large, and I would declare this model useless. The model fell apart during the validation process, and I would not trust it to predict future events.
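The comparison between the two weeks can be made concrete by computing the gap between the validation and training misclassification rates. The sketch below uses only the four rates reported in the text; the dictionary layout and variable names are illustrative assumptions, and no particular cutoff for an acceptable gap is implied.

```python
# Misclassification rates as reported in the text.
weeks = {
    "first week": {"train": 0.2679, "validation": 0.3182},
    "second week": {"train": 0.3158, "validation": 0.4651},
}

# Gap = validation rate minus training rate; a larger gap suggests
# the model fits the training data better than it generalizes.
gaps = {
    name: round(rates["validation"] - rates["train"], 4)
    for name, rates in weeks.items()
}

for name, gap in gaps.items():
    print(f"{name}: validation - train = {gap:.4f}")
```

The gap is about 0.05 for the first week and about 0.15 for the second, roughly three times larger, which is consistent with judging the second model as unreliable on new data.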
For the week of December 17 – 23, the important variables were CY48, AverageHousePrice, Annual Precipitation, 2015_2016, and 2016_2017. The final two are binary variables that indicate which year the record belongs to, 1 being yes and 0
