IST 707 Applied Machine Learning
HW5: Use a Decision Tree to Solve a Mystery in History
In this homework assignment, you are going to use the decision tree algorithm to solve the disputed essay problem from last week.
Organize your report using the following template:
Section 1: Data preparation
You will need to split the original data set into training and testing data for the classification experiments. Describe the contents of your training and test data. What steps did you take?
Section 2: Build and tune decision tree models
First, build a decision tree (DT) model using the default settings, then tune the parameters to see whether a better model can be generated. Compare these models using appropriate evaluation measures. Describe and compare the patterns learned by these models.
Section 3: Prediction
After building the classification model, apply it to the disputed papers to determine their authorship. Does the DT model reach the same conclusion the clustering algorithms did? Explain any differences.
Provide your code in a separate script.
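As a starting point for the data preparation step in Section 1, here is a minimal sketch of one way to set aside the disputed essays and split the remaining labeled essays into training and testing sets, keeping each author represented in both. It is only an illustration under assumptions: the row layout, the `author` field, the label names (`author_A`, `author_B`, `disputed`), the 30% test fraction, and the function name `split_essays` are all hypothetical and should be adapted to the actual data set.

```python
import random
from collections import defaultdict


def split_essays(rows, test_fraction=0.3, seed=42, disputed_label="disputed"):
    """Split labeled essays into train/test sets, per author (stratified).

    `rows` is assumed to be a list of dicts with an "author" key.
    Essays whose author equals `disputed_label` are held out entirely,
    since they are the ones to be predicted later.
    """
    disputed = [r for r in rows if r["author"] == disputed_label]
    labeled = [r for r in rows if r["author"] != disputed_label]

    # Group the labeled essays by author so each author is split separately.
    by_author = defaultdict(list)
    for r in labeled:
        by_author[r["author"]].append(r)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for author in sorted(by_author):
        group = list(by_author[author])
        rng.shuffle(group)
        # At least one essay per author goes to the test set.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test, disputed


# Toy data (hypothetical): 10 essays by author_A, 6 by author_B, 4 disputed.
labels = ["author_A"] * 10 + ["author_B"] * 6 + ["disputed"] * 4
rows = [{"id": i, "author": label} for i, label in enumerate(labels)]

train, test, disputed = split_essays(rows)
print(len(train), len(test), len(disputed))
```

Splitting per author rather than over the whole pool keeps a small class from ending up entirely in one partition, which matters when one author has far fewer essays than the other. The disputed essays never enter training or testing, so they remain unseen until the prediction step in Section 3.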