
Question
  • link to csv file: https://www.dropbox.com/s/umxdnzxsp44gg5g/ex32_data.csv?dl=0
  • link for required functions to solve question b): https://www.dropbox.com/s/l0xbnf1yuk1nix8/Chapter10-LogisticRegression.pdf?dl=0
Download the associated file from E-Learning. In this exercise we will try to predict whether
a person has Diabetes using logistic regression.
a) Pre-process the data as follows. First, read the csv file, then divide the columns into
two types of variables: The target variable (also called dependent variable) is the last
column, which you should store in the variable y. The feature variables (also called
independent variables) are all other columns. Store these in the variable X.
Next, as in the lecture, divide X and y into training data (train_set_x, train_set_y)
and test data (test_set_x, test_set_y) using 75% of the data as training data and
the other 25% as test data.
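The pre-processing in a) can be sketched as follows. This is a minimal sketch: a randomly generated array stands in for the contents of ex32_data.csv (the number of columns here is an assumption), and the split is a simple first-75%/last-25% cut as described above.

```python
import numpy as np

# Hypothetical stand-in for the CSV contents; in the real exercise you
# would load ex32_data.csv, e.g. with np.genfromtxt or pandas.read_csv.
data = np.random.rand(100, 9)      # assumed: 8 feature columns + 1 target column

X = data[:, :-1]                   # feature (independent) variables: all but last column
y = data[:, -1]                    # target (dependent) variable: last column

# 75% training data, 25% test data
n_train = int(0.75 * X.shape[0])
train_set_x, test_set_x = X[:n_train], X[n_train:]
train_set_y, test_set_y = y[:n_train], y[n_train:]
```

Depending on how the lecture splits the data (e.g. with sklearn's train_test_split), the exact slicing may differ.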
b) Copy all the functions that you need for logistic regression from the lecture notes
(including initialize_parameters(dim)) and modify the following: In the propagate
function, after the cost is calculated, check whether it is NaN and if so, change it to
np.inf. In the optimize function, every 10000 steps (instead of every 100 steps),
append the cost to the costs list and output the squared Euclidean norm of the
gradient of the cost: np.sum(grads['dw'] ** 2) + np.sum(grads['db'] ** 2)
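The two modifications in b) could look like the sketch below. The function bodies are a generic logistic-regression implementation standing in for the lecture's versions (the lecture's exact code may differ); only the two marked changes are what the exercise asks for.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """One forward/backward pass; X has shape (dim, m), Y has shape (1, m)."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    # Modification from b): replace a NaN cost by np.inf
    if np.isnan(cost):
        cost = np.inf
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return {"dw": dw, "db": db}, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        # Modification from b): record every 10000 steps (instead of every 100)
        if i % 10000 == 0:
            costs.append(cost)
            grad_norm = np.sum(grads["dw"] ** 2) + np.sum(grads["db"] ** 2)
            print(i, grad_norm)
    return w, b, costs
```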
c) As in the lecture notes, call the model function to run the logistic regression and then
plot the costs.
Important: Set np.random.seed(0) before the pre-processing step. When calling the
model function, set the number of steps to 1000001 and the learning rate to 0.00025.
Hints: When calling the model function, you might need to fix any errors regarding
the dimension of arrays. You can do so by comparing the shapes of the respective
arrays to the ones from the lecture.
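Regarding the dimension hint: the lecture-style functions typically expect X of shape (num_features, num_examples) and y of shape (1, num_examples), while data read from a CSV is usually the transpose of that. A minimal sketch of the fix (the array contents here are hypothetical stand-ins for the preprocessed data):

```python
import numpy as np

train_set_x = np.random.rand(75, 8)             # stand-in: (examples, features)
train_set_y = np.random.randint(0, 2, size=75)  # stand-in: flat label vector

# Reshape to the layout the lecture's functions expect
train_set_x = train_set_x.T                     # -> (8, 75)
train_set_y = train_set_y.reshape(1, -1)        # -> (1, 75)
```

Comparing `.shape` of your arrays against the lecture's examples, as the hint suggests, tells you whether this transpose/reshape is needed.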
d) Output the so-called confusion matrix. It displays the number of correct predictions
on the diagonal and the incorrect predictions on the off-diagonal, similar to
               Predicted
                 0     1
   Actual  0   118    12
           1    26    36
To output it, proceed as follows: First, import metrics from the sklearn module:
from sklearn import metrics. Then use predict to calculate the predictions on the
test set. Next, call cnf_matrix = metrics.confusion_matrix(arg1, arg2), where
you should replace arg1 by the test set of y and arg2 by the predictions on that test
set. Finally, print(cnf_matrix).
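The steps in d) can be sketched as follows. The label and prediction arrays here are hypothetical stand-ins for test_set_y and the output of the lecture's predict function.

```python
import numpy as np
from sklearn import metrics

# Hypothetical stand-ins for the real test labels and predictions
test_set_y  = np.array([0, 0, 1, 1, 1, 0])
predictions = np.array([0, 1, 1, 1, 0, 0])

# Rows are actual classes, columns are predicted classes;
# diagonal entries count correct predictions.
cnf_matrix = metrics.confusion_matrix(test_set_y, predictions)
print(cnf_matrix)
```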
e) Perform the tasks c) and d) with
• a learning rate of 0.00025, once with 100001 steps and once with 10001 steps,
• a learning rate of 0.0002, once with 1000001 steps, once with 100001 steps, and
once with 10001 steps,
and compare the corresponding confusion matrices.
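The comparison in e) amounts to repeating c) and d) over five hyperparameter settings, which is natural to write as a loop. The model call is left as a comment because its exact signature comes from the lecture notes and is not specified here.

```python
# The five (learning rate, steps) combinations from e)
settings = [
    (0.00025, 100001),
    (0.00025, 10001),
    (0.0002, 1000001),
    (0.0002, 100001),
    (0.0002, 10001),
]

for learning_rate, num_steps in settings:
    print(f"learning rate {learning_rate}, {num_steps} steps")
    # Assumed lecture-style call, then the confusion matrix as in d):
    # d = model(train_set_x, train_set_y, test_set_x, test_set_y,
    #           num_iterations=num_steps, learning_rate=learning_rate)
    # ...plot d["costs"], compute and print the confusion matrix
```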