ECE421 - Winter 2022
Assignment 1: Logistic Regression
Due date: January 31
Submission: Submit both your report (a single PDF file) and all code files (.py/.ipynb/.html) on Quercus.
Objectives:
In this assignment, you will first implement a simple logistic regression classifier using Numpy and train your model by applying the (Stochastic) Gradient Descent algorithm. Next, you will implement the same model, this time in TensorFlow, and use Stochastic Gradient Descent and ADAM to train your model.
You are encouraged to look up TensorFlow APIs for useful utility functions at: https://www.tensorflow.org/api_docs/python/
General Note:
• Full points are given for complete solutions, including justifying the choices or assumptions you made to solve each question.
• A written report should be included in the final submission. Do not dump your code and outputs in the report. Keep it short, readable, and well-organized.
• Programming assignments are to be solved and submitted individually. You are encouraged to discuss the assignment with other students, but you must solve it on your own.
• Please ask all questions related to this assignment on Piazza, using the tag pa1.
Two-class notMNIST dataset
The notMNIST dataset is an image recognition dataset of font glyphs for the letters A through J, useful for experimenting with simple neural networks. It is quite similar to the classic MNIST dataset of handwritten digits 0 through 9. We use the following script to generate a smaller dataset that only contains the images from two letter classes: “C” (the positive class) and “J” (the negative class). This smaller subset of the data contains 3500 training images, 100 validation images, and 145 test images.
import numpy as np

with np.load('notMNIST.npz') as data:
    Data, Target = data['images'], data['labels']
    posClass = 2
    negClass = 9
    dataIndx = (Target==posClass) + (Target==negClass)
    Data = Data[dataIndx]/255.
    Target = Target[dataIndx].reshape(-1, 1)
    Target[Target==posClass] = 1
    Target[Target==negClass] = 0
    np.random.seed(521)
    randIndx = np.arange(len(Data))
    np.random.shuffle(randIndx)
    Data, Target = Data[randIndx], Target[randIndx]
    trainData, trainTarget = Data[:3500], Target[:3500]
    validData, validTarget = Data[3500:3600], Target[3500:3600]
    testData, testTarget = Data[3600:], Target[3600:]
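The loaded arrays are image tensors rather than feature vectors. Below is a minimal preprocessing sketch, assuming each notMNIST image is 28×28 pixels (so a flattened sample has 784 features); the names trainX, validX, testX, w, and b are illustrative and not part of the provided script.

# Flatten each image into a feature vector so that w^T x + b is well defined.
# Assumes the images are 28x28; adjust if the stored shape differs.
trainX = trainData.reshape(trainData.shape[0], -1)   # shape (3500, 784)
validX = validData.reshape(validData.shape[0], -1)   # shape (100, 784)
testX  = testData.reshape(testData.shape[0], -1)     # shape (145, 784)

# One possible initialization for the parameters used throughout part 1.
w = np.zeros((trainX.shape[1], 1))   # weight vector, shape (784, 1)
b = 0.0                              # bias scalar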
1 Logistic Regression with Numpy [20 points]
Logistic regression is one of the most widely used linear classification models in machine learning. In logistic regression, we model the probability of a sample $\mathbf{x}$ belonging to the positive class as

$$\hat{y}(\mathbf{x}) = \sigma(\mathbf{w}^{\top}\mathbf{x} + b),$$

where $z = \mathbf{w}^{\top}\mathbf{x} + b$, also called the logit, is the linear transformation of the input vector $\mathbf{x}$ using the weight vector $\mathbf{w}$ and the bias scalar $b$. $\sigma(z) = 1/(1 + \exp(-z))$ is the sigmoid or logistic function: it “squashes” the real-valued logits to fall between zero and one.
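As a side note, a direct implementation of the sigmoid can overflow for large negative logits. A minimal numerically stable sketch in Numpy is given below; the helper names sigmoid and predict are illustrative, not part of the required function headers.

import numpy as np

def sigmoid(z):
    # Numerically stable sigmoid: only exponentiates non-positive values.
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

def predict(w, b, x):
    # x: (N, d) data matrix, w: (d, 1) weight vector, b: scalar bias.
    return sigmoid(x @ w + b)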
The cross-entropy loss $\mathcal{L}_{CE}$ and the regularization term $\mathcal{L}_{w}$ will form the total loss function as:

$$\mathcal{L} = \mathcal{L}_{CE} + \mathcal{L}_{w} = \frac{1}{N}\sum_{n=1}^{N}\Big[-y^{(n)}\log\hat{y}\big(\mathbf{x}^{(n)}\big) - \big(1 - y^{(n)}\big)\log\Big(1 - \hat{y}\big(\mathbf{x}^{(n)}\big)\Big)\Big] + \frac{\lambda}{2}\|\mathbf{w}\|_{2}^{2}$$
Note that $y^{(n)} \in \{0, 1\}$ is the class label for the $n$-th training image and $\lambda$ is the regularization parameter.
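For orientation, the gradients of this loss take the standard logistic-regression form shown below, written with $X \in \mathbb{R}^{N \times d}$ as the data matrix and $\hat{\mathbf{y}}, \mathbf{y}$ as the stacked predictions and labels; the derivation itself is what part 1 asks you to present in your report.

$$\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \frac{1}{N} X^{\top}\big(\hat{\mathbf{y}} - \mathbf{y}\big) + \lambda \mathbf{w}, \qquad \frac{\partial \mathcal{L}}{\partial b} = \frac{1}{N}\sum_{n=1}^{N}\big(\hat{y}^{(n)} - y^{(n)}\big).$$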
Note: For part 1 of the assignment, you are not allowed to use TensorFlow or PyTorch. Your implementations should be based solely on Numpy.
1. Loss Function and Gradient [8 pts]:
Implement two vectorized Numpy functions (i.e., avoid for loops by employing matrix products and broadcasting) to compute the loss function and its gradient. The grad_loss function should compute and return analytical expressions for the gradient of the loss with respect to both the weights and the bias. Both function headers are given below. Include the analytical expressions in your report, as well as a snippet of your Python code.
def loss(w, b, x, y, reg):
    # Your implementation

def grad_loss(w, b, x, y, reg):
    # Your implementation
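A minimal vectorized sketch of these two functions is shown below. It assumes x is the (N, d) flattened data matrix, y is the (N, 1) label vector, and reg is the regularization parameter $\lambda$; treat it as one possible shape of a solution rather than the reference implementation.

import numpy as np

def loss(w, b, x, y, reg):
    # Total loss = average cross-entropy + (reg / 2) * ||w||^2.
    y_hat = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # the stable sigmoid sketched earlier could be used instead
    eps = 1e-12                                  # guards against log(0)
    ce = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
    return ce + 0.5 * reg * np.sum(w ** 2)

def grad_loss(w, b, x, y, reg):
    # Gradients of the total loss with respect to the weights and the bias.
    N = x.shape[0]
    y_hat = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_w = x.T @ (y_hat - y) / N + reg * w     # shape (d, 1)
    grad_b = np.sum(y_hat - y) / N               # scalar
    return grad_w, grad_b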
2. Gradient Descent Implementation [6 pts]:
Using the gradient computed in part 1, implement the batch Gradient Descent algorithm to classify the two classes in the notMNIST dataset. The function should accept 8 arguments: the weight vector, the bias, the data matrix, the labels, the learning rate, the number of epochs¹, $\lambda$, and an error tolerance (set to $1 \times 10^{-7}$). The training should stop if the total number of epochs is reached, or if the norm of the difference between the old and updated weights is smaller than the error tolerance. The function should return the optimized weight vector and bias. The function header looks like the following:
def grad_descent(w, b, x, y, alpha, epochs, reg, error_tol):
    # Your implementation here
You may also wish to print and/or store the training, validation, and test losses/accuracies
in this function for plotting. (In this case, you can add more inputs for validation and test
data to your functions).
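A rough sketch of the training loop described above is given below; it assumes the loss and grad_loss functions from part 1, and the loss-history list is the optional extra mentioned in the previous paragraph.

import numpy as np

def grad_descent(w, b, x, y, alpha, epochs, reg, error_tol):
    # Batch gradient descent: every epoch uses the full training set.
    train_losses = []                    # optional: per-epoch loss for plotting
    for epoch in range(epochs):
        grad_w, grad_b = grad_loss(w, b, x, y, reg)
        w_new = w - alpha * grad_w
        b_new = b - alpha * grad_b
        train_losses.append(loss(w_new, b_new, x, y, reg))
        # Stop early once the weight update becomes negligibly small.
        converged = np.linalg.norm(w_new - w) < error_tol
        w, b = w_new, b_new
        if converged:
            break
    return w, b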
3. Tuning the Learning Rate [3 pts]:
Test your implementation of Gradient Descent with 5000 epochs and $\lambda = 0$. Investigate the impact of the learning rate, $\alpha = \{0.005, 0.001, 0.0001\}$, on the performance of your classifier. Plot the training and validation loss (on one figure) vs. the number of passed epochs for each value of $\alpha$. Repeat this for training and validation accuracy. You should submit a total of 6 figures in your report for this part. Also, explain how you choose the best learning rate, and what accuracy you report for the selected learning rate.
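One way to run this sweep and produce the loss figures is sketched below. It assumes the flattened trainX/validX arrays from the earlier preprocessing sketch, and that grad_descent has been extended (as suggested in part 2) to also take validation data and return per-epoch training and validation loss histories; matplotlib is used for the plots.

import numpy as np
import matplotlib.pyplot as plt

for alpha in [0.005, 0.001, 0.0001]:
    w0 = np.zeros((trainX.shape[1], 1))   # fresh initialization for each run
    b0 = 0.0
    # Assumed extended signature: also takes validation data and returns loss histories.
    w, b, train_hist, valid_hist = grad_descent(
        w0, b0, trainX, trainTarget, alpha, 5000, 0.0, 1e-7, validX, validTarget)
    plt.figure()
    plt.plot(train_hist, label='training loss')
    plt.plot(valid_hist, label='validation loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss curves, alpha = {}'.format(alpha))
    plt.legend()
    plt.show()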
4. Generalization [3 pts]:
Investigate the impact of regularization by modifying the regularization parameter, $\lambda = \{0.001, 0.1, 0.5\}$, for $\alpha = 0.005$. Plot the training/validation loss/accuracy vs. epochs, similar
¹ An epoch is defined as a complete pass over the training data. By definition, batch gradient descent operates on the entire training dataset.