ECE421 - Winter 2022
Assignment 1: Logistic Regression
Due date: January 31
Submission: Submit both your report (a single PDF file) and all code files (.py/.ipynb/.html) on Quercus.
Objectives:
In this assignment, you will first implement a simple logistic regression classifier using Numpy and train your model by applying the (Stochastic) Gradient Descent algorithm. Next, you will implement the same model, this time in TensorFlow, and use Stochastic Gradient Descent and ADAM to train your model.
You are encouraged to look up TensorFlow APIs for useful utility functions at: https://www.tensorflow.org/api_docs/python/
General Note:
• Full points are given for complete solutions, including justifying the choices or assumptions you made to solve each question.
• A written report should be included in the final submission. Do not dump your code and outputs in the report. Keep it short, readable, and well-organized.
• Programming assignments are to be solved and submitted individually. You are encouraged to discuss the assignment with other students, but you must solve it on your own.
• Please ask all questions related to this assignment on Piazza, using the tag pa1.
Two-class notMNIST dataset
The notMNIST dataset is an image recognition dataset of font glyphs for the letters A through J, useful for experimenting with simple neural networks. It is quite similar to the classic MNIST dataset of handwritten digits 0 through 9. We use the following script to generate a smaller dataset that only contains the images from two letter classes: “C” (the positive class) and “J” (the negative class). This smaller subset of the data contains 3500 training images, 100 validation images, and 145 test images.
import numpy as np

with np.load('notMNIST.npz') as data:
    Data, Target = data['images'], data['labels']
    posClass = 2
    negClass = 9
    dataIndx = (Target==posClass) + (Target==negClass)
    Data = Data[dataIndx]/255.
    Target = Target[dataIndx].reshape(-1, 1)
    Target[Target==posClass] = 1
    Target[Target==negClass] = 0
    np.random.seed(521)
    randIndx = np.arange(len(Data))
    np.random.shuffle(randIndx)
    Data, Target = Data[randIndx], Target[randIndx]
    trainData, trainTarget = Data[:3500], Target[:3500]
    validData, validTarget = Data[3500:3600], Target[3500:3600]
    testData, testTarget = Data[3600:], Target[3600:]
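The loaded arrays are image tensors rather than feature vectors. Below is a minimal preprocessing sketch, assuming each notMNIST image is 28×28 pixels (so a flattened sample has 784 features); the names trainX, validX, testX, w, and b are illustrative and not part of the provided script.

# Flatten each image into a feature vector so that w^T x + b is well defined.
# Assumes the images are 28x28; adjust if the stored shape differs.
trainX = trainData.reshape(trainData.shape[0], -1)   # shape (3500, 784)
validX = validData.reshape(validData.shape[0], -1)   # shape (100, 784)
testX  = testData.reshape(testData.shape[0], -1)     # shape (145, 784)

# One possible initialization for the parameters used throughout part 1.
w = np.zeros((trainX.shape[1], 1))   # weight vector, shape (784, 1)
b = 0.0                              # bias scalar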
1 Logistic Regression with Numpy [20 points]
Logistic regression is one of the most widely used linear classification models in machine learning. In logistic regression, we model the probability of a sample $\mathbf{x}$ belonging to the positive class as

$$\hat{y}(\mathbf{x}) = \sigma(\mathbf{w}^{\top}\mathbf{x} + b),$$

where $z = \mathbf{w}^{\top}\mathbf{x} + b$, also called the logit, is the linear transformation of the input vector $\mathbf{x}$ using the weight vector $\mathbf{w}$ and the bias scalar $b$. $\sigma(z) = 1/(1 + \exp(-z))$ is the sigmoid or logistic function: it “squashes” the real-valued logits to fall between zero and one.
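As a side note, a direct implementation of the sigmoid can overflow for large negative logits. A minimal numerically stable sketch in Numpy is given below; the helper names sigmoid and predict are illustrative, not part of the required function headers.

import numpy as np

def sigmoid(z):
    # Numerically stable sigmoid: only exponentiates non-positive values.
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

def predict(w, b, x):
    # x: (N, d) data matrix, w: (d, 1) weight vector, b: scalar bias.
    return sigmoid(x @ w + b)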
The cross-entropy loss $\mathcal{L}_{CE}$ and the regularization term $\mathcal{L}_{w}$ will form the total loss function as:

$$\mathcal{L} = \mathcal{L}_{CE} + \mathcal{L}_{w} = \frac{1}{N}\sum_{n=1}^{N}\Big[-y^{(n)}\log\hat{y}\big(\mathbf{x}^{(n)}\big) - \big(1 - y^{(n)}\big)\log\Big(1 - \hat{y}\big(\mathbf{x}^{(n)}\big)\Big)\Big] + \frac{\lambda}{2}\|\mathbf{w}\|_{2}^{2}$$
Note that $y^{(n)} \in \{0, 1\}$ is the class label for the $n$-th training image and $\lambda$ is the regularization parameter.
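For orientation, the gradients of this loss take the standard logistic-regression form shown below, written with $X \in \mathbb{R}^{N \times d}$ as the data matrix and $\hat{\mathbf{y}}, \mathbf{y}$ as the stacked predictions and labels; the derivation itself is what part 1 asks you to present in your report.

$$\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \frac{1}{N} X^{\top}\big(\hat{\mathbf{y}} - \mathbf{y}\big) + \lambda \mathbf{w}, \qquad \frac{\partial \mathcal{L}}{\partial b} = \frac{1}{N}\sum_{n=1}^{N}\big(\hat{y}^{(n)} - y^{(n)}\big).$$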
Note: For part 1 of the assignment, you are not allowed to use TensorFlow or PyTorch. Your implementations should be based solely on Numpy.
1. Loss Function and Gradient [8 pts]:
Implement two vectorized Numpy functions (i.e., avoid for loops by employing matrix products and broadcasting) to compute the loss function and its gradient. The grad_loss function should compute and return analytical expressions for the gradient of the loss with respect to both the weights and the bias. Both function headers are given below. Include the analytical expressions in your report, as well as a snippet of your Python code.
def loss(w, b, x, y, reg):
    # Your implementation

def grad_loss(w, b, x, y, reg):
    # Your implementation
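A minimal vectorized sketch of these two functions is shown below. It assumes x is the (N, d) flattened data matrix, y is the (N, 1) label vector, and reg is the regularization parameter $\lambda$; treat it as one possible shape of a solution rather than the reference implementation.

import numpy as np

def loss(w, b, x, y, reg):
    # Total loss = average cross-entropy + (reg / 2) * ||w||^2.
    y_hat = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # the stable sigmoid sketched earlier could be used instead
    eps = 1e-12                                  # guards against log(0)
    ce = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
    return ce + 0.5 * reg * np.sum(w ** 2)

def grad_loss(w, b, x, y, reg):
    # Gradients of the total loss with respect to the weights and the bias.
    N = x.shape[0]
    y_hat = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_w = x.T @ (y_hat - y) / N + reg * w     # shape (d, 1)
    grad_b = np.sum(y_hat - y) / N               # scalar
    return grad_w, grad_b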
2. Gradient Descent Implementation [6 pts]:
Using the gradient computed in part 1, implement the batch Gradient Descent algorithm to classify the two classes in the notMNIST dataset. The function should accept 8 arguments: the weight vector, the bias, the data matrix, the labels, the learning rate, the number of epochs¹, $\lambda$, and an error tolerance (set to $1 \times 10^{-7}$). The training should stop if the total number of epochs is reached, or if the norm of the difference between the old and updated weights is smaller than the error tolerance. The function should return the optimized weight vector and bias. The function header looks like the following:
def grad_descent(w, b, x, y, alpha, epochs, reg, error_tol):
    # Your implementation here
You may also wish to print and/or store the training, validation, and test losses/accuracies
in this function for plotting. (In this case, you can add more inputs for validation and test
data to your functions).
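A rough sketch of the training loop described above is given below; it assumes the loss and grad_loss functions from part 1, and the loss-history list is the optional extra mentioned in the previous paragraph.

import numpy as np

def grad_descent(w, b, x, y, alpha, epochs, reg, error_tol):
    # Batch gradient descent: every epoch uses the full training set.
    train_losses = []                    # optional: per-epoch loss for plotting
    for epoch in range(epochs):
        grad_w, grad_b = grad_loss(w, b, x, y, reg)
        w_new = w - alpha * grad_w
        b_new = b - alpha * grad_b
        train_losses.append(loss(w_new, b_new, x, y, reg))
        # Stop early once the weight update becomes negligibly small.
        converged = np.linalg.norm(w_new - w) < error_tol
        w, b = w_new, b_new
        if converged:
            break
    return w, b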
3. Tuning the Learning Rate [3 pts]:
Test your implementation of Gradient Descent with 5000 epochs and $\lambda = 0$. Investigate the impact of the learning rate, $\alpha = \{0.005, 0.001, 0.0001\}$, on the performance of your classifier. Plot the training and validation loss (on one figure) vs. the number of passed epochs for each value of $\alpha$. Repeat this for training and validation accuracy. You should submit a total of 6 figures in your report for this part. Also, explain how you choose the best learning rate, and what accuracy you report for the selected learning rate.
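One way to run this sweep and produce the loss figures is sketched below. It assumes the flattened trainX/validX arrays from the earlier preprocessing sketch, and that grad_descent has been extended (as suggested in part 2) to also take validation data and return per-epoch training and validation loss histories; matplotlib is used for the plots.

import numpy as np
import matplotlib.pyplot as plt

for alpha in [0.005, 0.001, 0.0001]:
    w0 = np.zeros((trainX.shape[1], 1))   # fresh initialization for each run
    b0 = 0.0
    # Assumed extended signature: also takes validation data and returns loss histories.
    w, b, train_hist, valid_hist = grad_descent(
        w0, b0, trainX, trainTarget, alpha, 5000, 0.0, 1e-7, validX, validTarget)
    plt.figure()
    plt.plot(train_hist, label='training loss')
    plt.plot(valid_hist, label='validation loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss curves, alpha = {}'.format(alpha))
    plt.legend()
    plt.show()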
4. Generalization [3 pts]:
Investigate the impact of regularization by modifying the regularization parameter, $\lambda = \{0.001, 0.1, 0.5\}$, for $\alpha = 0.005$. Plot the training/validation loss/accuracy vs. epochs, similar
¹ An epoch is defined as a complete pass over the training data. By definition, batch gradient descent operates on the entire training dataset.