two_layer_net.py

""" Implements a two-layer Neural Network classifier in PyTorch. WARNING: you SHOULD NOT use ".to()" or ".cuda()" in each implementation block. """ import torch import random import statistics from rob599.p2_helpers import sample_batch from typing import Dict, List, Callable, Optional def hello_two_layer_net(): """ This is a sample function that we will try to import and run to ensure that our environment is correctly set up on Google Colab. """ print("Hello from two_layer_net.py!") # Template class modules that we will use later: Do not edit/modify this class class TwoLayerNet(object): def __init__( self, input_size: int, hidden_size: int, output_size: int, dtype: torch.dtype = torch.float32, device: str = "cuda", std: float = 1e-4, ): """ Initialize the model. Weights are initialized to small random values and biases are initialized to zero. Weights and biases are stored in the variable self.params, which is a dictionary with the following keys: W1: First layer weights; has shape (D, H) b1: First layer biases; has shape (H,) W2: Second layer weights; has shape (H, C) b2: Second layer biases; has shape (C,) Inputs: - input_size: The dimension D of the input data. - hidden_size: The number of neurons H in the hidden layer. - output_size: The number of classes C. - dtype: Optional, data type of each initial weight params - device: Optional, whether the weight params is on GPU or CPU - std: Optional, initial weight scaler. """ # reset seed before start random.seed(0) torch.manual_seed(0) self.params = {} self.params["W1"] = std * torch.randn( input_size, hidden_size, dtype=dtype, device=device ) self.params["b1"] = torch.zeros(hidden_size, dtype=dtype, device=device) self.params["W2"] = std * torch.randn( hidden_size, output_size, dtype=dtype, device=device

def nn_forward_pass(params: Dict[str, torch.Tensor], X: torch.Tensor):
    """
    The first stage of our neural network implementation: Run the forward pass
    of the network to compute the hidden layer features and classification
    scores. The network architecture should be:

    FC layer -> ReLU (hidden) -> FC layer (scores)

    As a practice, we will NOT allow you to use torch.relu and torch.nn ops
    just for this time (you can use them from A3).

    Inputs:
    - params: a dictionary of PyTorch Tensor that store the weights of a model.
      It should have the following keys with shape
          W1: First layer weights; has shape (D, H)
          b1: First layer biases; has shape (H,)
          W2: Second layer weights; has shape (H, C)
          b2: Second layer biases; has shape (C,)
    - X: Input data of shape (N, D). Each X[i] is a training sample.

    Returns a tuple of:
    - scores: Tensor of shape (N, C) giving the classification scores for X
    - hidden: Tensor of shape (N, H) giving the hidden layer representation
      for each input value (after the ReLU).
    """
    # Unpack variables from the params dictionary
    W1, b1 = params["W1"], params["b1"]
    W2, b2 = params["W2"], params["b2"]
    N, D = X.shape

    # Compute the forward pass
    hidden = None
    scores = None
    ############################################################################
    # TODO: Perform the forward pass, computing the class scores for the      #
    # input. Store the result in the scores variable, which should be a       #
    # tensor of shape (N, C).                                                  #
    ############################################################################
    # Replace "pass" statement with your code
    pass
    ###########################################################################
    #                             END OF YOUR CODE                            #
    ###########################################################################

    return scores, hidden
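
# -----------------------------------------------------------------------------
# Sketch (added for illustration; NOT the official solution and not part of the
# template): one way the forward pass above could be written with basic tensor
# ops only, i.e. without torch.relu or torch.nn. The helper name is made up;
# the assignment expects this logic inside the TODO block of nn_forward_pass.
# -----------------------------------------------------------------------------
def _nn_forward_pass_sketch(params: Dict[str, torch.Tensor], X: torch.Tensor):
    W1, b1 = params["W1"], params["b1"]
    W2, b2 = params["W2"], params["b2"]

    # First fully connected layer, then ReLU written as a clamp at zero.
    hidden = (X.mm(W1) + b1).clamp(min=0)  # shape (N, H)

    # Second fully connected layer produces the class scores.
    scores = hidden.mm(W2) + b2            # shape (N, C)
    return scores, hidden
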

def nn_forward_backward(
    params: Dict[str, torch.Tensor],
    X: torch.Tensor,
    y: Optional[torch.Tensor] = None,
    reg: float = 0.0
):
    """
    Compute the loss and gradients for a two layer fully connected neural
    network. When you implement loss and gradient, please don't forget to
    scale the losses/gradients by the batch size.

    Inputs: First two parameters (params, X) are same as nn_forward_pass
    - params: a dictionary of PyTorch Tensor that store the weights of a model.
      It should have the following keys with shape
          W1: First layer weights; has shape (D, H)
          b1: First layer biases; has shape (H,)
          W2: Second layer weights; has shape (H, C)
          b2: Second layer biases; has shape (C,)
    - X: Input data of shape (N, D). Each X[i] is a training sample.
    - y: Vector of training labels. y[i] is the label for X[i], and each y[i] is