
HOMEWORK 0
PYTORCH PRIMER *

10-423/10-623 GENERATIVE AI
http://423.mlcourse.org

OUT: Jan. 18, 2024
DUE: Jan. 24, 2024
TAs: Haoyang, Jing, Qin, Ifigeneia, and Tiancheng

Instructions

• Collaboration Policy: Please read the collaboration policy in the syllabus.
• Late Submission Policy: See the late submission policy in the syllabus.
• Submitting your work: You will use Gradescope to submit answers to all questions and code.
    – Written: You will submit your completed homework as a PDF to Gradescope. Please use the provided template. Submissions can be handwritten, but must be clearly legible; otherwise, you will not be awarded marks. Alternatively, submissions can be written in LaTeX. Each answer should be within the box provided. If you do not follow the template or your submission is misaligned, your assignment may not be graded correctly by our AI-assisted grader.
    – Programming: You will submit your code for programming questions to Gradescope. There is no autograder. We will examine your code by hand and may award marks for its submission.
• Materials: The data that you will need in order to complete this assignment is posted along with the writeup and template on the course website.

Question                    Points
Background Reading               2
Image Classification            41
Text Classification             22
Code Upload                      0
Collaboration Questions          2
Total:                          67

* Compiled on Friday 19th January, 2024 at 16:01

Homework 0: PyTorch Primer    10-423/10-623

Introduction

In this assignment, you will choose-your-own-adventure as you get up to speed on (or do a quick review of) PyTorch and Weights & Biases.¹

PyTorch is a general purpose deep learning library. It allows you to define a computation graph, loss, dynamically in Python, and then a simple call to loss.backward() computes all the adjoints (a.k.a. gradients of the loss with respect to each parameter) for you. Gone are the days in which we needed to work through complicated matrix calculus just to train our models. Well of course, you'd very much need to do that for any function that isn't easily or efficiently expressed in PyTorch, but those cases are becoming less and less common.

Weights & Biases is a logging tool that allows you to easily track the behavior of your model during training and evaluation. As well, once you've logged the interesting bits of data (say, the validation loss every 10 epochs), you can easily create a plot with just a few clicks showing that information. There is another advantage: each run of your code might use different hyperparameters and, if you carefully log these as well, then with a few more clicks you can compare the model's behavior across different hyperparameter settings.

At a high level, you will proceed as follows:

1. Read the PyTorch tutorial.
2. Read the Weights and Biases (wandb) tutorial.
3. Review the HW0 starter code. You'll find it closely mirrors the code described in the PyTorch tutorial.
4. Modify the starter code so that it incorporates Weights & Biases logging.
5. Run the requested experiments and report your results as tables/plots from the wandb interface.
6. Modify your code further so that it supports a different model (you will choose the model!).
7. Allow your code to choose a different optimizer (you will choose the optimizer!).
8. Run additional experiments in order to better understand PyTorch.

You will carry out these tasks on two applications: image classification and text classification.

¹ Although all students in this class have taken an Introduction to Machine Learning course before, some of those (even here at CMU) did cover PyTorch and others did not. We want to ensure that everyone here is ready to start HW1 at the same level.
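To make the idea of adjoints concrete, here is a minimal pure-Python sketch of reverse-mode differentiation on a tiny squared-error loss. It is only an illustration of the kind of bookkeeping that a call like loss.backward() performs, not PyTorch's implementation. The Value class and the numbers used for w, b, x, and y are hypothetical and chosen purely for illustration.

```python
class Value:
    """A scalar node in a dynamically built computation graph (hypothetical helper)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0          # adjoint: d(loss) / d(this node)
        self._parents = parents  # pairs of (parent node, local derivative)

    def __add__(self, other):
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        return Value(self.data - other.data, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self):
        # Post-order traversal: every node is listed after all of its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(loss) / d(loss)
        # Walk from the loss back toward the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in node._parents:
                parent.grad += local * node.grad


# Illustrative values only: one weight, one bias, one training example.
w, b = Value(3.0), Value(1.0)
x, y = Value(2.0), Value(10.0)

err = w * x + b - y   # prediction minus target
loss = err * err      # squared error, built dynamically from Python operators
loss.backward()

print(loss.data, w.grad, b.grad)
```

By hand, the derivative with respect to w is 2 * err * x and with respect to b is 2 * err, which matches what the sketch accumulates in w.grad and b.grad. In PyTorch the same computation would be written with tensors created using requires_grad=True, and the gradients would be read from each tensor's .grad attribute after loss.backward().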

Computing Environment

First you need to set up your computing environment. Below we outline how you could do so on your laptop, or on Google Colab.

Local Environment

To use PyTorch on your laptop, we recommend the following setup.

1. Follow the instructions linked below to install Python using MiniConda.
   https://docs.conda.io/projects/miniconda/en/latest/miniconda-install.html
   Then create and activate a new python environment. For example:

       conda create -n py31 python=3.11
       conda activate py31

2. Next follow the instructions from PyTorch on how to install locally by selecting "Conda" for the "Package" options. For most laptops, you would select "CPU" or "Default" as the "Compute Platform".
   https://pytorch.org/get-started/locally/

3. Install Weights & Biases
   https://docs.wandb.ai/quickstart

       pip install wandb

4. Install any other Python packages you may need with conda when possible, and pip otherwise.

       conda install <package>
       pip install <package>

Colab

Google Colab provides free easy access to some amount of GPU compute. On the free tier, your GPU jobs will time out after a fixed number of hours and you may be temporarily unable to use a GPU if you use too many hours. The limits are dynamic and not clearly documented.

There are two ways to use Colab:

1. As a Jupyter Notebook: To see an example of PyTorch in Colab, click the Run in Google Colab link from the tutorial:
   https://pytorch.org/tutorials/beginner/basics/intro.html

2. As a Terminal: You can also treat Google Colab as a VM and run code as you would at the terminal. You should first put your code and data in a Google Drive folder. Then create a Code cell with the following snippet and run it to mount your Google Drive folder.

       from google.colab import drive
       drive.mount('/content/drive')

   Now when you open the Files window on the left, you'll be able to view drive/MyDrive which contains all your Google Drive files.
   You can run terminal commands by prefixing the command with an exclamation point in a cell. Each "!" command runs in its own temporary subshell, so "!cd" does not change the directory for later commands; use the "%cd" magic to change directories instead. For example, if your code helloworld.py is in mycode/, then you can run:

       !pwd
       %cd /content/drive/MyDrive/mycode
       !pwd
       !python helloworld.py

Whether you're using Colab as a notebook or a terminal, if you want to use a GPU, you must set the Runtime to use a GPU via Runtime → Change runtime type → T4 GPU. Better GPUs are available by upgrading to Colab Pro.
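Once either environment is set up, a quick check along the following lines can confirm that the packages are importable and whether a GPU is visible. This is only a sketch: check_environment is a hypothetical helper name, not part of the starter code, and it reports False for anything that is missing instead of raising an error.

```python
import importlib
import importlib.util


def check_environment(packages=("torch", "wandb")):
    """Report which packages are importable and whether a CUDA GPU is visible."""
    # find_spec returns None for a top-level package that is not installed.
    report = {name: importlib.util.find_spec(name) is not None
              for name in packages}
    report["cuda"] = False
    if report.get("torch"):
        try:
            torch = importlib.import_module("torch")
            report["cuda"] = bool(torch.cuda.is_available())
        except ImportError:
            # The package was found but could not be loaded (broken install).
            report["torch"] = False
    return report


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```

On a laptop with a CPU-only install, or in Colab before the runtime type has been changed to a GPU, the "cuda" entry would be expected to read "no".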

1 Background Reading (2 points)

This assignment is primarily a reading assignment. You will read both the starter code and various tutorials. Complete the following readings before you begin.

• PyTorch Tutorial. Please read the full collection of the Introduction to PyTorch, i.e. Learn the Basics ∥ Quickstart ∥ Tensors ∥ Datasets & DataLoaders ∥ Transforms ∥ Build Model ∥ Autograd ∥ Optimization ∥ Save & Load Model.
  https://pytorch.org/tutorials/beginner/basics/intro.html
• Weights & Biases Tutorial. Please read the Quickstart.
  https://docs.wandb.ai/quickstart

1.1. (1 point) Did you read the PyTorch tutorial? If you already read this or an equivalent reading for another course, you may answer 'yes' here.
     ○ Yes
     ○ No

1.2. (1 point) Did you read the Weights & Biases Quickstart tutorial? If you already read this or an equivalent reading for another course, you may answer 'yes' here.
     ○ Yes
     ○ No
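For orientation while reading the Weights & Biases Quickstart, the basic pattern of starting a run with a set of hyperparameters and then logging a metric every 10 epochs might look like the sketch below. Everything specific here is an assumption made for illustration: the project name "hw0-sketch", the hyperparameter values, and the made-up val_loss formula that stands in for a real validation loss. The run is started with mode="disabled" so that no account or network access is needed, and the sketch falls back to a plain Python list if wandb is not installed.

```python
import math

try:
    import wandb  # optional; installed with `pip install wandb`
except ImportError:
    wandb = None

# Hypothetical hyperparameters; logging them lets runs be compared later.
config = {"learning_rate": 0.1, "epochs": 30}

# mode="disabled" turns every wandb call into a no-op for this sketch.
# Switch to mode="online" to send data to the W&B servers.
run = None
if wandb is not None:
    run = wandb.init(project="hw0-sketch", config=config, mode="disabled")

logged = []  # local copy of what was logged, independent of wandb
for epoch in range(1, config["epochs"] + 1):
    # Stand-in for a real validation loss computed from a model.
    val_loss = math.exp(-config["learning_rate"] * epoch)
    if epoch % 10 == 0:
        metrics = {"epoch": epoch, "val_loss": val_loss}
        logged.append(metrics)
        if run is not None:
            run.log(metrics)

if run is not None:
    run.finish()

print([m["epoch"] for m in logged])
```

In the homework itself the logged values would come from the actual training and evaluation code, and the plots and tables would then be built in the wandb web interface.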