COMP 6721 Applied Artificial Intelligence (Fall 2023)
Lab Exercise #6: Artificial Neural Networks Solutions

Question 1

Given the training instances below, use scikit-learn to implement a Perceptron classifier (see https://scikit-learn.org/stable/modules/linear_model.html#perceptron) that classifies students into two categories, predicting who will get an 'A' this year, based on an input feature vector x. Here's the training data again:

    Student        'A' last year?   Black hair?   Works hard?   Drinks?   Output f(x): 'A' this year?
    X1: Richard    Yes              Yes           No            Yes       No
    X2: Alan       Yes              Yes           Yes           No        Yes
    X3: Alison     No               No            Yes           No        No
    X4: Jeff       No               Yes           No            Yes       No
    X5: Gail       Yes              No            Yes           Yes       Yes
    X6: Simon      No               Yes           Yes           Yes       No

Use the following Python imports for the perceptron:

    import numpy as np
    from sklearn.linear_model import Perceptron

All features must be numerical for training the classifier, so you have to transform the 'Yes' and 'No' feature values into their binary representation:

    # Dataset with binary representation of the features
    dataset = np.array([[1, 1, 0, 1, 0],
                        [1, 1, 1, 0, 1],
                        [0, 0, 1, 0, 0],
                        [0, 1, 0, 1, 0],
                        [1, 0, 1, 1, 1],
                        [0, 1, 1, 1, 0]])

For our feature vectors, we need the first four columns:

    X = dataset[:, 0:4]

and for the training labels, we use the last column of the dataset:
    y = dataset[:, 4]

(a) Now, create a Perceptron classifier (same approach as in the previous labs) and train it.

Most of the solution is provided above. Here is the additional code required to create a Perceptron classifier and train it on the provided dataset:

    perceptron_classifier = Perceptron(max_iter=40, eta0=0.1, random_state=1)
    perceptron_classifier.fit(X, y)

The parameters we're using here are:

max_iter: The maximum number of passes over the training data (also known as epochs). It is set to 40, meaning the dataset is passed to the Perceptron up to 40 times during training.

eta0: The learning rate, which determines the step size of the weight updates in each iteration. A value of 0.1 is a moderate learning rate.

random_state: This ensures reproducibility of results: the classifier produces the same output for the same input data every time it is run, which aids debugging and comparison.

Try experimenting with these values, for example by changing the number of iterations or the learning rate. Make sure you understand the significance of setting random_state.

(b) Let's examine our trained Perceptron in more detail. You can look at the weights it learned with:

    print("Weights:", perceptron_classifier.coef_)

And the bias, here called the intercept term, with:

    print("Bias:", perceptron_classifier.intercept_)

The activation function is not directly exposed, but scikit-learn uses the step activation function. Now check how your Perceptron would classify a training sample by computing the net activation (input vector × weights + bias) and applying the step function. You can use the following code to compute the net activation on all training samples and compare it with your results:

    net_activation = np.dot(X, perceptron_classifier.coef_.T) + perceptron_classifier.intercept_
    print(net_activation)

Remember that the step activation function classifies a sample as 1 if its net activation is non-negative, and as 0 otherwise.
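To make this concrete, here is a minimal sketch (our addition, not part of the original handout; the variable name step_predictions is our own) that applies the step function to the net activations computed above:

    # Step function: class 1 if the net activation is non-negative, class 0 otherwise
    step_predictions = (net_activation >= 0).astype(int).ravel()
    print(step_predictions)

The result should match the output of predict() in part (c) below.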
(c) Apply the trained model to all training samples and print out the predictions. This works just like for the other classifiers we used before:

    y_pred = perceptron_classifier.predict(X)
    print(y_pred)

This will print the classification results like:

    [0 1 0 0 1 0]

Compare the predicted labels with the actual labels from the dataset. How many predictions match the actual labels? What does this say about the performance of our classifier on the training data?
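A short way to count the matches (again a sketch of our own, not part of the original handout):

    # Compare predictions against the true labels
    matches = np.sum(y_pred == y)
    print(matches, "of", len(y), "predictions match")
    print("Training accuracy:", matches / len(y))

With the output shown above, all six predictions match the labels y = [0 1 0 0 1 0], so the classifier fits this training data perfectly.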
Question 2

Consider the neural network shown below. It consists of 2 input nodes, 1 hidden node, and 2 output nodes, with an additional bias at the input layer (attached to the hidden node) and a bias at the hidden layer (attached to the output nodes). All nodes in the hidden and output layers use the sigmoid activation function (σ).

[Figure omitted. From the calculations below, the network parameters are w_x1-h = 0.3, w_x2-h = 0.5, b_h = 0.1, w_h-y1 = 0.2, w_h-y2 = 0.2, b_y1 = 0.6, b_y2 = 0.9.]

(a) Calculate the outputs y1 and y2 if the network is fed x = (1, 0) as input.

    h_in = b_h + w_x1-h · x1 + w_x2-h · x2 = 0.1 + (0.3 × 1) + (0.5 × 0) = 0.4
    h = σ(h_in) = σ(0.4) = 1 / (1 + e^(−0.4)) = 0.599

    y1_in = b_y1 + w_h-y1 · h = 0.6 + (0.2 × 0.599) = 0.72
    y1 = σ(0.72) = 1 / (1 + e^(−0.72)) = 0.673

    y2_in = b_y2 + w_h-y2 · h = 0.9 + (0.2 × 0.599) = 1.02
    y2 = σ(1.02) = 1 / (1 + e^(−1.02)) = 0.735

As a result, the output is calculated as y = (y1, y2) = (0.673, 0.735).
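As a quick numerical check, here is a small Python sketch (our own addition, with the weights read off the calculation above) that reproduces the forward pass:

    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    # Weights and biases as used in the calculation above
    w_x1_h, w_x2_h, b_h = 0.3, 0.5, 0.1
    w_h_y1, b_y1 = 0.2, 0.6
    w_h_y2, b_y2 = 0.2, 0.9

    x1, x2 = 1, 0
    h  = sigmoid(b_h + w_x1_h * x1 + w_x2_h * x2)  # sigma(0.4)  -> ~0.599
    y1 = sigmoid(b_y1 + w_h_y1 * h)                # sigma(0.72) -> ~0.673
    y2 = sigmoid(b_y2 + w_h_y2 * h)                # sigma(1.02) -> ~0.735
    print(h, y1, y2)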
(b) Assume that the expected output for the input x = (1, 0) is supposed to be t = (0, 1). Calculate the updated weights after the backpropagation of the error for this sample. Assume that the learning rate is η = 0.1.

    δ_y1 = y1 (1 − y1)(y1 − t1) = 0.673 × (1 − 0.673) × (0.673 − 0) = 0.148
    δ_y2 = y2 (1 − y2)(y2 − t2) = 0.735 × (1 − 0.735) × (0.735 − 1) = −0.052

    δ_h = h (1 − h) Σ_{i=1,2} w_h-yi · δ_yi = 0.599 × (1 − 0.599) × [0.2 × 0.148 + 0.2 × (−0.052)] = 0.005

    Δw_x1-h = −η δ_h x1 = −0.1 × 0.005 × 1 = −0.0005
    Δw_x2-h = −η δ_h x2 = −0.1 × 0.005 × 0 = 0
    Δb_h = −η δ_h = −0.1 × 0.005 = −0.0005
    Δw_h-y1 = −η δ_y1 h = −0.1 × 0.148 × 0.599 = −0.0088652
    Δb_y1 = −η δ_y1 = −0.1 × 0.148 = −0.0148
    Δw_h-y2 = −η δ_y2 h = −0.1 × (−0.052) × 0.599 = 0.0031148
    Δb_y2 = −η δ_y2 = −0.1 × (−0.052) = 0.0052

    w_x1-h,new = w_x1-h + Δw_x1-h = 0.3 + (−0.0005) = 0.2995
    w_x2-h,new = w_x2-h + Δw_x2-h = 0.5 + 0 = 0.5
    b_h,new = b_h + Δb_h = 0.1 + (−0.0005) = 0.0995
    w_h-y1,new = w_h-y1 + Δw_h-y1 = 0.2 + (−0.0088652) = 0.1911348
    b_y1,new = b_y1 + Δb_y1 = 0.6 + (−0.0148) = 0.5852
    w_h-y2,new = w_h-y2 + Δw_h-y2 = 0.2 + 0.0031148 = 0.2031148
    b_y2,new = b_y2 + Δb_y2 = 0.9 + 0.0052 = 0.9052
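The update step can be verified the same way. The sketch below (our own addition, continuing the variables from the forward-pass snippet above) applies the update rules; it uses the unrounded values of h, y1, and y2, so the results differ from the hand calculation only in the last decimal places:

    eta = 0.1
    t1, t2 = 0, 1

    # Error terms (deltas) for the output and hidden nodes
    d_y1 = y1 * (1 - y1) * (y1 - t1)                      # ~0.148
    d_y2 = y2 * (1 - y2) * (y2 - t2)                      # ~-0.052
    d_h  = h * (1 - h) * (w_h_y1 * d_y1 + w_h_y2 * d_y2)  # ~0.0046, rounded to 0.005 above

    # Gradient-descent updates: parameter += -eta * delta * input
    w_x1_h += -eta * d_h * x1   # 0.3 -> ~0.2995
    w_x2_h += -eta * d_h * x2   # 0.5 -> 0.5 (x2 is 0)
    b_h    += -eta * d_h        # 0.1 -> ~0.0995
    w_h_y1 += -eta * d_y1 * h   # 0.2 -> ~0.1911
    b_y1   += -eta * d_y1       # 0.6 -> ~0.5852
    w_h_y2 += -eta * d_y2 * h   # 0.2 -> ~0.2031
    b_y2   += -eta * d_y2       # 0.9 -> ~0.9052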