Exam-3 Practice Exam


Final Exam CSCI 561 Fall 2022: Foundations of Artificial Intelligence

Instructions:
1. Maximum credits/points for this exam: 100 points.
2. No books (or any other material) are allowed.
3. All the questions in this exam are going to be auto-graded. This means that you should follow the instructions exactly when entering your results.
4. You are allowed to use a calculator.
5. Some questions have hints. Be sure to check them before solving the problem.
6. Adhere to the Academic Integrity Code.
7. Please make sure that you write the answers in the format discussed.

Problems (100% total):
1 - General AI Knowledge    18%
2 - Decision Trees          12%
3 - Neural Networks         15%
4 - Bayesian Networks       15%
5 - Probability Theory      15%
6 - HMM, Temporal Model     15%
7 - Naive Bayes             10%
1. True/False [18%]

For each of the statements below, fill in the bubble T if the statement is always and unconditionally true, or fill in the bubble F if it is always false, sometimes false, or just does not make sense:

1. Both deductive and inductive learning agents learn new rules/facts from a dataset.
2. Learning is useful as a system construction method, because we only need to expose the agents to reality without any manual input.
3. In the ID3 algorithm, we need to choose the attribute that has the largest expected information gain.
4. Both perceptron and decision tree learning can learn the majority function (output 1 if and only if more than half of n binary variables are 1) easily. It is representable within a perceptron and only needs a few branches in DTL (Decision Tree Learning).
5. The process of learning of a neural network happens in both the feed-forward (prediction) part and the back-propagation part.
6. The basic principles of deep learning are similar to those of basic neural networks, but deep learning has newer methods for larger datasets.
7. Probabilities of propositions may change with new evidence.
8. A complete probability model specifies every entry in the joint distribution for all the variables.
9. When calculating a probability distribution, normalization will be needed in the end to make the distribution sum to 1. However, even if you use the inference rules properly, the normalization may not be preserved.
10. Probabilistically speaking, two coin tosses are conditionally independent.
11. The reward for a probabilistic decision-making model can be given from states R(s), state-action pairs R(s, a), or transitions R(s, a, s').
12. The principle of MEU (Maximum Expected Utility) is that a rational agent should always choose the action that maximizes the utility.
13. The major difference between a POMDP (Partially Observable Markov Decision Process) and a general MDP (Markov Decision Process) is merely a sensor model P(e|s).
14. States transition randomly for Markov Chains and Hidden Markov Models.
15. In HMMs (Hidden Markov Models), there are two important independence properties. The first is that the future depends only on the present; the second is that observations are independent of each other.
16. The forward procedure computes all α_t(s_i) on the state trellis, while the Viterbi algorithm only computes the best for each step. (A sketch of this distinction follows the list.)
17. Discrete-valued dynamic Bayes nets are not HMMs (Hidden Markov Models).
18. For Bayesian learning, we are given a set of new data D and background knowledge X, and are supposed to predict a concept C where P(C|DX) is the most probable.
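To make the contrast in statement 16 concrete, here is a minimal, hedged Python sketch of the forward procedure versus the Viterbi recursion on a tiny two-state HMM. The model (states, prior, transition and emission tables, observation sequence) is a made-up placeholder and is not part of the exam; the only point is that the two recursions are identical except that the forward procedure sums over incoming paths while Viterbi takes the maximum.

```python
# Sketch only: forward procedure (sum over paths) vs. Viterbi (max over paths)
# on a tiny made-up 2-state HMM. All numbers are illustrative placeholders.

states = [0, 1]                      # hidden states s_1, s_2
prior = [0.6, 0.4]                   # P(X_1 = s_i)
trans = [[0.7, 0.3],                 # trans[i][j] = P(X_{t+1} = s_j | X_t = s_i)
         [0.4, 0.6]]
emit  = [[0.9, 0.1],                 # emit[i][e] = P(E_t = e | X_t = s_i)
         [0.2, 0.8]]
obs = [0, 0, 1]                      # observed evidence sequence e_1..e_T

# Forward procedure: alpha[t][i] = P(e_1..e_t, X_t = s_i),
# summing over every state sequence that ends in s_i.
alpha = [[prior[i] * emit[i][obs[0]] for i in states]]
for t in range(1, len(obs)):
    alpha.append([
        emit[j][obs[t]] * sum(alpha[t - 1][i] * trans[i][j] for i in states)
        for j in states
    ])

# Viterbi: delta[t][i] = probability of the single best state sequence ending
# in s_i at time t; the same recursion with max in place of sum.
delta = [[prior[i] * emit[i][obs[0]] for i in states]]
for t in range(1, len(obs)):
    delta.append([
        emit[j][obs[t]] * max(delta[t - 1][i] * trans[i][j] for i in states)
        for j in states
    ])

print("P(e_1..e_T) =", sum(alpha[-1]))               # forward: total probability
print("best-path probability =", max(delta[-1]))     # Viterbi: best single path
```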
2. Decision Trees [12%]

Lyft wants to analyze whether a student at USC gets a Lyft depending on whether it is raining around the university, whether the destination is near or far, and whether or not the ride was free. They have provided the training data below, and they need your help to train a machine to decide whether a student gets a Lyft.

Note: for calculations, always take digits up to 3 decimal places and drop the rest without rounding (e.g., 0.9737 becomes 0.973). For all the following questions, use log base 2. (Use Table 2.1 to answer Q1-3.)

(Table 2.1)
#    Rain  Free?  Near?  Takes Lyft
1    Yes   No     Yes    Yes
2    No    No     Yes    No
3    Yes   No     No     Yes
4    Yes   No     No     Yes
5    No    Yes    Yes    Yes
6    Yes   Yes    Yes    Yes
7    No    Yes    Yes    No
8    Yes   Yes    No     Yes
9    No    Yes    Yes    No
10   Yes   Yes    Yes    Yes
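For reference, the following is a minimal Python sketch (not part of the original exam) of the entropy and remainder calculations asked for in the questions below, with Table 2.1 hard-coded as lists; the helper names are made up for illustration. Note that the exam's rule of truncating intermediate results to three decimal places can shift the last digit relative to a full-precision computation like this one.

```python
import math

def truncate3(x):
    # Truncation rule stated in the problem: keep 3 decimals, drop the rest.
    return math.floor(x * 1000) / 1000

def entropy(labels):
    # Base-2 entropy (information content) of a list of class labels.
    n = len(labels)
    h = 0.0
    for v in set(labels):
        p = labels.count(v) / n
        h -= p * math.log2(p)
    return h

def remainder(attr_values, labels):
    # Expected entropy of the label after splitting on one attribute.
    n = len(labels)
    rem = 0.0
    for v in set(attr_values):
        subset = [lab for a, lab in zip(attr_values, labels) if a == v]
        rem += len(subset) / n * entropy(subset)
    return rem

# Table 2.1 encoded column by column (rows 1-10).
rain  = ["Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes"]
free  = ["No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"]
near  = ["Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", "No", "Yes", "Yes"]
takes = ["Yes", "No", "Yes", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes"]

print("H(Takes Lyft) =", truncate3(entropy(takes)))               # Q1
for name, col in [("Rain", rain), ("Free", free), ("Near", near)]:
    rem = remainder(col, takes)
    print(name, "remainder =", truncate3(rem),
          "gain =", truncate3(entropy(takes) - rem))               # Q2, Q3
```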
Q1. Calculate the information conveyed by the distribution of the Takes Lyft column, to 3 decimal places. [2%]
  1. 0.879
  2. 0.933
  3. 1
  4. 0.325

Q2. Which would be the best attribute to split on? (Assume this attribute to be X for further questions.) [4%]
  A. Rain
  B. Free
  C. Near

Q3. What is the value of Remainder(Free)? (Answer up to 3 decimal places.) [2%]
  a. 0.423
  b. 0.634
  c. 0.875
  d. 0

Q4. Assume that the Entropy and Remainder values for the given training data are as follows. (Use Table 2.2 to answer questions Q4 a. and Q4 b.)
  Entropy = 0.910
  Remainder(X) = 0.230
  Remainder(Y) = 0.510
  Remainder(Z) = 0.810

(Table 2.2)
SrNo  X      Y      Z      Is Correct?
1     TRUE   TRUE   FALSE  Yes
2     FALSE  TRUE   TRUE   No
3     FALSE  TRUE   TRUE   No
4     TRUE   FALSE  TRUE   Yes
5     TRUE   FALSE  TRUE   Yes
6     FALSE  FALSE  TRUE   No
7     TRUE   TRUE   TRUE   Yes
8     FALSE  FALSE  FALSE  No
9     TRUE   TRUE   FALSE  Yes

a. Which would be the worst attribute to split on for the given data? [2%]
  1. X
  2. Y
  3. Z

b. What output (Is Correct?) would the machine give after being trained (assuming the calculations are correct for the root node) for the test data where X = False, Y = False and Z = True? [Hint: The decision tree learned uses only one attribute.] [2%]
  1. Yes
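As a small follow-up sketch (again not part of the exam), the attribute comparison in Q4 reduces to information gain, Gain(A) = Entropy - Remainder(A), computed directly from the values given above:

```python
# Gain(A) = Entropy - Remainder(A), using the values stated in Q4.
entropy = 0.910
remainders = {"X": 0.230, "Y": 0.510, "Z": 0.810}
gains = {a: entropy - r for a, r in remainders.items()}
# gains are roughly X = 0.680, Y = 0.400, Z = 0.100 (floating point)
print("highest gain (best split): ", max(gains, key=gains.get))
print("lowest gain (worst split): ", min(gains, key=gains.get))
```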