Machine Learning - Final Exam - 2019

December 17, 2019

The duration of the exam is 1 hour and 10 minutes. Good luck!

Question 1 - Short questions (20 points, equally distributed)

Please briefly answer the following questions. Be precise and concise in your answers. In the context of reinforcement learning:

1) What is a policy function?
2) What is a value function?
3) What is a Bellman equation?
4) Cite one advantage of TD (temporal difference) methods compared to MC (Monte Carlo) methods.
5) In model-free reinforcement learning, the policy improvement step is typically an ε-greedy improvement, as opposed to the greedy improvement used in dynamic programming. Why?
6) In dynamic programming and reinforcement learning in general, it is typical to discount future rewards by a parameter γ < 1. Please give at least one justification for discounting future rewards.
7) Explain the difference between SARSA and Q-learning.
8) Give an example of a potential application of reinforcement learning in finance.

Question 2 (20 points)

Please briefly explain the picture below.

Question 3 (20 points)

Consider a Markov reward process (MRP) with two states: the good state and the bad state. In the good state, the reward is 4, and in the bad state the reward is 2. The probability of transitioning from the good state to the bad state is 0.5. The probability of transitioning from the bad state to the good state is also 0.5. The time discount factor is γ = 0.5. Find the value function of this MRP.
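For a finite MRP like the one in Question 3, the value function can be obtained by solving the linear Bellman system V = R + γPV directly. The sketch below illustrates this with NumPy; the state ordering (good state first) and the convention that the reward is collected in the current state are assumptions on my part, not part of the exam.

import numpy as np

# Illustrative sketch: solve V = R + gamma * P @ V, i.e. V = (I - gamma * P)^{-1} R,
# for the two-state MRP of Question 3 (state 0 = good, state 1 = bad).
gamma = 0.5
P = np.array([[0.5, 0.5],   # transition probabilities from the good state
              [0.5, 0.5]])  # transition probabilities from the bad state
R = np.array([4.0, 2.0])    # reward collected in the good and bad states

V = np.linalg.solve(np.eye(2) - gamma * P, R)
print(V)  # value function [V(good), V(bad)]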
Question 4 (20 points)

As you saw in the last graded assignment, we can find the theoretical price of a stock by solving the Bellman equation:

V(s_t) = E_t[ R_t + γ V(s_{t+1}) ]

As in your homework, suppose you have a data set of realized rewards R_t and a sequence of visited states s_t stored in a buffer. You parametrize the value function using a neural network. Write a pseudo-algorithm to solve the Bellman equation.

Question 5 (20 points)

Q-learning and neural networks have been known for decades, but for a long time it was believed that using neural networks to represent value functions was unstable. That changed in 2014, when DeepMind showed that a single deep neural network could learn to play many Atari video games, achieving "super-human" performance on many of them. Two key ingredients that made training more stable are "replay buffers" and "fixed targets". Explain what replay buffers and fixed targets are, and how they help overcome the instability of learning value functions with neural networks.
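For reference, the sketch below shows one way a pseudo-algorithm of the kind asked for in Question 4 could be written as code, combined with the replay buffer and fixed target network discussed in Question 5. It is only a sketch under assumed choices: the PyTorch framework, the network size, and the hyperparameters are illustrative and are not taken from the exam or the homework.

import random
import torch
import torch.nn as nn

# Illustrative sketch: fit a neural-network value function V(s) to the Bellman
# equation V(s_t) = E_t[ R_t + gamma * V(s_{t+1}) ] from a buffer of
# (s_t, r_t, s_{t+1}) transitions, using minibatches sampled from a replay
# buffer and a fixed target network on the right-hand side.
gamma = 0.99
state_dim = 4  # assumed state dimension, for illustration only

value_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
target_net.load_state_dict(value_net.state_dict())  # fixed target starts as a copy

optimizer = torch.optim.Adam(value_net.parameters(), lr=1e-3)
replay_buffer = []  # filled elsewhere with (s, r, s_next) tuples (states as tensors)

def td_update(batch_size=32):
    """One stochastic regression step toward the fixed Bellman targets."""
    batch = random.sample(replay_buffer, batch_size)   # replay buffer: random minibatch
    s = torch.stack([b[0] for b in batch])
    r = torch.tensor([b[1] for b in batch], dtype=torch.float32).unsqueeze(1)
    s_next = torch.stack([b[2] for b in batch])

    with torch.no_grad():                              # fixed target: no gradient flows
        target = r + gamma * target_net(s_next)        # R_t + gamma * V_target(s_{t+1})
    loss = nn.functional.mse_loss(value_net(s), target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Every few hundred updates, refresh the fixed target:
# target_net.load_state_dict(value_net.state_dict())

Sampling transitions at random from the replay buffer breaks the correlation between consecutive samples, and holding the target network fixed between periodic copies keeps the regression target from moving at every gradient step; together these choices are what make this style of training noticeably more stable.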