School: University of Illinois, Urbana Champaign
Course: 553
Subject: Computer Science
Date: Jan 9, 2024
Machine Learning - Final Exam - 2019
The duration of the exam is 1 hour and 10 minutes. Good luck!
December 17, 2019
Question 1 - Short questions (20 points, equally distributed)
Please briefly answer the following questions.
Be precise and concise in your answer.
In the context of reinforcement learning:
1) What is a policy function?
2) What is a value function?
3) What is a Bellman equation?
4) Cite one advantage of TD (temporal difference) methods compared to MC (Monte Carlo).
5) In model-free reinforcement learning, the policy improvement step is typically an $\epsilon$-greedy improvement, as opposed to the greedy improvement used in dynamic programming. Why?
6) In dynamic programming and reinforcement learning in general, it is typical to discount future rewards by a parameter $\gamma < 1$. Please give at least one justification for discounting future rewards.
7) Explain the difference between SARSA and Q-learning.
8) Give an example of a potential application of reinforcement learning in finance.
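Items 5 and 7 above both come down to small differences in update rules, which a few lines of code can make concrete. The following is a minimal Python sketch, not a reference answer: the function names and the numbers in the usage lines are assumptions chosen for illustration. It shows an $\epsilon$-greedy action choice, the SARSA target (which bootstraps from the action actually taken next) and the Q-learning target (which bootstraps from the greedy next action).

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sarsa_target(reward, gamma, q_next, next_action):
    """On-policy target: bootstraps from the action actually taken next."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    """Off-policy target: bootstraps from the greedy (max) next action."""
    return reward + gamma * max(q_next)


# Hypothetical numbers, chosen only for illustration.
q_next = [1.0, 3.0]
print(sarsa_target(1.0, 0.5, q_next, next_action=0))  # 1.5
print(q_learning_target(1.0, 0.5, q_next))            # 2.5
```

The two targets coincide whenever the next action happens to be the greedy one; they differ only on exploratory steps.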
Question 2 (20 points)
Please briefly explain the picture below.
Question 3 (20 points)
Consider a Markov reward process (MRP) with two states: the good state and the bad state. In the good state, the reward is equal to 4, and in the bad state the reward is 2. The probability of transition from the good state to the bad state is 0.5. The probability of transitioning from the bad state to the good state is also 0.5. The time discount factor is $\gamma = 0.5$. Find the value function of this MRP.
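One way to check an answer to this question numerically is to iterate the Bellman equation until it stops changing. The sketch below assumes the convention $V(s) = R(s) + \gamma \sum_{s'} P(s, s') V(s')$, where the reward is collected in the current state; the function name `evaluate_mrp` and the state ordering (index 0 for good, index 1 for bad) are assumptions made for illustration.

```python
def evaluate_mrp(rewards, transitions, gamma, tol=1e-12, max_iters=10_000):
    """Iterate v <- r + gamma * P v until the largest change is below tol."""
    n = len(rewards)
    v = [0.0] * n
    for _ in range(max_iters):
        new_v = [
            rewards[s] + gamma * sum(transitions[s][t] * v[t] for t in range(n))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Index 0 is the good state, index 1 is the bad state (an assumed ordering).
rewards = [4.0, 2.0]
transitions = [[0.5, 0.5], [0.5, 0.5]]
v_good, v_bad = evaluate_mrp(rewards, transitions, gamma=0.5)
print(round(v_good, 6), round(v_bad, 6))  # 7.0 5.0
```

The same result follows by hand under this convention: adding the two Bellman equations gives $V_g + V_b = 6 + 0.5\,(V_g + V_b)$, so $V_g + V_b = 12$, and therefore $V_g = 4 + 0.25 \cdot 12 = 7$ and $V_b = 2 + 0.25 \cdot 12 = 5$.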
Question 4 (20 points)
As you have seen in the last graded assignment, we can find the theoretical price of a stock by solving the Bellman equation:
$$V(s_t) = E_t[R_t + \gamma V(s_{t+1})]$$
As in your homework, suppose you have a data set of realized rewards $R_t$ and a sequence of visited states $s_t$ stored in a buffer. You parametrize the value function using a neural network. Write a pseudo-algorithm to solve the Bellman equation.
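One possible shape for such an algorithm is semi-gradient TD(0) on transitions sampled from the buffer: sample $(s_t, R_t, s_{t+1})$, form the target $R_t + \gamma V_\theta(s_{t+1})$, hold it constant, and take a gradient step that moves $V_\theta(s_t)$ toward it. The sketch below is an illustration under loud assumptions, not the homework's method. A table of parameters `theta` stands in for the neural network so the example runs with the standard library alone, and the buffer is simulated from the two-state MRP of Question 3. All names are hypothetical.

```python
import random


def fit_value_function(buffer, n_states, gamma, lr=0.01, steps=40_000, seed=0):
    """Semi-gradient TD(0) on transitions sampled from a buffer.

    theta[s] plays the role of the network output V_theta(s); with a real
    neural network the update line becomes a gradient step on the same loss.
    """
    rng = random.Random(seed)
    theta = [0.0] * n_states
    for _ in range(steps):
        s, r, s_next = rng.choice(buffer)      # sample one stored transition
        target = r + gamma * theta[s_next]     # treated as a constant
        td_error = target - theta[s]
        theta[s] += lr * td_error              # step on 0.5 * td_error ** 2
    return theta


def simulate_buffer(n_transitions, seed=1):
    """Hypothetical data: the two-state MRP of Question 3 (0 = good, 1 = bad)."""
    rng = random.Random(seed)
    rewards = [4.0, 2.0]
    s, buffer = 0, []
    for _ in range(n_transitions):
        s_next = rng.randrange(2)  # both transition probabilities are 0.5
        buffer.append((s, rewards[s], s_next))
        s = s_next
    return buffer


buffer = simulate_buffer(5_000)
theta = fit_value_function(buffer, n_states=2, gamma=0.5)
print([round(x, 1) for x in theta])
```

Because the data are sampled, the fitted values only approximate the exact solution of the Bellman equation; a smaller learning rate or more data tightens the approximation.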
Question 5 (20 points)
Q-learning and neural networks have been known for decades, but for a long time it was believed that the
use of neural networks to represent value functions was unstable. That changed in 2014, when DeepMind
showed that a single deep neural network learned to play many Atari video games, achieving “superhuman”
performance on many of them. Two key ingredients that made training more stable are “replay buffers” and
“fixed targets”. Explain what replay buffers and fixed targets are, and how they help overcome the instability
of learning value functions with neural networks.
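The two ingredients can be illustrated in a few lines. The sketch below is a simplified stand-in, not DeepMind's implementation: a table `q` replaces the deep network, `q_target` is a frozen copy refreshed only every 20 steps, and the two-state task is hypothetical. The class and function names are assumptions made for illustration.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)  # oldest transitions drop out

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng):
        k = min(batch_size, len(self.storage))
        return rng.sample(list(self.storage), k)


def q_update_with_fixed_target(q, q_target, batch, gamma, lr):
    """Move q toward targets computed from the frozen copy q_target."""
    for s, a, r, s_next, done in batch:
        bootstrap = 0.0 if done else gamma * max(q_target[s_next])
        q[s][a] += lr * (r + bootstrap - q[s][a])


# Hypothetical task: state 0 always leads to state 1 with reward 0;
# in state 1, action a pays reward a and the episode ends.
buffer = ReplayBuffer(capacity=100)
for i in range(100):
    s, a = i % 2, (i // 2) % 2
    buffer.add((0, a, 0.0, 1, False) if s == 0 else (1, a, float(a), 1, True))

rng = random.Random(0)
q = [[0.0, 0.0], [0.0, 0.0]]
q_target = copy.deepcopy(q)
for step in range(1, 401):
    q_update_with_fixed_target(q, q_target, buffer.sample(8, rng), gamma=0.9, lr=0.1)
    if step % 20 == 0:
        q_target = copy.deepcopy(q)  # refresh the frozen copy only occasionally
print([[round(x, 2) for x in row] for row in q])
```

Sampling from the buffer breaks the correlation between consecutive transitions, and computing targets from the frozen copy keeps the regression target from moving at every update.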