Midterm Quiz 2

School

University of California, Berkeley

Course

6501

Subject

Computer Science

Date

May 27, 2024

Midterm Quiz 2 - Summer 2020
Midterm Quiz 2 - GT Students and Verified MM Learners
Due Jul 7, 2020 23:00 PDT (past due). 90-minute time limit.

Instructions
Work alone. Do not collaborate with or copy from anyone else.
You may use any of the following resources:
  • One sheet (both sides) of handwritten (not photocopied or scanned) notes
If any question seems ambiguous, use the most reasonable interpretation (i.e., don't be like Calvin).

Good luck!

This is the beginning of Midterm Quiz 2. Please make sure that you submit all your answers before the time runs out. Once you submit an answer to a question, you cannot change it. There is no overall Submit button. After submitting all answers, please click the "End my Exam" button, above, before exiting from ProctorTrack to complete your exam.

Information for Question 1
There are five questions labeled "Question 1." Answer all five questions.
For each of the following five questions, select the probability distribution that could best be used to model the described scenario. Each distribution might be used zero, one, or more than one time in the five questions.
These scenarios are meant to be simple and straightforward; if you're an expert in the field the question asks about, please do not rely on your expertise to fill in all the extra complexity (you'll end up making the questions below more difficult than I intended).
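As background for the Question 1 scenarios, the sketch below illustrates one standard link between two of the candidate distributions: when the gaps between events are exponential with rate lambda, the number of events in each unit interval is Poisson with mean lambda. The rate of 3 hits per minute, the seed, and the function name are illustrative assumptions, not values from the quiz.

```python
import random
import statistics

def counts_per_minute(rate, minutes, seed=0):
    """Simulate arrivals with exponential gaps; return the hit count for each whole minute."""
    rng = random.Random(seed)
    counts = [0] * minutes
    t = rng.expovariate(rate)          # time of the first arrival, in minutes
    while t < minutes:
        counts[int(t)] += 1            # the arrival falls in minute floor(t)
        t += rng.expovariate(rate)     # next gap is Exponential(rate)
    return counts

counts = counts_per_minute(rate=3.0, minutes=20000)
mean_count = statistics.fmean(counts)
var_count = statistics.pvariance(counts)
# For a Poisson count, the mean and the variance should both be close to the rate.
print(f"mean={mean_count:.3f}, variance={var_count:.3f}")
```

The near-equality of mean and variance is a quick empirical signature of Poisson counts; it does not by itself prove that real web traffic follows this model.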
Question 1 (1.4/1.4 points)
Number of hits to a real estate web site each minute
Answer: Poisson

Question 1 (1.4/1.4 points)
Time between hits on a real estate web site
Answer: Exponential

Question 1 (1.4/1.4 points)
Time from when a house is put on the market until the first offer is received
Answer: Weibull

Question 1 (1.4/1.4 points)
Time between people entering the ID-check queue at an airport
Answer: Exponential

Question 1 (1.4/1.4 points)
Number of faces correctly identified by deep learning (DL) software until an error is made
Answer: Geometric

Questions 2a, 2b (10.0/10.0 points)
Five classification models were built for predicting whether a neighborhood will soon see a large rise in home prices, based on public elementary school ratings and other factors. The training data set was missing the school rating variable for every new school (3% of the data points).
Because ratings are unavailable for newly-opened schools, it is believed that locations that have recently experienced high population growth are more likely to have missing school rating data.
  • Model 1 used imputation, filling in the missing data with the average school rating from the rest of the data.
  • Model 2 used imputation, building a regression model to fill in the missing school rating data based on other variables.
  • Model 3 used imputation, first building a classification model to estimate (based on other variables) whether a new school is likely to have been built as a result of recent population growth (or whether it has been built for another purpose, e.g. to replace a very old school), and then using that classification to select one of two regression models to fill in an estimate of the school rating; there are two different regression models (based on other variables), one for neighborhoods with new schools built due to population growth, and one for neighborhoods with new schools built for other reasons.
  • Model 4 used a binary variable to identify locations with missing information.
  • Model 5 used a categorical variable: first, a classification model was used to estimate whether a new school is likely to have been built as a result of recent population growth; and then each neighborhood was categorized as "data available", "missing, population growth", or "missing, other reason".

a. If school ratings can be reasonably well-predicted from the other factors, and new schools built due to recent population growth can be reasonably well-classified using the other factors, which model would you recommend?
  • Model 1
  • Model 2
  • Model 3
  • Model 4
  • Model 5

b. In which of the following situations would you recommend using Model 2? [All predictions and classifications below are using the other factors.]
  • Ratings can be well-predicted, and reasons for building schools can be well-classified.
  • Ratings can be well-predicted, and reasons for building schools cannot be well-classified.
  • Ratings cannot be well-predicted, and reasons for building schools can be well-classified.
  • Ratings cannot be well-predicted, and reasons for building schools cannot be well-classified.

Information for Question 3
In a diet problem (like we saw in the lessons and homework), let x_i be the amount of food i in the solution (x_i >= 0), and let M be the maximum amount that can be eaten of any food.
Suppose we added new variables y_i that are binary (i.e., they must be either 0 or 1): if food i is eaten in the solution, then it is part of the solution (y_i = 1); otherwise y_i = 0.
There are five questions labeled "Question 3." Answer all five questions.
For each of the following five questions, select the mathematical constraint that best corresponds to the English sentence. Each constraint might be used zero, one, or more than one time in the five questions.

Question 3 (1.4/1.4 points)
Select the mathematical constraint that corresponds to the following English sentence:
If any amount of cheese sauce is eaten, then its binary variable must be 1.

Question 3 (1.4/1.4 points)
Select the mathematical constraint that corresponds to the following English sentence:
Either peanut butter or cheese sauce, but not both, must be eaten.

Question 3 (1.4/1.4 points)
Select the mathematical constraint that corresponds to the following English sentence:
If neither cheese sauce nor peanut butter is eaten, then broccoli can't be eaten either.

Question 3 (0.0/1.4 points)
Select the mathematical constraint that corresponds to the following English sentence:
Cheese sauce must be eaten.
(Not attempted: 0 of 1 attempts used.)

Question 3 (1.4/1.4 points)
Select the mathematical constraint that corresponds to the following English sentence:
Broccoli can only be eaten if either cheese sauce or peanut butter (or both) is also eaten.

Question 4a (5.0/5.0 points)
A large company's internal IT helpdesk has created a stochastic discrete-event simulation model of its operations, including help-request arrivals, routing of requests to the appropriate staff member, and the amount of time needed to give assistance.
The helpdesk is not first-come-first-served. A more-important problem (like the spread of serious malware to many company computers) will be dealt with ahead of a less-important problem (like a ripped mouse pad), and a higher-level employee (like the CEO) might be helped ahead of a lower-level employee (like a worker in the mailroom).
When a new request for help comes in, the helpdesk will run the simulation to quickly give the requester an estimate of the expected wait time before being helped.
How many times does the company need to run the simulation for each new help request (i.e., how many replications are needed)?
  • Once, because the outcome will be the same each time
  • Many times, because of the variability and randomness
  • Once, because each patient is unique

Information for Question 4b
[Figure not reproduced.]
The figure above shows the average of the first x simulated wait times, as new replications ("runs") are run and added into the overall average. It is not showing the wait time just for each replication. For example, after x = 101 replications, the wait time of the 101st replication is not necessarily 72, but the average of those 101 replications is about 72.

Question 4b (4.0/5.0 points)
If the goal is to report the expected wait time to within +/- 2 minutes, what can you conclude from the figure above? Select all of the answers that are correct.
  • The simulation could have been stopped after 400 runs (replications).
  • The simulation could even have been stopped after 300 runs (replications).
  • The simulated wait time was 50 or less at least once out of all the runs (replications).
  • The expected wait time of simulated runs (replications) is likely to be between 65 and 75.
  • There is significant variability in the simulated wait time of the runs (replications).

Question 4c (6.0/6.0 points)
Suppose it is discovered that simulated wait times are 25% higher than actual wait times, on average. What would you recommend that they do?
  • Scale down all estimates by a factor of 1/1.25 to get the average simulation estimates to match the average actual wait times.
  • Investigate to see what's wrong with the simulation, because it's a poor match to reality.
  • Use the 25%-higher estimates, because that's what the simulation output is.
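To make the running-average idea behind Questions 4a and 4b concrete, here is a minimal sketch. The wait-time distribution (exponential with a 70-minute mean), the seed, and the helper names are assumptions chosen for illustration; the quiz does not describe the helpdesk's actual simulation.

```python
import math
import random
import statistics

def running_mean(values):
    """Return the average of the first k values for k = 1..len(values)."""
    out, total = [], 0.0
    for k, v in enumerate(values, start=1):
        total += v
        out.append(total / k)
    return out

def half_width_95(values):
    """Approximate 95% confidence half-width for the mean (normal approximation)."""
    return 1.96 * statistics.stdev(values) / math.sqrt(len(values))

rng = random.Random(1)
# Hypothetical replications: each run yields one simulated wait time in minutes.
waits = [rng.expovariate(1 / 70) for _ in range(2000)]
means = running_mean(waits)

for n in (10, 100, 400, 2000):
    print(n, round(means[n - 1], 1), round(half_width_95(waits[:n]), 1))
```

A single replication is one draw from a random output, so the average only settles as runs accumulate; the half-width shrinks roughly with the square root of the number of runs.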
Information for Question 5
For each of the optimization problems below, select its most precise classification. In each model, x are the variables, all other letters (a, b, c) refer to known data, and the values of c are all positive.
There are seven questions labeled "Question 5". Answer all seven questions. Each classification might be used zero, one, or more than one time in the seven questions.
[The formulas in these seven models were not captured; only the surrounding words remain.]

Question 5 (1.0/1.0 point)
Maximize [...] subject to [...] for all [...], all [...]
Answer: Integer program

Question 5 (1.0/1.0 point)
Minimize [...] subject to [...] for all [...], all [...]
Answer: Convex program

Question 5 (1.0/1.0 point)
Minimize [...] subject to [...] for all [...], all [...]
Answer: Convex quadratic program

Question 5 (1.0/1.0 point)
Minimize [...] sin [...] subject to [...] for all [...], all [...]
Answer: General non-convex program

Question 5 (1.0/1.0 point)
Maximize [...] subject to [...] for all [...], all [...]
Answer: General non-convex program

Question 5 (1.0/1.0 point)
Maximize [...] subject to [...] for all [...], all [...]
Answer: Linear program

Question 5 (1.0/1.0 point)
Minimize [...] log [...] subject to [...] for all [...], all [...]
Answer: Linear program

Questions 6a, 6b, 6c (8.0/12.0 points)
A medium-sized city is analyzing the size of its judicial system, specifically the number of judges it has available for hearing cases at different times of the year. At busy times (about 10% of the time), the arrival rate is 20 new cases ready for trial per day. At other times, the arrival rate is 10 new cases ready for trial per day. Once a judge is assigned to a case (at any time), it takes an average of 0.5 days to complete.
[NOTE: This is a very simplified version of the judicial system. If you have deeper knowledge of how the judicial system works, please do not use it for this question; you would end up making the question more complex than it is designed to be.]

a. The first model the city tries is a queuing model with 3 judges always available. What would you expect the queuing model to show?
  • Wait times are low at both busy and non-busy times.
  • Wait times are low at busy times and high at non-busy times.
  • Wait times are low at non-busy times and high at busy times.
  • Wait times are high at both busy and non-busy times.

b. The second model the city tries is a queuing model with 15 judges available during busy times and 7 judges available during non-busy times. What would you expect the queuing model to show?
  • Wait times are low at both busy and non-busy times.
  • Wait times are low at busy times and high at non-busy times.
  • Wait times are low at non-busy times and high at busy times.
  • Wait times are high at both busy and non-busy times.

The city now has decided that, when there are 20 cases waiting, the city will start to refer waiting cases to arbitrators who try to get the cases settled without a trial (arbitration also takes about 0.5 days per case). Once the arbitrators start hearing cases, the arbitrators continue to be assigned cases until no more cases are waiting.
The city would like to model this new process with a Markov chain, where each state is the number of cases waiting (e.g., 0 cases waiting, 1 case waiting, etc.). Notice that now, the transition probabilities from a state like "3 cases waiting" depend on whether the arbitrators are currently hearing cases, and therefore depend on whether the system was more recently in the state "20 cases waiting" or "0 cases waiting".

c. Which of the following statements about the process (the judicial system) and its relation to the Markov chain's memoryless property (previous states don't affect the probability of moving from one state to another) is true?
  • The process is memoryless, so the Markov chain is an appropriate model.
  • The process is memoryless and the Markov chain is an appropriate model only if the arrivals follow the Poisson distribution and the case durations follow the Exponential distribution.
  • The process is not memoryless, so the Markov chain model would not be well-defined.
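For parts (a) and (b), a quick capacity check is a useful first step: compare the arrival rate with the total service rate. The sketch below does that and, under the additional assumption of an M/M/c queue (Poisson arrivals and exponential service times, which the question does not state), estimates the mean wait with the Erlang C formula. The function names are illustrative.

```python
import math

def utilization(arrival_rate, mean_service_time, servers):
    """Offered load per server; values at or above 1 mean the queue grows without bound."""
    return arrival_rate * mean_service_time / servers

def erlang_c_wait(arrival_rate, mean_service_time, servers):
    """Mean time in queue for an M/M/c system; infinite when the system is overloaded."""
    a = arrival_rate * mean_service_time          # offered load in Erlangs
    rho = a / servers
    if rho >= 1:
        return math.inf
    tail = a**servers / math.factorial(servers) / (1 - rho)
    head = sum(a**k / math.factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)                 # probability an arrival must wait
    service_rate = 1 / mean_service_time
    return p_wait / (servers * service_rate - arrival_rate)

scenarios = [
    ("a, busy", 20, 3), ("a, non-busy", 10, 3),
    ("b, busy", 20, 15), ("b, non-busy", 10, 7),
]
for label, rate, judges in scenarios:
    rho = utilization(rate, 0.5, judges)
    wait = erlang_c_wait(rate, 0.5, judges)
    print(f"{label}: utilization={rho:.2f}, mean wait (days)={wait:.3f}")
```

Each judge completes about 2 cases per day, so 3 judges can absorb roughly 6 cases per day, while 15 and 7 judges can absorb roughly 30 and 14.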
Questions 7a, 7b (0.0/10.0 points)
A charity is testing two different mailings to see whether one generates more donations than another. The charity is using A/B testing: for each person on the charity's mailing list, the charity randomly selects one mailing or the other to send. The results after 2000 trials are shown below.

Option     Trials   Donation rate   95% confidence interval
Option A   1036     9.7%            7.9%-11.5%
Option B   964      5.2%            3.8%-6.6%

Note: The "donation rate" is the fraction of recipients who donate. Higher donation rates are better.

a. What should the charity do?
  • Switch to exploitation (utilize Option A only; A is clearly better)
  • Switch to exploitation (utilize Option B only; B is clearly better)
  • More exploration (test both options; it is unclear yet which is better)

Later, the charity developed 7 new options, so they used a multi-armed bandit approach where each option is chosen with probability proportional to its likelihood of being the best. The results after 2000 total trials are shown below.
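Returning briefly to part (a): the intervals reported for Options A and B are consistent with the usual normal-approximation interval for a proportion, p +/- 1.96 * sqrt(p(1 - p)/n). The sketch below recomputes them from the reported rates and trial counts; whether the charity used exactly this formula is an assumption.

```python
import math

def proportion_ci(rate, trials, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    half_width = z * math.sqrt(rate * (1 - rate) / trials)
    return rate - half_width, rate + half_width

ci_a = proportion_ci(0.097, 1036)
ci_b = proportion_ci(0.052, 964)
print([round(100 * v, 1) for v in ci_a])   # [7.9, 11.5] percent for Option A
print([round(100 * v, 1) for v in ci_b])   # [3.8, 6.6] percent for Option B
# Intervals that do not overlap point to a real difference between the mailings.
overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
print("intervals overlap:", overlap)       # intervals overlap: False
```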
Option      Donation rate   Mean donation   Median donation
Option #1   3.2%            $112            $100
Option #2   4.2%            $98             $75
Option #3   5.2%            $174            $125
Option #4   5.5%            $153            $100
Option #5   6.5%            $122            $80
Option #6   10.8%           $132            $100
Option #7   15.0%           $106            $75

b. If the charity's main goal is to find the option that has the highest mean donation, which type of test should they use to see if the option that appears best is significantly better than each of the other options?
  • Binomial-based test
  • Non-parametric test
  • Parametric test

Information for Question 8a
For each of the mathematical optimization models, select the variable-selection/regularization method it most precisely represents (or select "none of the above" if none of the other choices are appropriate). In each model, x is the data, y is the response, a are the coefficients, n is the number of data points, m is the number of predictors, and T and [symbol not captured] are appropriate constants.
There are four questions labeled "Question 8a". Answer all four questions. Each of the choices might be used zero, one, or more than one time in the four questions.
[The formulas in these four models were not captured; only the surrounding words remain.]

Question 8a (1.0/1.0 point)
Minimize [...] subject to [...]
Answer: Ridge regression

Question 8a (1.0/1.0 point)
Minimize [...] subject to [...]
Answer: Lasso regression

Question 8a (1.0/1.0 point)
Minimize [...] subject to [...]
Answer: Elastic net

Question 8a (1.0/1.0 point)
Minimize [...]
Answer: None of the above

Question 8b (4/4 points)
Rank the following regression and variable-selection/regularization methods from fewest variables selected to most variables selected. All four methods will be used (the bottom contains two equivalent spaces).
Submitted order, fewest to most:
  • Lasso regression
  • Elastic net
  • Ridge regression
  • Linear regression
Feedback: Correctly placed 4 items.
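The ranking above can be illustrated in a special case. Assuming orthonormal predictors and a penalty of lam * (alpha * L1 norm + (1 - alpha)/2 * squared L2 norm), which is an assumed parametrization since the quiz's own formulas were not captured, each coefficient has a closed form: soft-threshold the least-squares coefficient by lam * alpha, then divide by 1 + lam * (1 - alpha). The coefficient values below are made up.

```python
import math

def penalized_coefficients(ols_coefs, lam, alpha):
    """Closed-form solution per coefficient when predictors are orthonormal.

    alpha = 1 gives lasso, alpha = 0 gives ridge, values in between give elastic net.
    """
    result = []
    for z in ols_coefs:
        magnitude = abs(z) - lam * alpha                  # L1 part: soft-thresholding
        shrunk = math.copysign(magnitude, z) if magnitude > 0 else 0.0
        result.append(shrunk / (1 + lam * (1 - alpha)))   # L2 part: proportional shrinkage
    return result

def selected(coefs):
    """Number of predictors kept, i.e. coefficients that are not exactly zero."""
    return sum(1 for c in coefs if c != 0.0)

ols = [3.0, -1.2, 0.6, 0.25, -0.1]    # hypothetical least-squares coefficients
lam = 1.0
counts = {
    "lasso": selected(penalized_coefficients(ols, lam, alpha=1.0)),
    "elastic net": selected(penalized_coefficients(ols, lam, alpha=0.5)),
    "ridge": selected(penalized_coefficients(ols, lam, alpha=0.0)),
    "linear regression": selected(ols),
}
print(counts)  # {'lasso': 2, 'elastic net': 3, 'ridge': 5, 'linear regression': 5}
```

Ridge shrinks every coefficient but never sets one exactly to zero, which is why it ties with plain linear regression in the count of variables kept.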
Question 8c (6.0/6.0 points)
Select all of the following reasons that you might want to use stepwise regression, lasso, etc. to limit the number of factors in a model.
  • To find a simpler model
  • Because there isn't enough data to avoid overfitting a model with many factors
  • To find a more-complex model

Question 8d (3.0/3.0 points)
In the simple linear regression model [formula not captured]:
i. What are the variables from an optimization perspective?
ii. What are the variables from a regression perspective?
The answer choices for both parts had the form "Only [...]" or "Both [...] and [...]"; the symbols were not captured.

Question 8e (7/7 points)
Put the following seven steps in order, from what is done first to what is done last.
Submitted order:
  • Remove outliers
  • Impute missing data values
  • Scale data
  • Fit lasso regression model on all variables
  • Fit linear regression, regression tree, and random forest models using variables chosen by lasso regression
  • Pick model to use based on performance on a different data set
  • Test model on another different set of data to estimate quality
Feedback: Correctly placed 7 items.
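A small sketch of why the first three steps come in this order: outliers are dropped before imputation so they do not distort the fill-in value, and imputation precedes scaling so that every value gets scaled. The outlier rule (more than 5 median absolute deviations from the median), mean imputation, and min-max scaling are assumptions for illustration; the quiz does not specify particular methods.

```python
import statistics

def remove_outliers(values, k=5.0):
    """Drop observed values farther than k median absolute deviations from the median."""
    observed = [v for v in values if v is not None]
    center = statistics.median(observed)
    mad = statistics.median(abs(v - center) for v in observed)
    return [v for v in values if v is None or abs(v - center) <= k * mad]

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    fill = statistics.fmean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

raw = [1.0, 2.0, None, 3.0, 100.0]     # one missing value, one obvious outlier
cleaned = min_max_scale(impute_mean(remove_outliers(raw)))
print(cleaned)  # [0.0, 0.5, 0.5, 1.0]
```

Had the value 100.0 been kept, the imputed mean would have jumped from 2.0 to 26.5 and the scaled values would have been squeezed toward zero.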
There are five questions labeled "Question 9". Answer all five questions.
For each question, select the most appropriate model/approach to answer the question/analyze the situation described. Each model/approach might be used zero, one, or more than one time in the five questions.

Question 9 (1.4/1.4 points)
Does Lasik surgery significantly improve the median vision of people who get that surgery?
Answer: Non-parametric test

Question 9 (1.4/1.4 points)
How many servers are needed so database users don't need to wait too long for query processing?
Answer: Queuing

Question 9 (1.4/1.4 points)
Find sets of terrorists that have a lot of communication within each set.
Answer: Louvain algorithm

Question 9 (1.4/1.4 points)
What distinct sets of recipes can be identified where there are many ingredients shared within each set?
Answer: Louvain algorithm

Question 9 (0.0/1.4 points)
Estimate the number of workers required to work at a call center based on call arrivals and lengths.
Answer given: Stochastic optimization

This is the end of Midterm Quiz 2. Please make sure that you submit all your answers before the time runs out. Once you submit an answer to a question, you cannot change it. There is no overall Submit button. After submitting all answers, please click the "End my Exam" button, above, before exiting from ProctorTrack to complete your exam.