
Final Exam A Semester 1 2023

The University of Sydney
School of Mathematics and Statistics

DATA1001/1901 Foundations of Data Science

June 2023
Lecturers: Di Warren

Time Allowed: Reading time: 10 minutes; Writing time: 1.5 hours

Exam Conditions: This is a closed-book examination: no material permitted. Writing is not permitted at all during reading time.

Family Name: ____________________  SID: ____________________
Other Names: ____________________  Seat Number: ____________________

Please check that your examination paper is complete (23 pages) and indicate by signing below.

I have checked the examination paper and affirm it is complete.

Signature: ____________________  Date: ____________________

This examination has two sections: Multiple Choice and Extended Answer.

The Multiple Choice Section is worth 50% of the total examination. There are 20 questions. The questions are of equal value. All questions may be attempted. Answers to the Multiple Choice questions must be entered on the Multiple Choice Answer Sheet before the end of the examination.

The Extended Answer Section is worth 50% of the total examination. There are 3 questions. The questions are of equal value. All questions may be attempted. Working must be shown.

Concept Sheet & Calculators: There is a concept sheet after the last question in this booklet. Calculators may NOT be used.

THE QUESTION PAPER MUST NOT BE REMOVED FROM THE EXAMINATION ROOM.

Marker’s use only
Multiple Choice Section

In each question, choose at most one option. Your answers must be entered on the Multiple Choice Answer Sheet.

1. What is a complexity that is commonly associated with data linkage of human subjects?
   (a) Ensuring the privacy of participants
   (b) Data wrangling
   (c) Getting ethics approval
   (d) All of the other answers

2. Which of the following scenarios would most likely be conducted as a randomised controlled trial?
   (a) An Australian clinical trial for a new drug
   (b) Interviews for all new workers at Woolworths
   (c) Feedback on a new teaching method
   (d) A study of Sydney’s air pollution over 5 years

3. What graphical summary could represent 1 qualitative variable and 1 quantitative variable?
   (a) Q-Q plot
   (b) Scatter plot
   (c) Clustered bar chart
   (d) Comparative boxplot

4. A company decreases all their food prices by 2%. By how much will the mean and standard deviation of food prices change, respectively?
   (a) 2% and 4%
   (b) 2% and 2%
   (c) 0% and 2%
   (d) 2% and 0%
5. Given univariate, quantitative data, which of the following is impossible?
   (a) Mean = -1
   (b) Median = -1
   (c) Standard deviation = -1
   (d) Lower threshold = -1

6. Which R command works out this area under the curve for $X \sim N(1, 2^2)$?
   (a) pnorm(2,1,2)-pnorm(0,1,2)
   (b) pnorm(2,1,2)-pnorm(-2,1,2)
   (c) pnorm(2,1,4)-pnorm(0,1,4)
   (d) pnorm(2)-pnorm(0)

7. Measurement error is defined as follows:
   Individual measurement = exact value + chance error + bias.
   How could we estimate the chance error?
   (a) Remove any outliers and calculate the RMS.
   (b) Find the systematic error (related to the bias).
   (c) Replicate the measurements under the same conditions, and calculate the standard deviation.
   (d) Find the exact value and bias, and subtract them from the individual measurements.
8. Using just the following R output, which statement is correct?

       lm(y~x)

       Call:
       lm(formula = y ~ x)

       Coefficients:
       (Intercept)            x
             1.467        1.315

       cor(x,y)
       [1] 0.7645729

   (a) $\hat{x} = 1.467 + 1.315y$
   (b) The scatter plot could have many shapes.
   (c) The data fits well along a line of positive slope.
   (d) As x increases by 1 unit, y increases by 1.467 units.

9. Two variables X and Y have correlation 0.7. If we swap the data values of X and Y, and then subtract 0.1 from each value of Y, what is the correlation of the new variables?
   (a) 0.5
   (b) 0.6
   (c) 0.7
   (d) 0.8

10. In linear regression, what is the mean of the gaps between the data points and the regression line?
   (a) Always zero
   (b) The residuals
   (c) The RMS Error
   (d) The standard deviation
11. When does the Prosecutor’s Fallacy occur?
   (a) When it is assumed that the chance of evidence given innocence is the same as innocence given evidence.
   (b) When it is assumed that the chance of evidence given guilt is the same as evidence given innocence.
   (c) When it is assumed that the chance of evidence given innocence is the same as evidence given guilt.
   (d) When it is assumed that the chance of guilt given innocence is the same as evidence given guilt.

12. Suppose we toss a biased coin 10 times with P(head) = 0.3 at every toss. The results of each toss are independent of each other. What is the chance we get exactly 3 heads?
   (a) $\binom{3}{10} 0.7^{3} 0.3^{7}$
   (b) $\binom{10}{3} 0.3^{3}$
   (c) $\binom{10}{3} 0.3^{10}$
   (d) $\binom{10}{7} 0.3^{3} 0.7^{7}$

13. Suppose we randomly draw 100 times from a box with replacement, and sum the results. We then repeat this process many times and plot a simulation histogram of the sums. For which box would we expect to see an approximately normal-shaped histogram?
   (a) Box = 0,1
   (b) Box = 1,2,3
   (c) Box = 0,0,0,0,0,0,0,0,0,1
   (d) All of the boxes

14. In a survey to determine Sydney students’ opinions on student fees, what is NOT a possible source of bias?
   (a) A poorly worded question
   (b) Surveying one statistics lecture
   (c) A wealthy student in the survey group
   (d) Conducting an online survey
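The sampling procedure described in Question 13 (draw 100 times with replacement, sum, repeat many times, plot the sums) can be sketched in R roughly as follows. This is only an illustrative sketch and not part of the original paper: the helper name `simulate_box_sums`, the number of repetitions (10000), the seed, and the choice of box are all assumptions made for the example.

```r
# Hypothetical helper (not from the exam): simulate the sum of
# n_draws draws with replacement from a box, repeated n_reps times.
simulate_box_sums <- function(box, n_draws = 100, n_reps = 10000) {
  replicate(n_reps, sum(sample(box, size = n_draws, replace = TRUE)))
}

set.seed(1)          # arbitrary seed, only for reproducibility
box <- c(1, 2, 3)    # one of the boxes listed in Question 13
sums <- simulate_box_sums(box)

# Simulation histogram of the sums
hist(sums, breaks = 30,
     main = "Sums of 100 draws from the box", xlab = "Sum")

# The average simulated sum should sit near n_draws * mean(box)
mean(sums)
100 * mean(box)
```

Changing `box` to one of the other boxes in the question lets the histogram shapes be compared directly.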
15. In a market research study, 100 people were given a sample of brand-name chips and home-brand chips (in random order) and asked which they preferred in taste. 70 people preferred the brand-name chips. Let p = P(preference for brand-name chips). To test for no difference in preference between the two types of chips, what is the appropriate null hypothesis?
   (a) $H_0: p = 0$
   (b) $H_0: p = 0.5$
   (c) $H_0: p = 0.7$
   (d) $H_0: p > 0.7$

16. What does a p-value of 0.85 mean?
   (a) The data is consistent with the null hypothesis.
   (b) There is an 85% chance that the null hypothesis is true.
   (c) There is a 15% chance that the alternative hypothesis is true.
   (d) We should accept the null hypothesis with probability 0.15.

17. The data in Milk.csv consists of the milk yield of 100 cows.

       t.test(Milk,mu=11)

               One Sample t-test

       data:  Milk
       t = 4.9291, df = 99, p-value = 3.323e-06
       alternative hypothesis: true mean is not equal to 11
       95 percent confidence interval:
        12.53485 14.60315
       sample estimates:
       mean of x
          13.569

   What would be the conclusion of the hypothesis test $H_0: \mu = 13$ vs $H_1: \mu \neq 13$ when $\alpha = 0.05$?
   (a) We should reject $H_0$.
   (b) The data is consistent with $H_0$.
   (c) The p-value is 0.000003 (6dp).
   (d) Not enough information given.