Final Exam A Semester 1 2023
The University of Sydney
School of Mathematics and Statistics
DATA1001/1901
Foundations of Data Science
June 2023
Lecturers: Di Warren
Time Allowed: Reading time — 10 minutes; Writing time — 1.5 hours
Exam Conditions:
This is a closed-book examination — no material permitted. Writing
is not permitted at all during reading time.
Family Name: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
SID: . . . . . . . . . . . . . . . . . . . . . . . . . . .
Other Names: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Seat Number: . . . . . . . . . . . . . . . . .
Please check that your examination paper is complete (23 pages) and indicate by signing below.
I have checked the examination paper and affirm it is complete.
Signature: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Date: . . . . . . . . . . . . . . . . . . . . . . . . .
This examination has two sections: Multiple Choice and Extended Answer.
The Multiple Choice Section is worth 50% of the total examination.
There are 20 questions. The questions are of equal value.
All questions may be attempted.
Answers to the Multiple Choice questions must be entered on
the Multiple Choice Answer Sheet before the end of the examination.
The Extended Answer Section is worth 50% of the total examination.
There are 3 questions. The questions are of equal value.
All questions may be attempted. Working must be shown.
Concept Sheet & Calculators: There is a concept sheet after the last
question in this booklet. Calculators may NOT be used.
THE QUESTION PAPER MUST NOT BE REMOVED FROM THE
EXAMINATION ROOM.
Marker’s use only
Multiple Choice Section
In each question, choose at most one option.
Your answers must be entered on the Multiple Choice Answer Sheet.
1.
What is a complexity that is commonly associated with data linkage of human subjects?
(a) Ensuring the privacy of participants
(b) Data wrangling
(c) Getting ethics approval
(d) All of the other answers
2.
Which of the following scenarios would most likely be conducted as a randomised controlled trial?
(a) An Australian clinical trial for a new drug
(b) Interviews for all new workers at Woolworths
(c) Feedback on a new teaching method
(d) A study of Sydney’s air pollution over 5 years
3.
What graphical summary could represent 1 qualitative variable and 1 quantitative variable?
(a) Q-Q plot
(b) Scatter plot
(c) Clustered bar chart
(d) Comparative boxplot
4.
A company decreases all their food prices by 2%. By how much will the mean and
standard deviation of food prices change, respectively?
(a) 2% and 4%
(b) 2% and 2%
(c) 0% and 2%
(d) 2% and 0%
5.
Given univariate, quantitative data, which of the following is impossible?
(a) Mean = -1
(b) Median = -1
(c) Standard deviation = -1
(d) Lower threshold = -1
6.
Which R command works out this area under the curve for $X \sim N(1, 2^2)$?
(a) pnorm(2,1,2)-pnorm(0,1,2)
(b) pnorm(2,1,2)-pnorm(-2,1,2)
(c) pnorm(2,1,4)-pnorm(0,1,4)
(d) pnorm(2)-pnorm(0)
7.
Measurement error is defined as follows: Individual measurement = exact value + chance
error + bias.
How could we estimate the chance error?
(a) Remove any outliers and calculate the RMS.
(b) Find the systematic error (related to the bias).
(c) Replicate the measurements under the same conditions, and calculate the standard
deviation.
(d) Find the exact value and bias, and subtract them from the individual measurements.
8.
Using just the following R output, which statement is correct?
lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
      1.467        1.315

cor(x,y)
[1] 0.7645729
(a) $\hat{x} = 1.467 + 1.315y$
(b) The scatter plot could have many shapes.
(c) The data fits well along a line of positive slope.
(d) As $x$ increases by 1 unit, $y$ increases by 1.467 units.
9.
Two variables $X$ and $Y$ have correlation 0.7. If we swap the data values of $X$ and $Y$,
and then subtract 0.1 from each value of $Y$, what is the correlation of the new variables?
(a) 0.5
(b) 0.6
(c) 0.7
(d) 0.8
10.
In linear regression, what is the mean of the gaps between the data points and the
regression line?
(a) Always zero
(b) The residuals
(c) The RMS Error
(d) The standard deviation
11.
When does the Prosecutor’s Fallacy occur?
(a) When it is assumed that the chance of evidence given innocence is the same as
innocence given evidence.
(b) When it is assumed that the chance of evidence given guilt is the same as evidence
given innocence.
(c) When it is assumed that the chance of evidence given innocence is the same as
evidence given guilt.
(d) When it is assumed that the chance of guilt given innocence is the same as evidence
given guilt.
12.
Suppose we toss a biased coin 10 times with P(head)=0.3 at every toss. The results of
each toss are independent of each other. What is the chance we get exactly 3 heads?
(a) $\binom{3}{10} 0.7^{3} \, 0.3^{7}$
(b) $\binom{10}{3} 0.3^{3}$
(c) $\binom{10}{3} 0.3^{10}$
(d) $\binom{10}{7} 0.3^{3} \, 0.7^{7}$
13.
Suppose we randomly draw 100 times from a box with replacement, and sum the results.
We then repeat this process many times and plot a simulation histogram of the sums.
For which box would we expect to see an approximately normal-shaped histogram?
(a) Box = 0,1
(b) Box = 1,2,3
(c) Box = 0,0,0,0,0,0,0,0,0,1
(d) All of the boxes
14.
In a survey to determine Sydney students’ opinions on student fees, what is NOT a
possible source of bias?
(a) A poorly worded question
(b) Surveying one statistics lecture
(c) A wealthy student in the survey group
(d) Conducting an online survey
15.
In a market research study, 100 people were given a sample of brand-name chips and
home-brand chips (in random order) and asked which they preferred in taste. 70 people
preferred the brand-name chips.
Let $p$ = P(preference for brand-name chips).
To test for no difference in preference between the two types of chips, what is the
appropriate null hypothesis?
(a) $H_0: p = 0$
(b) $H_0: p = 0.5$
(c) $H_0: p = 0.7$
(d) $H_0: p > 0.7$
16.
What does a p-value of 0.85 mean?
(a) The data is consistent with the null hypothesis.
(b) There is an 85% chance that the null hypothesis is true.
(c) There is a 15% chance that the alternative hypothesis is true.
(d) We should accept the null hypothesis with probability 0.15.
17.
The data in Milk.csv consists of the milk yield of 100 cows.
t.test(Milk,mu=11)

One Sample t-test

data:  Milk
t = 4.9291, df = 99, p-value = 3.323e-06
alternative hypothesis: true mean is not equal to 11
95 percent confidence interval:
 12.53485 14.60315
sample estimates:
mean of x
   13.569
What would be the conclusion of the hypothesis test $H_0: \mu = 13$ vs $H_1: \mu \neq 13$ when $\alpha = 0.05$?
(a) We should reject $H_0$.
(b) The data is consistent with $H_0$.
(c) The p-value is 0.000003 (6dp).
(d) Not enough information given.