Final Exam Practice A Solution
School: Iowa State University
Course: 421
Subject: Statistics
Date: Jan 9, 2024
Pages: 17
Uploaded by hengxu2017
Stat 421: Final Exam Formula Sheet
$z_{.005} = 2.576$, $z_{.025} = 1.96$, $z_{.05} = 1.645$
1. $\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$
   $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$
2. $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
   $\hat{V}(\bar{y}) = \frac{s^2}{n}\left(1 - \frac{n}{N}\right)$
3. $\hat{t} = N\bar{y}$
   $\hat{V}(\hat{t}) = N^2\,\hat{V}(\bar{y})$
4. $\hat{p} = \bar{y}$
   $\hat{V}(\hat{p}) = \frac{\hat{p}(1-\hat{p})}{n-1}\left(1 - \frac{n}{N}\right)$
5. $n = \frac{z_{\alpha/2}^2 s^2}{e^2 + \frac{z_{\alpha/2}^2 s^2}{N}}$
   $n = \frac{z_{\alpha/2}^2 N^2 s^2}{e^2 + z_{\alpha/2}^2 N s^2}$
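6. $\hat{t}_\psi = \frac{1}{n}\sum_{i \in A} Q_i \frac{y_i}{\psi_i}$
   $\hat{V}(\hat{t}_\psi) = \frac{1}{n(n-1)}\sum_{i \in A} Q_i \left[\frac{y_i}{\psi_i} - \hat{t}_\psi\right]^2$
   $\hat{\bar{y}}_\psi = \frac{\hat{t}_\psi}{N}$
   $\hat{V}(\hat{\bar{y}}_\psi) = \frac{\hat{V}(\hat{t}_\psi)}{N^2}$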
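7. $\hat{t}_{str} = \sum_{h=1}^{H} \hat{t}_h = \sum_{h=1}^{H} N_h \bar{y}_h$
   $\hat{V}(\hat{t}_{str}) = \sum_{h=1}^{H}\left(1 - \frac{n_h}{N_h}\right) N_h^2 \frac{s_h^2}{n_h}$
   $\bar{y}_h = \frac{1}{n_h}\sum_{j=1}^{n_h} y_{hj}$
   $\hat{t}_h = \frac{N_h}{n_h}\sum_{j=1}^{n_h} y_{hj} = N_h \bar{y}_h$
   $s_h^2 = \frac{1}{n_h - 1}\sum_{j=1}^{n_h}(y_{hj} - \bar{y}_h)^2$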
8. $\hat{p}_{str} = \sum_{h=1}^{H} \frac{N_h}{N}\hat{p}_h$
   $\hat{V}(\hat{p}_{str}) = \sum_{h=1}^{H}\left(1 - \frac{n_h}{N_h}\right)\left(\frac{N_h}{N}\right)^2 \frac{\hat{p}_h(1-\hat{p}_h)}{n_h - 1}$
9. $\hat{B} = \bar{y}/\bar{x}$
   $\hat{V}(\hat{B}) = \left(1 - \frac{n}{N}\right)\frac{s_e^2}{n\,\bar{x}_U^2}$
   $e_i = y_i - x_i\hat{B}$
   $s_e^2 = \frac{1}{n-1}\sum_{i=1}^{n}(e_i - \bar{e})^2$
10. $\hat{\bar{y}}_r = \hat{B}\,\bar{x}_U$
    $\hat{V}(\hat{\bar{y}}_r) = \bar{x}_U^2\,\hat{V}(\hat{B})$
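11. $\hat{B}_1 = \frac{s_{xy}}{s_x^2} = \frac{r s_y}{s_x}$
    $\hat{B}_0 = \bar{y} - \hat{B}_1\bar{x}$
    $s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
    $r = \frac{s_{xy}}{s_x s_y}$
    $\hat{\bar{y}}_{reg} = \bar{y} + \hat{B}_1(\bar{x}_U - \bar{x}) = \hat{B}_0 + \hat{B}_1\bar{x}_U$
    $\hat{V}(\hat{\bar{y}}_{reg}) = \left(1 - \frac{n}{N}\right)\frac{s_e^2}{n}$
    $e_i = y_i - \hat{B}_0 - x_i\hat{B}_1$
    $s_e^2 = \frac{1}{n-1}\sum_{i=1}^{n} e_i^2$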
12. [Stratified allocation and sample-size formulas; the equations did not survive extraction. Surviving symbols: $n$, $n_h$, $N_h$, $S_h$, $c_h$, $z_{\alpha/2}$, $e$, $H$.]
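13. $\bar{y}_d = \hat{B} = \frac{\bar{u}}{\bar{x}}$
    $\hat{V}(\bar{y}_d) = \left(1 - \frac{n}{N}\right)\frac{1}{n\,\bar{x}_U^2}\,\frac{\sum_{i=1}^{n}(u_i - \hat{B}x_i)^2}{n-1}$
    $\bar{y}_d = \frac{1}{n_d}\sum_{i \in S_d} y_i$
    $\hat{V}(\bar{y}_d) \simeq \left(1 - \frac{n}{N}\right)\frac{s_{yd}^2}{n_d}$
    $s_{yd}^2 = \frac{1}{n_d - 1}\sum_{i \in S_d}(y_i - \bar{y}_d)^2$
    $\hat{t}_d = N_d\,\bar{y}_d$
    $\hat{t}_d = N\bar{u}$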
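14. $\hat{\bar{y}}_{post} = \sum_{h=1}^{H}\frac{N_h}{N}\bar{y}_{hR}$
    $\hat{V}(\hat{\bar{y}}_{post}) \simeq \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)^2\left(1 - \frac{n_{hR}}{N_h}\right)\left(\frac{s_{hR}^2}{n_{hR}}\right)$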
15. $\hat{t}_{unb} = \frac{N}{n}\sum_{i=1}^{n} t_i$
    $\hat{V}(\hat{t}_{unb}) = N^2\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n}$
    $s_t^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(t_i - \frac{\hat{t}_{unb}}{N}\right)^2$
    $\hat{\bar{y}}_{unb} = \frac{\hat{t}_{unb}}{K}$
    $\hat{V}(\hat{\bar{y}}_{unb}) = \frac{\hat{V}(\hat{t}_{unb})}{K^2}$
    $\hat{\bar{y}}_r = \frac{\sum_{i=1}^{n} t_i}{\sum_{i=1}^{n} M_i}$
    $\hat{V}(\hat{\bar{y}}_r) = \left(1 - \frac{n}{N}\right)\frac{1}{n\,\bar{M}_U^2}\,\frac{\sum_{i=1}^{n}(t_i - \hat{\bar{y}}_r M_i)^2}{n-1} = \left(1 - \frac{n}{N}\right)\frac{1}{n\,\bar{M}_U^2}\,\frac{\sum_{i=1}^{n} M_i^2(\bar{y}_i - \hat{\bar{y}}_r)^2}{n-1}$
    $\hat{t}_r = K\,\hat{\bar{y}}_r$
    $\hat{V}(\hat{t}_r) = K^2\,\hat{V}(\hat{\bar{y}}_r)$
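16. $\hat{t}_i = M_i\bar{y}_i$
    $\bar{y}_i = \frac{1}{m_i}\sum_{j=1}^{m_i} y_{ij}$
    $\hat{t}_{unb} = \frac{N}{n}\sum_{i=1}^{n}\hat{t}_i$
    $\hat{V}(\hat{t}_{unb}) = N^2\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n} + \frac{N}{n}\sum_{i=1}^{n}\left(1 - \frac{m_i}{M_i}\right)M_i^2\frac{s_i^2}{m_i}$
    $s_t^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\hat{t}_i - \frac{\hat{t}_{unb}}{N}\right)^2$
    $\hat{\bar{y}}_r = \frac{\sum_{i=1}^{n} M_i\bar{y}_i}{\sum_{i=1}^{n} M_i}$
    $\hat{V}(\hat{\bar{y}}_r) = \left(\frac{1}{\bar{M}^2}\right)\left[\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n} + \frac{1}{nN}\sum_{i=1}^{n}\left(1 - \frac{m_i}{M_i}\right)M_i^2\frac{s_i^2}{m_i}\right]$
    $s_r^2 = \frac{\sum_{i=1}^{n}(M_i\bar{y}_i - M_i\hat{\bar{y}}_r)^2}{n-1} = \frac{\sum_{i=1}^{n} M_i^2(\bar{y}_i - \hat{\bar{y}}_r)^2}{n-1}$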
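The two sample-size expressions in formula 5 can be checked numerically. The sketch below assumes the first expression targets a margin of error $e$ for a mean and the second targets a margin of error for a total (so the two agree when the total's margin is $N$ times the mean's). That reading is an assumption, as are the function names and the input values, which are hypothetical.

```python
import math

def n_for_mean(z, s2, e, N):
    """First expression in formula 5: z^2 s^2 / (e^2 + z^2 s^2 / N)."""
    return (z ** 2 * s2) / (e ** 2 + z ** 2 * s2 / N)

def n_for_total(z, s2, e, N):
    """Second expression in formula 5: z^2 N^2 s^2 / (e^2 + z^2 N s^2)."""
    return (z ** 2 * N ** 2 * s2) / (e ** 2 + z ** 2 * N * s2)

# Hypothetical inputs: 95% confidence, s^2 = 4, N = 1000, margin 0.5 on the mean.
z, s2, N, e_mean = 1.96, 4.0, 1000, 0.5

n_mean = n_for_mean(z, s2, e_mean, N)
n_total = n_for_total(z, s2, N * e_mean, N)  # same precision expressed on the total scale

# Round up, since a sample size must be a whole number of units.
n_required = math.ceil(n_mean)
print(n_required)
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.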
Stat 421 – Spring 2016
Name: _____________________________
Final Exam, May 5, 2016
This exam has 3 "short answer" questions, each with multiple parts, and 5 True/False questions. The questions cover pages 5-16 of this packet. Please show your work, and remember to include units where relevant. For the short answer computations, if you include a complete formula with numbers, you do not need to complete the calculation.
1. A real estate company wants to understand the composition of houses in a community with a total of 40 houses. The real estate company selects an SRSWOR of 4 houses. For the sampled houses, the real estate company collects information on the size of the garage and the number of bedrooms in the house. The table below contains the collected data. [30 points]
Sample House ID # | Garage size (square feet) | Number of Bedrooms
1 | 384 | 2
2 | 308 | 1
3 | 484 | 2
4 | 576 | 4
a. First, the real estate company wants to estimate the average number of bedrooms among houses with a garage size of at least 400 square feet.
(i) Define the domain $U_d$ of interest in words. [3 points]
Houses with a garage size of at least 400 square feet.
(ii) Provide a mathematical expression for the domain population parameter of interest. Define the meanings of the symbols used in your expression, including the definition of $y_i$. [3 points]
$\bar{y}_{U_d} = \frac{1}{N_d}\sum_{i \in U_d} y_i$, where $N_d$ is the number of elements in $U_d$, and $y_i$ is the number of bedrooms in house $i$.
(iii) Record the values for the following terms. [3 points]
$N = 40$
$n = 4$
$n_d = 2$
(iv) Fill in the columns in the table below for the variables $x_i$ and $u_i$. [6 points]
Sample House ID # | Garage size (square feet) | Number of Bedrooms | $x_i$ | $u_i$
1 | 384 | 2 | 0 | 0
2 | 308 | 1 | 0 | 0
3 | 484 | 2 | 1 | 2
4 | 576 | 4 | 1 | 4
(v) Estimate the average number of bedrooms among households with a garage size of at least 400 square feet. [5 points]
$\bar{y}_d = \frac{1}{n_d}\sum_{i \in A_d} y_i = \frac{2 + 4}{2} = 3$ bedrooms/house
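(vi) Estimate the standard error of the estimator used in (v). [5 points]
With $s_{yd}^2 = \frac{(2-3)^2 + (4-3)^2}{2-1} = 2$,
$\hat{V}(\bar{y}_d) = \left(1 - \frac{n}{N}\right)\frac{s_{yd}^2}{n_d} = \left(1 - \frac{4}{40}\right)\frac{2}{2}$
$\widehat{SE}(\bar{y}_d) = \sqrt{\left(1 - \frac{4}{40}\right)} = 0.95$ bedrooms/house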
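The arithmetic in (v) and (vi) can be reproduced with a short script. This is a minimal sketch using the four sampled houses from the table and the direct domain formulas from item 13 of the formula sheet; the variable names are illustrative only.

```python
import math

N, n = 40, 4                      # population and sample sizes
garage = [384, 308, 484, 576]     # garage size (square feet)
bedrooms = [2, 1, 2, 4]           # y_i: number of bedrooms

# Sampled houses in the domain: garage size of at least 400 square feet.
y_d = [y for g, y in zip(garage, bedrooms) if g >= 400]
n_d = len(y_d)

# Domain sample mean and sample variance.
ybar_d = sum(y_d) / n_d
s2_yd = sum((y - ybar_d) ** 2 for y in y_d) / (n_d - 1)

# Approximate variance and standard error with the finite population correction.
var_ybar_d = (1 - n / N) * s2_yd / n_d
se_ybar_d = math.sqrt(var_ybar_d)

print(ybar_d, round(se_ybar_d, 2))
```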
(vii) Now, the real estate company wants to estimate the total number of bedrooms for the households that have a garage size of at least 400 square feet. The real estate company does not know the total number of households that have a garage size of at least 400 square feet. Using the variable $u_i$ defined in (iv), estimate the total number of bedrooms among households that have a garage size of at least 400 square feet. [5 points]
$\hat{t}_d = \frac{N}{n}\sum_{i \in A} u_i = \frac{40}{4}(2 + 4) = 60$ bedrooms in houses with garages that are at least 400 square feet
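The indicator construction from (iv) ties parts (v) and (vii) together: $x_i$ flags domain membership, $u_i = x_i y_i$, the domain total is $N\bar{u}$, and the domain mean is the ratio $\bar{u}/\bar{x}$. A minimal sketch with the sampled data (variable names are illustrative):

```python
N, n = 40, 4
garage = [384, 308, 484, 576]     # garage size (square feet)
bedrooms = [2, 1, 2, 4]           # y_i

# x_i = 1 if house i is in the domain, 0 otherwise; u_i = x_i * y_i.
x = [1 if g >= 400 else 0 for g in garage]
u = [xi * yi for xi, yi in zip(x, bedrooms)]

# Domain total without knowing N_d: t_d = (N / n) * sum(u_i) = N * u_bar.
t_d_hat = N / n * sum(u)

# Domain mean as a ratio estimator: B = u_bar / x_bar.
u_bar = sum(u) / n
x_bar = sum(x) / n
ybar_d = u_bar / x_bar

print(x, u, t_d_hat, ybar_d)
```

The ratio form gives the same domain mean as the direct average of the two domain houses, which is why the formula sheet lists both.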
2. A soil scientist wants to estimate the average water erosion (soil loss due to rainfall; units of tons/acre) on crop fields in a region. She selects an SRSWOR of $n = 100$ crop fields from the $N = 1000$ crop fields in the population. Many of the operators of the sampled crop fields refuse to participate in the survey. She decides to use post-stratification to adjust for nonresponse. The post-strata are the two groups defined by whether or not corn was grown in the field last year. The table below contains the population size, respondent sample size, sample mean of the respondents, and sample standard deviation of the respondents for each post-stratum. [20 points]
Post-stratum ($h$) | Number in population ($N_h$) | Number of respondents ($n_{hR}$) | Mean erosion for respondents ($\bar{y}_{hR}$) (tons/acre) | Standard deviation of erosion for respondents ($s_{hR}$)
Corn not grown ($h = 1$) | 250 | 15 | 2 | 1
Corn grown ($h = 2$) | 750 | 35 | 4 | 2
a.
Why might the simple mean of the 50 respondents be a biased estimator of the
overall mean erosion in this region? Provide two reasons. [2 points]
The probability of responding might be related to the level of erosion in the field.
For example, if farmers with higher erosion are less likely to respond to the survey,
then we would expect the simple mean to have a negative bias for the population
mean.
b. Give a formula for the post-stratified estimator of the mean erosion in the whole population, and define the symbols used in your formula. [6 points]
$\hat{\bar{y}}_{post} = \sum_{h=1}^{H}\frac{N_h}{N}\bar{y}_{hR}$, where $N_h$ is the population size for stratum $h$, $N$ is the total population size, and $\bar{y}_{hR}$ is the mean of respondents in stratum $h$.
c. Using the formula from (b), estimate the mean erosion in this region using post-stratification. (Remember to include units) [5 points]
$\hat{\bar{y}}_{post} = \frac{250}{1000}(2) + \frac{750}{1000}(4) = 3.5$ tons/acre
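The estimate in (c) can be verified from the table values. The sketch below also evaluates the approximate variance from item 14 of the formula sheet; the exam does not ask for that variance, so it is included only as an illustration of the formula. Variable names are illustrative.

```python
import math

# Post-strata: h = 1 (corn not grown), h = 2 (corn grown).
N_h = [250, 750]          # population counts
n_hR = [15, 35]           # respondent counts
ybar_hR = [2.0, 4.0]      # respondent means (tons/acre)
s_hR = [1.0, 2.0]         # respondent standard deviations
N = sum(N_h)

# Post-stratified mean: sum over h of (N_h / N) * ybar_hR.
ybar_post = sum(Nh / N * yb for Nh, yb in zip(N_h, ybar_hR))

# Approximate variance from formula 14 (not requested on the exam).
var_post = sum(
    (Nh / N) ** 2 * (1 - nh / Nh) * s ** 2 / nh
    for Nh, nh, s in zip(N_h, n_hR, s_hR)
)
se_post = math.sqrt(var_post)

print(ybar_post)
```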
d. What is the weight for the post-stratified estimator for a respondent that grew corn last year? [3 points]
$\frac{N_h}{n_{hR}} = \frac{750}{35} = 21.43$
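One way to see why $N_h / n_{hR}$ is the right weight: the weights of all respondents sum to the population size, and the weighted respondent mean reproduces the post-stratified estimate of 3.5 tons/acre. A minimal sketch, assuming only the table values (individual responses are not given, so each stratum's respondent total is taken as $n_{hR}\bar{y}_{hR}$):

```python
import math

N_h = {"corn not grown": 250, "corn grown": 750}
n_hR = {"corn not grown": 15, "corn grown": 35}
ybar_hR = {"corn not grown": 2.0, "corn grown": 4.0}

# Post-stratification weight for every respondent in stratum h.
weights = {h: N_h[h] / n_hR[h] for h in N_h}

# Sum of weights over all respondents; should recover N = 1000.
total_weight = sum(weights[h] * n_hR[h] for h in N_h)

# Weighted mean of respondents; stratum respondent total = n_hR * ybar_hR.
ybar_weighted = sum(weights[h] * n_hR[h] * ybar_hR[h] for h in N_h) / total_weight

print(round(weights["corn grown"], 2))
```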
e. What assumption justifies the use of post-stratification to adjust for nonresponse? [4 points]
The distribution of soil erosion is the same for the responding and nonresponding portions of the population within each post-stratum, where the post-strata are defined by fields where corn was grown and fields where corn was not grown.
3. A researcher from the department of education in Rhode Island is interested in the smoking behaviors among high school students in Rhode Island. Rhode Island has a total of 45,000 high school students enrolled in a total of 60 high schools. The researcher selects an SRSWOR of 2 high schools from the 60 high schools in Rhode Island. From each selected high school, the researcher selects an SRSWOR of 10% of the students in the high school. That is, for both sampled high schools, $m_i / M_i = 0.1$. The researcher asks each sampled student, "Do you smoke cigarettes?" The table below summarizes the collected data for the 2 high schools in the sample. [40 points]
Sample high school ID # | $M_i$ | $\hat{t}_i$
1 | 400 | 100
2 | 1000 | 200
a.
What is the target population?[2 points]
High school students in Rhode Island
b. Define the variable of interest, $y_{ij}$, for student $j$ in high school $i$. [2 points]
$y_{ij} = 1$ if the student smokes cigarettes, 0 otherwise
c.
Does this study have potential sources of measurement error? Explain.[2 points]
Yes, the student may lie; the question is also not precise in terms of specifying the
frequency of smoking that is of interest.
d. Provide numeric values for the following symbols. [7 points]
$N = 60$
$K = 45{,}000$
$n = 2$
$m_1 = 40$
$\pi_1 = 2/60$
$\pi_{j|1} = 0.1$
$\pi_{1,j} = 2/600$
e.
Is this a self-weighting design? Why, or why not? [2 points]
Yes, the weight is 300 for all sampled units.
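The values in (d) and the weight in (e) follow from multiplying the two stage-wise inclusion probabilities. A minimal sketch using exact fractions (variable names are illustrative):

```python
from fractions import Fraction

N, n = 60, 2                      # schools in population, schools sampled
M = [400, 1000]                   # students in each sampled school
second_stage = Fraction(1, 10)    # m_i / M_i for every sampled school

# Stage 1: probability a given school is selected under SRSWOR.
pi_school = Fraction(n, N)

# Stage 2: probability a student is selected, given the school is selected.
pi_student_given_school = second_stage

# Overall inclusion probability and sampling weight for any sampled student.
pi_student = pi_school * pi_student_given_school
weight = 1 / pi_student

# Second-stage sample sizes m_i.
m = [int(second_stage * Mi) for Mi in M]

print(pi_student, weight, m)
```

Because $m_i / M_i$ is the same in every school, the overall inclusion probability does not depend on $i$, which is what makes the design self-weighting.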
f. Now, we will estimate the total number of high school students in Rhode Island who smoke cigarettes using the unbiased estimator of the total.
(i) Give a mathematical formula for the population parameter of interest and define symbols used that are not given in (d). [2 points]
$t = \sum_{i=1}^{N} t_i$, $t_i = \sum_{j \in U_i} y_{ij}$
(ii) Give a formula for the unbiased estimator of the total number of high school students in Rhode Island who smoke cigarettes. [4 points]
$\hat{t} = \frac{N}{n}\sum_{i=1}^{n}\hat{t}_i$, $\hat{t}_i = \sum_{j=1}^{m_i}\frac{M_i}{m_i}y_{ij}$
(iii) Estimate the total number of high school students in Rhode Island who smoke cigarettes using the unbiased estimator. [4 points]
$\hat{t} = \frac{N}{n}\sum_{i=1}^{n}\hat{t}_i = 60\left(\frac{100 + 200}{2}\right) = 9{,}000$
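The estimate in (iii) can be computed two equivalent ways: from the estimated school totals $\hat{t}_i$, or by applying the common weight of 300 to the sampled smokers. The sample counts of smokers (10 and 20) are not given in the table; they are implied by $\hat{t}_i = (M_i/m_i)\sum_j y_{ij}$ and are derived in the sketch below.

```python
N, n = 60, 2
M = [400, 1000]           # students per sampled school
m = [40, 100]             # students sampled per school (10% of M_i)
t_hat_i = [100, 200]      # estimated smokers per sampled school

# Unbiased estimator of the total: (N / n) * sum of estimated school totals.
t_hat_unb = N / n * sum(t_hat_i)

# Implied number of smokers observed in each school's sample.
sample_smokers = [t * mi // Mi for t, mi, Mi in zip(t_hat_i, m, M)]

# Same estimate as a weighted sum over sampled students.
weights = [(N / n) * (Mi / mi) for Mi, mi in zip(M, m)]
t_hat_weighted = sum(w * s for w, s in zip(weights, sample_smokers))

print(t_hat_unb, t_hat_weighted)
```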
(iv) Give a formula for the standard error of the unbiased estimator of the total number of high school students in Rhode Island who smoke cigarettes. [4 points]
$SE\{\hat{t}\} = \sqrt{N^2\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n} + \frac{N}{n}\sum_{i \in A} M_i^2\left(1 - \frac{m_i}{M_i}\right)\frac{s_i^2}{m_i}}$
(v) Estimate the variance of the unbiased estimator of the total number of high school students in Rhode Island who smoke cigarettes. (Hint: Use the property that a proportion is a special case of a mean to estimate the within-cluster component of the variance.) [4 points]
First, we estimate the between-PSU component:
$N^2\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n} = 60^2\left(1 - \frac{2}{60}\right)\frac{(100 - 200)^2/2}{2} = 8{,}700{,}000$
Because the estimates of the within-PSU components are based on SRS proportions,
$\frac{s_i^2}{m_i} = \frac{\hat{p}_i(1 - \hat{p}_i)}{m_i - 1} = \frac{M_i^{-1}\hat{t}_i\left(1 - M_i^{-1}\hat{t}_i\right)}{m_i - 1}$
The within-PSU component is
$\frac{N}{n}\sum_{i \in A} M_i^2\left(1 - \frac{m_i}{M_i}\right)\frac{s_i^2}{m_i} = \frac{60}{2}\left\{400^2(1 - 0.1)\frac{0.25(0.75)}{39} + 1000^2(1 - 0.1)\frac{0.2(0.8)}{99}\right\} = 64405.59$
We combine the between-PSU component and the within-PSU component to obtain
$SE\{\hat{t}\} = \sqrt{8{,}700{,}000 + 64405.59} = 2960.47$
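The two variance components and the resulting standard error can be reproduced with the two-stage formulas from item 16 of the formula sheet. A minimal sketch using the values from the table (variable names are illustrative):

```python
import math

N, n = 60, 2
M = [400, 1000]           # students per sampled school
m = [40, 100]             # students sampled per school
t_hat_i = [100, 200]      # estimated smokers per sampled school

t_hat_unb = N / n * sum(t_hat_i)

# Between-PSU component: N^2 (1 - n/N) s_t^2 / n.
s2_t = sum((t - t_hat_unb / N) ** 2 for t in t_hat_i) / (n - 1)
between = N ** 2 * (1 - n / N) * s2_t / n

# Within-PSU component, using s_i^2 / m_i = p_i (1 - p_i) / (m_i - 1).
within = 0.0
for Mi, mi, ti in zip(M, m, t_hat_i):
    p_i = ti / Mi                         # estimated proportion of smokers in school i
    s2_over_m = p_i * (1 - p_i) / (mi - 1)
    within += Mi ** 2 * (1 - mi / Mi) * s2_over_m
within *= N / n

se_t_hat = math.sqrt(between + within)
print(round(between), round(within, 2), round(se_t_hat, 2))
```

The between-PSU term dominates here, which is typical when only two clusters are sampled and their estimated totals differ substantially.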
g. Use the unbiased estimator to estimate the proportion of high school students in Rhode Island who smoke cigarettes. (Hint: Use your answer to (f.) and recall that a proportion is a special case of a mean.) [3 points]
$\hat{p} = \frac{9{,}000}{45{,}000} = 0.2$
h. For this example, would you recommend using the unbiased estimator or the ratio estimator? Explain. [4 points]
The ratio estimator, because the cluster population sizes vary.
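To make the comparison in (h) concrete, the sketch below computes both estimators of the proportion from the same data. The ratio estimator uses $\hat{\bar{y}}_r = \sum \hat{t}_i / \sum M_i$ from item 16 of the formula sheet. The quantity `K_hat`, the number of students implied by the sampled schools, is an illustrative diagnostic that does not appear in the exam.

```python
N, n, K = 60, 2, 45_000
M = [400, 1000]           # students per sampled school
t_hat_i = [100, 200]      # estimated smokers per sampled school

# Unbiased estimator of the proportion: estimated total divided by known K.
t_hat_unb = N / n * sum(t_hat_i)
p_hat_unb = t_hat_unb / K

# Ratio estimator of the proportion and the corresponding total.
p_hat_r = sum(t_hat_i) / sum(M)
t_hat_r = K * p_hat_r

# Population size implied by the two sampled schools (illustrative check).
K_hat = N / n * sum(M)

print(round(p_hat_unb, 4), round(p_hat_r, 4), round(t_hat_r), K_hat)
```

The sampled schools imply 42,000 students rather than the known 45,000, so the unbiased estimator inherits that shortfall while the ratio estimator adjusts for it.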
4. TRUE/FALSE: 2 points each.
[T F] Stratified sampling typically leads to estimators of overall population means that are less efficient (higher MSE) than estimators from an SRSWOR of the same number of elements, especially if the strata define groups that are more homogeneous than the overall population.
[T F] Cluster sampling is usually used because of practical reasons, such as the frame structure or to reduce data collection costs, rather than to improve the efficiency. Cluster samples usually lead to less precise estimators (higher MSE) than estimators from an SRSWOR of the same number of elements.
[T F] In probability proportional to size sampling with replacement (PPSWR), we select a with-replacement sample where the probability of selecting an element on a single draw is proportional to the size measure for the element. If the size measure is correlated with the variable of interest, then we expect estimators of overall population means and totals from the PPSWR sample to be more efficient (lower MSE) than estimators from an SRSWOR of the same number of elements.
[T F] An SRSWOR of size $n = 50$ is selected from a population of size $N = 500$. The population mean of an auxiliary variable is known to be $\bar{x}_U = 50$. The sample mean of the same auxiliary variable is $\bar{x} = 60$. The ratio estimator of the population mean of the variable of interest will be larger than the sample mean.
[T F] Systematic sampling is a special case of one-stage cluster sampling, where one cluster is selected.
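The ratio-estimator statement above can be explored numerically with formulas 9 and 10 from the formula sheet: $\hat{\bar{y}}_r = \hat{B}\bar{x}_U = \bar{y}\,(\bar{x}_U/\bar{x})$. The sketch below uses the auxiliary means from the statement; the sample mean `ybar` is a hypothetical positive value, since the statement does not give one.

```python
xbar_U = 50      # known population mean of the auxiliary variable
xbar = 60        # sample mean of the auxiliary variable
ybar = 12.0      # hypothetical sample mean of the variable of interest (assumed positive)

# Ratio estimator of the population mean.
B_hat = ybar / xbar
ybar_r = B_hat * xbar_U

# Adjustment factor applied to the sample mean.
adjustment = xbar_U / xbar

print(ybar_r < ybar, adjustment < 1)
```

Because the sample over-represents the auxiliary variable ($\bar{x} > \bar{x}_U$), the adjustment factor is below 1 and the ratio estimator pulls any positive sample mean downward.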