Introductory Statistics Explained (1.11 Draft)
Exercises
Descriptive Statistics
© 2022, 2023, 2024 Jeremy Balka
J.B.’s strongly suggested exercises: 1, 4, 7, 8, 9, 10, 11, 16, 17, 20, 21, 22, 24, 25, 36, 46, 49
NB The titles and numbers of sections may not (yet) sync up with what is in the text.
1 Plots for Qualitative and Quantitative Variables
1.1 Plots for Qualitative Variables
1. A Finnish study¹ investigated a possible association between the gender of convicted murderers and their relationship to the victim. A random selection of 91 female murderers and a random selection of 91 male murderers were obtained, and the results are illustrated in Figure 1.
                      Gender
Relationship       Male   Female   Total
Acquaintance         61       37      98
Partner              22       32      54
Family Member         4       18      22
Stranger              4        4       8
Total                91       91     182
[Bar charts, one panel each for Male Offenders and Female Offenders; vertical axis: Relative Frequency (0.0 to 0.6); categories: Acquaintance, Partner, Family, Stranger.]
Figure 1: Gender of murderer and their relationship to the victim in Finnish murders.
(a) For male murderers, summarize the distribution of the relationship between the murderer and the victim.
(b) Describe the observed differences in the distributions of the relationship to the victim between male and female murderers.
(c) Can we be certain that the observed differences in the samples of male and female murderers reflect the true differences in the population distributions?
(d) Sketch a pie chart to illustrate the distribution of the relationship to the victim for male murderers. (Sketch a rough plot – it does not need to be very accurate.)
¹ Häkkänen et al. (2009). Gender differences in Finnish homicide offence characteristics. Forensic Science International, 186:75–80.
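As a rough illustration of how the relative frequencies plotted in Figure 1 relate to the counts in the table for Exercise 1, the sketch below divides each count by the total for that gender. The dictionary layout and the variable names are illustrative choices, not part of the original exercise.

```python
# Counts from the table in Exercise 1 (relationship of murderer to victim).
counts = {
    "Male": {"Acquaintance": 61, "Partner": 22, "Family Member": 4, "Stranger": 4},
    "Female": {"Acquaintance": 37, "Partner": 32, "Family Member": 18, "Stranger": 4},
}

# Relative frequency = count / total number of murderers of that gender.
rel_freq = {}
for gender, row in counts.items():
    total = sum(row.values())
    rel_freq[gender] = {relationship: n / total for relationship, n in row.items()}

# Print one line per gender and relationship category.
for gender, row in rel_freq.items():
    for relationship, p in row.items():
        print(f"{gender:6s} {relationship:14s} {p:.3f}")
```

Within each gender the relative frequencies sum to 1, which is why the two panels can be compared directly.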
1.2 Plots for Quantitative Variables
2 Numerical Measures
2.1 Summation Notation
2. Suppose we have the following sample data set: 8, 14, 22, −5
(a) What is the value of $x_3$?
(b) What is $\sum_{i=1}^{3} x_i$?
(c) What is $\sum x_i$?
(d) What is $\sum x_i^2$?
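One way to check your reading of the summation notation in Exercise 2 is to translate each expression into code. This is only a sketch; the variable names are illustrative, and note that Python lists are 0-indexed, so $x_3$ corresponds to `x[2]`.

```python
# Sample from Exercise 2. Python counts from 0, so x_3 is x[2].
x = [8, 14, 22, -5]

x_3 = x[2]                              # the third observation
partial_sum = sum(x[:3])                # sum of x_i for i = 1, 2, 3
total_sum = sum(x)                      # sum of x_i over all observations
sum_of_squares = sum(v**2 for v in x)   # square each value first, then add

print(x_3, partial_sum, total_sum, sum_of_squares)
```

The last line of the calculation squares before summing, which is generally different from squaring the sum.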
2.2 Measures of Central Tendency
2.2.1 Mean, Median, and Mode
3. Suppose we have a sample of 5 observations: 1, 5, 2, −3, 987.
(a) What is the mean?
(b) What is the median?
(c) What is the mode?
(d) If the extreme value is removed, what is the mean?
(e) If the extreme value is removed, what is the median?
4. A random sample of 4 Canadian male newborns revealed the following birth weights, in grams:
2870, 2620, 3120, 3620
(a) What is the mean birth weight?
(b) What is the median birth weight?
(c) What are the units of the mean?
(d) What are the units of the median?
(e) In this situation, which is the more appropriate measure of central tendency, the mean or the
median?
2.2.2 Other Measures of Central Tendency
5. What would be an advantage of using the trimmed mean instead of the untrimmed mean? What
would be a disadvantage?
2.3 Measures of Variability
6. Consider the following sample of 4 observations: 18, 8, 3, 17.
(a) What are the deviations?
(b) What is the sum of the deviations?
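A short sketch of the calculation in Exercise 6, assuming "deviation" means each observation minus the sample mean. The variable names are illustrative.

```python
# Sample from Exercise 6.
x = [18, 8, 3, 17]

mean = sum(x) / len(x)

# Deviation of each observation from the sample mean.
deviations = [v - mean for v in x]

print(deviations)
print(sum(deviations))
```

Running the same code on other samples is a quick way to see whether the sum of the deviations depends on the data.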
7. A random sample of 4 Canadian male newborns revealed the following birth weights, in grams:
2870, 2620, 3120, 3620
(a) What is the range?
(b) What is the mean absolute deviation?
(c) What is the variance?
(d) What is the standard deviation?
(e) What are the units of the variance?
(f) What are the units of the standard deviation?
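The measures in Exercise 7 can be computed with Python's standard `statistics` module, as sketched below. Two assumptions are worth flagging: the mean absolute deviation here divides by $n$ (the text may define it differently), and `statistics.variance` and `statistics.stdev` use the $n - 1$ divisor of the sample variance.

```python
import statistics

weights = [2870, 2620, 3120, 3620]  # birth weights in grams

n = len(weights)
mean = statistics.mean(weights)

# Range: largest observation minus smallest.
data_range = max(weights) - min(weights)

# Mean absolute deviation, ASSUMING the divide-by-n definition.
mad = sum(abs(w - mean) for w in weights) / n

# Sample variance (n - 1 divisor) and sample standard deviation.
variance = statistics.variance(weights)
sd = statistics.stdev(weights)

print(data_range, mad, round(variance, 2), round(sd, 4))
```

The standard deviation is the square root of the variance, which is a useful hint for parts (e) and (f).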
8. Create a 4 number sample data set, where all numbers lie between 0 and 500 (inclusive, and repeats
are allowed), such that the standard deviation is as large as possible.
(a) What is your data set?
(b) What is the standard deviation of your data set?
9. Create a 4 number sample data set, where all numbers lie between 0 and 500 (inclusive, and repeats
are allowed), such that the standard deviation is as small as possible.
(a) What is your data set?
(b) What is the standard deviation of your data set?
10. Which of the following statements are true? There may be more than one correct statement; check
all that are true.
(a) The standard deviation can be greater than the variance.
(b) The standard deviation can be negative.
(c) The standard deviation can be less than the mean.
(d) The standard deviation can be less than the third quartile.
(e) The standard deviation is always less than the average distance from the mean.
2.3.1 Interpreting the standard deviation
11. Figure 2 illustrates scores on a difficult statistics test. The scores have a mean of 22.4 and a standard deviation of 7.2. (The maximum possible score on the test was 40.)
(a) Would the empirical rule apply to this data? Why or why not?
(b) What would the empirical rule tell us about the proportion of observations that are within 7.2
of 22.4?
(c) What would the empirical rule tell us about the proportion of observations that are within 14.4
of 22.4?
(d) What would the empirical rule tell us about the proportion of observations that are within 21.6
of 22.4?
12. Figure 2 illustrates scores on a difficult statistics test. The scores have a mean of 22.4 and a standard deviation of 7.2.
(a) Would Chebyshev’s theorem apply to this data? Why or why not?
[Histogram; horizontal axis: Score on Test (0 to 40); vertical axis: Frequency (0 to 60).]
Figure 2: Scores on a hard test ($\bar{x} = 22.4$, $s = 7.2$).
(b) What would Chebyshev’s theorem tell us about the proportion of observations that are within
7.2 of 22.4?
(c) What would Chebyshev’s theorem tell us about the proportion of observations that are within
14.4 of 22.4?
(d) What would Chebyshev’s theorem tell us about the proportion of observations that are within
21.6 of 22.4?
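Chebyshev's theorem gives a lower bound of $1 - 1/k^2$ on the proportion of observations within $k$ standard deviations of the mean. The sketch below applies that bound to the distances in Exercise 12; the function name `chebyshev_lower_bound` is a hypothetical helper introduced only for illustration.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the formula gives a value <= 0, so the bound says nothing useful.
    return max(0.0, 1.0 - 1.0 / k**2)


mean, sd = 22.4, 7.2  # summary statistics given in Exercise 12

for distance in (7.2, 14.4, 21.6):
    k = distance / sd  # how many standard deviations the distance represents
    bound = chebyshev_lower_bound(k)
    print(f"within {distance} of {mean}: at least {bound:.3f}")
```

These are guaranteed minimums for any distribution, so they are typically much weaker than what the empirical rule suggests for mound-shaped data.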
13. Consider the histogram given in Figure 3.
(a) Would the empirical rule apply to this data?
(b) Would Chebyshev’s theorem apply to this data?
Figure 3: A distribution that is skewed to the right.
2.3.2 Why divide by $n - 1$ in the sample variance formula?
14. Why do we divide by $n - 1$ in the sample variance formula?
15. Suppose we have a sample of size $n = 87$, and the population mean and variance are unknown. How many degrees of freedom are there for estimating the variance?
2.4 Measures of Relative Standing
2.4.1 Z-scores
16. A random sample of 4 Canadian male newborns revealed the following birth weights, in grams:
2870, 2620, 3120, 3620
In Exercise 7, we found that for these four observations: $\bar{x} = 3057.5$ and $s = 426.9563$.
(a) What are the 4 $z$-scores?
(b) What is the mean of the 4 $z$-scores?
(c) What is the standard deviation of the 4 $z$-scores?
(d) If a newborn male baby had a $z$-score of 4.6, what does that tell us about the baby’s weight?
(e) If a newborn male baby had a $z$-score of $-0.4$, what does that tell us about the baby’s weight?
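A sketch of the $z$-score calculation for Exercise 16, using $z = (x - \bar{x})/s$ with the sample mean and sample standard deviation computed by Python's `statistics` module. The variable names are illustrative.

```python
import statistics

weights = [2870, 2620, 3120, 3620]  # birth weights in grams

mean = statistics.mean(weights)
sd = statistics.stdev(weights)  # sample standard deviation (n - 1 divisor)

# z-score: how many standard deviations each observation lies from the mean.
z_scores = [(w - mean) / sd for w in weights]

# Summary statistics of the z-scores themselves.
z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)

print([round(z, 3) for z in z_scores])
print(round(z_mean, 10), round(z_sd, 10))
```

Because the grams in the numerator and denominator cancel, the $z$-scores carry no units.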
17. Todd has always had a dream of becoming a medical doctor. After doing well in an introductory statistics course, Todd decides to write the MCAT. His score on the test corresponded to a $z$-score of 3.0. Suppose that scores on the test have a distribution that is mound-shaped (approximately normal). Which of the following statements are true?
(a) The $z$-score is a unitless quantity.
(b) Todd’s score was 3 standard deviations greater than the mean score.
(c) Todd scored worse than approximately 1/3 of the test writers.
(d) Todd’s score was better than average.
2.4.2 Percentiles
18. A sample of 8 boxes of Kellogg’s All Bran was collected at a grocery store. The boxes had a nominal
weight of 675 grams. The weight (in grams) of the cereal in each box was recorded, with the following
results:
684, 684, 686, 691, 691, 686, 691, 684
(The weights were recorded after discarding the box and the bag, so they represent the weight of
only the cereal. Real data, collected by JB.)
(a) Use the method outlined in the text to calculate the 80th percentile of the weights.
(b) Use the method outlined in the text to calculate the 25th percentile of the weights.
19. The 90th percentile of heights of adult females in the United States is closest to which one of the
following?
(a) 90 cm.
(b) 122 cm.
(c) 143 cm.
(d) 171 cm.
(e) 200 cm.
[Boxplots of samples A, B, and C on a common axis running from 0 to 80.]
Figure 4: 3 boxplots.
3 Boxplots
20. Consider the boxplots in Figure 4, representing 3 different samples.
(a) What is the median of sample C?
(b) What is the range (Maximum − Minimum) of sample C?
(c) How many outliers are there in the entire plot (all samples combined)?
(d) What is the 25th percentile of sample C?
21. Qu et al. (2011) investigated physical characteristics of the lizard Phrynocephalus frontalis. In one part of the study, the researchers compared the tail lengths of males and females of this species. Figure 5 illustrates the distributions of tail length for 44 female and 22 male lizards that were captured in the wild.
[Boxplots for Females and Males; axis: Tail Length (mm), with ticks from 60 to 75.]
Figure 5: Boxplots of tail lengths of 44 female and 22 male lizards of the species P. frontalis.
(a) What is the 75th percentile of tail lengths for the sample of male lizards?
(b) What is the 75th percentile of tail lengths for the sample of female lizards?
(c) Summarize the main differences and similarities between males and females for these samples of tail lengths.