School: University of Guelph
Course: 2040
Subject: Statistics
Date: Apr 3, 2024

Introductory Statistics Explained (1.11 Draft)
Exercises: Descriptive Statistics
© 2022, 2023, 2024 Jeremy Balka

J.B.'s strongly suggested exercises: 1, 4, 7, 8, 9, 10, 11, 16, 17, 20, 21, 22, 24, 25, 36, 46, 49

NB: The titles and numbers of sections may not (yet) sync up with what is in the text.

1 Plots for Qualitative and Quantitative Variables

1.1 Plots for Qualitative Variables

1. A Finnish study¹ investigated a possible association between the gender of convicted murderers and their relationship to the victim. A random selection of 91 female murderers and a random selection of 91 male murderers were obtained, and the results are illustrated in Figure 1.

                       Gender
                   Male   Female   Total
   Acquaintance      61       37      98
   Partner           22       32      54
   Family Member      4       18      22
   Stranger           4        4       8
   Total             91       91     182

   [Bar charts of relative frequency for each relationship category (Acquaintance, Partner, Family, Stranger); panels: Male Offenders, Female Offenders.]
   Figure 1: Gender of murderer and their relationship to the victim in Finnish murders.

   (a) For male murderers, summarize the distribution of the relationship between the murderer and the victim.
   (b) Describe the observed differences in the distributions of the relationship to the victim between male and female murderers.
   (c) Can we be certain that the observed differences in the samples of male and female murderers reflect the true differences in the population distributions?
   (d) Sketch a pie chart to illustrate the distribution of the relationship to the victim for male murderers. (Sketch a rough plot; it does not need to be very accurate.)

¹ Häkkänen et al. (2009). Gender differences in Finnish homicide offence characteristics. Forensic Science International, 186:75–80.
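The relative frequencies plotted in Figure 1 can be recomputed from the counts in the table. The following is a minimal sketch rather than part of the original exercise; the dictionary layout and the function name `relative_frequencies` are illustrative assumptions.

```python
# Counts from the table in Exercise 1 (rows: relationship to the victim).
counts = {
    "Acquaintance": {"Male": 61, "Female": 37},
    "Partner": {"Male": 22, "Female": 32},
    "Family Member": {"Male": 4, "Female": 18},
    "Stranger": {"Male": 4, "Female": 4},
}


def relative_frequencies(table, group):
    """Divide each count for `group` by that group's total (hypothetical helper)."""
    total = sum(row[group] for row in table.values())
    return {label: row[group] / total for label, row in table.items()}


male = relative_frequencies(counts, "Male")
female = relative_frequencies(counts, "Female")

# Print the two distributions side by side for comparison.
for label in counts:
    print(f"{label:<14} male {male[label]:.3f}   female {female[label]:.3f}")
```

Because both samples have 91 offenders, the counts could be compared directly here; dividing by the group total is what keeps the comparison fair when group sizes differ.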
1.2 Plots for Quantitative Variables

2 Numerical Measures

2.1 Summation Notation

2. Suppose we have the following sample data set: 8, 14, 22, -5
   (a) What is the value of $x_3$?
   (b) What is $\sum_{i=1}^{3} x_i$?
   (c) What is $\sum x_i$?
   (d) What is $\sum x_i^2$?

2.2 Measures of Central Tendency

2.2.1 Mean, Median, and Mode

3. Suppose we have a sample of 5 observations: 1, 5, 2, -3, 987.
   (a) What is the mean?
   (b) What is the median?
   (c) What is the mode?
   (d) If the extreme value is removed, what is the mean?
   (e) If the extreme value is removed, what is the median?

4. A random sample of 4 Canadian male newborns revealed the following birth weights, in grams: 2870, 2620, 3120, 3620
   (a) What is the mean birth weight?
   (b) What is the median birth weight?
   (c) What are the units of the mean?
   (d) What are the units of the median?
   (e) In this situation, which is the more appropriate measure of central tendency, the mean or the median?

2.2.2 Other Measures of Central Tendency

5. What would be an advantage of using the trimmed mean instead of the untrimmed mean? What would be a disadvantage?

2.3 Measures of Variability

6. Consider the following sample of 4 observations: 18, 8, 3, 17.
   (a) What are the deviations?
   (b) What is the sum of the deviations?
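The quantities in Exercises 2, 3, and 6 can be checked numerically. The sketch below is one way to do so, assuming Python's standard `statistics` module; the variable names are illustrative and not from the text.

```python
from statistics import mean, median

# Exercise 2: summation notation on the sample x_1, ..., x_4.
x = [8, 14, 22, -5]
sum_first_three = sum(x[0:3])          # sum of x_i for i = 1..3
sum_all = sum(x)                       # sum of all x_i
sum_squares = sum(v ** 2 for v in x)   # sum of all x_i squared
print(sum_first_three, sum_all, sum_squares)

# Exercise 3: effect of one extreme value on the mean and the median.
sample = [1, 5, 2, -3, 987]
without_extreme = [v for v in sample if v != 987]
print(mean(sample), median(sample))
print(mean(without_extreme), median(without_extreme))

# Exercise 6: deviations from the mean always sum to zero.
obs = [18, 8, 3, 17]
xbar = mean(obs)
deviations = [v - xbar for v in obs]
print(deviations, sum(deviations))
```

Removing the extreme value moves the mean a great deal and the median very little, which is the usual argument for preferring the median when a sample contains outliers.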
7. A random sample of 4 Canadian male newborns revealed the following birth weights, in grams: 2870, 2620, 3120, 3620
   (a) What is the range?
   (b) What is the mean absolute deviation?
   (c) What is the variance?
   (d) What is the standard deviation?
   (e) What are the units of the variance?
   (f) What are the units of the standard deviation?

8. Create a 4 number sample data set, where all numbers lie between 0 and 500 (inclusive, and repeats are allowed), such that the standard deviation is as large as possible.
   (a) What is your data set?
   (b) What is the standard deviation of your data set?

9. Create a 4 number sample data set, where all numbers lie between 0 and 500 (inclusive, and repeats are allowed), such that the standard deviation is as small as possible.
   (a) What is your data set?
   (b) What is the standard deviation of your data set?

10. Which of the following statements are true? There may be more than one correct statement; check all that are true.
   (a) The standard deviation can be greater than the variance.
   (b) The standard deviation can be negative.
   (c) The standard deviation can be less than the mean.
   (d) The standard deviation can be less than the third quartile.
   (e) The standard deviation is always less than the average distance from the mean.

2.3.1 Interpreting the standard deviation

11. Figure 2 illustrates scores on a difficult statistics test. The scores have a mean of 22.4 and a standard deviation of 7.2. (The maximum possible score on the test was 40.)
   (a) Would the empirical rule apply to this data? Why or why not?
   (b) What would the empirical rule tell us about the proportion of observations that are within 7.2 of 22.4?
   (c) What would the empirical rule tell us about the proportion of observations that are within 14.4 of 22.4?
   (d) What would the empirical rule tell us about the proportion of observations that are within 21.6 of 22.4?

12. Figure 2 illustrates scores on a difficult statistics test. The scores have a mean of 22.4 and a standard deviation of 7.2.
   (a) Would Chebyshev's theorem apply to this data? Why or why not?
   (b) What would Chebyshev's theorem tell us about the proportion of observations that are within 7.2 of 22.4?
   (c) What would Chebyshev's theorem tell us about the proportion of observations that are within 14.4 of 22.4?
   (d) What would Chebyshev's theorem tell us about the proportion of observations that are within 21.6 of 22.4?

   [Histogram: Frequency versus Score on Test.]
   Figure 2: Scores on a hard test ($\bar{x} = 22.4$, $s = 7.2$).

13. Consider the histogram given in Figure 3.
   (a) Would the empirical rule apply to this data?
   (b) Would Chebyshev's theorem apply to this data?

   Figure 3: A distribution that is skewed to the right.

2.3.2 Why divide by $n - 1$ in the sample variance formula?

14. Why do we divide by $n - 1$ in the sample variance formula?

15. Suppose we have a sample of size $n = 87$, and the population mean and variance are unknown. How many degrees of freedom are there for estimating the variance?
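Two ideas from this section lend themselves to a short numerical check: the sample variance with its $n - 1$ divisor, and the lower bound $1 - 1/k^2$ that Chebyshev's theorem gives for the proportion of observations within $k$ standard deviations of the mean. The sketch below is illustrative only; the function names `sample_variance` and `chebyshev_lower_bound` are assumptions, not names used in the text.

```python
from math import sqrt


def sample_variance(xs):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def chebyshev_lower_bound(k):
    """Chebyshev's lower bound on the proportion within k standard deviations.

    The bound 1 - 1/k^2 is only informative for k > 1; for k <= 1 it is
    zero or negative, so it is clipped to 0 here.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1 - 1 / k ** 2)


# Birth weights from Exercise 7, in grams.
weights = [2870, 2620, 3120, 3620]
s2 = sample_variance(weights)   # units: grams squared
s = sqrt(s2)                    # units: grams
print(f"variance = {s2:.2f}, standard deviation = {s:.4f}")

# Bounds for distances of 1, 2, and 3 standard deviations from the mean.
for k in (1, 2, 3):
    print(k, round(chebyshev_lower_bound(k), 3))
```

The standard deviation computed this way agrees with the value $s = 426.9563$ quoted in Exercise 16. Chebyshev's bounds hold for any distribution, which is why they are weaker than the 68%, 95%, and 99.7% figures the empirical rule gives for mound-shaped data.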
2.4 Measures of Relative Standing

2.4.1 Z-scores

16. A random sample of 4 Canadian male newborns revealed the following birth weights, in grams: 2870, 2620, 3120, 3620
   In Exercise 7, we found that for these four observations: $\bar{x} = 3057.5$ and $s = 426.9563$.
   (a) What are the 4 z-scores?
   (b) What is the mean of the 4 z-scores?
   (c) What is the standard deviation of the 4 z-scores?
   (d) If a newborn male baby had a z-score of 4.6, what does that tell us about the baby's weight?
   (e) If a newborn male baby had a z-score of -0.4, what does that tell us about the baby's weight?

17. Todd has always had a dream of becoming a medical doctor. After doing well in an introductory statistics course, Todd decides to write the MCAT. His score on the test corresponded to a z-score of 3.0. Suppose that scores on the test have a distribution that is mound-shaped (approximately normal). Which of the following statements are true?
   (a) The z-score is a unitless quantity.
   (b) Todd's score was 3 standard deviations greater than the mean score.
   (c) Todd scored worse than approximately 1/3 of the test writers.
   (d) Todd's score was better than average.

2.4.2 Percentiles

18. A sample of 8 boxes of Kellogg's All Bran was collected at a grocery store. The boxes had a nominal weight of 675 grams. The weight (in grams) of the cereal in each box was recorded, with the following results: 684, 684, 686, 691, 691, 686, 691, 684
   (The weights were recorded after discarding the box and the bag, so they represent the weight of only the cereal. Real data, collected by JB.)
   (a) Use the method outlined in the text to calculate the 80th percentile of the weights.
   (b) Use the method outlined in the text to calculate the 25th percentile of the weights.

19. The 90th percentile of heights of adult females in the United States is closest to which one of the following?
   (a) 90 cm.
   (b) 122 cm.
   (c) 143 cm.
   (d) 171 cm.
   (e) 200 cm.
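The z-score calculation in Exercise 16 can be sketched as follows, using $z = (x - \bar{x})/s$ with the sample mean and sample standard deviation. This is a minimal illustration that assumes Python's standard `statistics` module (whose `stdev` uses the $n - 1$ divisor); the variable names are illustrative.

```python
from statistics import mean, stdev

# Birth weights from Exercise 16, in grams.
weights = [2870, 2620, 3120, 3620]
xbar = mean(weights)
s = stdev(weights)  # sample standard deviation (n - 1 divisor)

# Standardize: subtract the mean, divide by the standard deviation.
z_scores = [(x - xbar) / s for x in weights]
print([round(z, 3) for z in z_scores])

# The z-scores of any sample have mean 0 and standard deviation 1,
# up to floating-point rounding.
print(mean(z_scores), stdev(z_scores))
```

Since the grams cancel in the division, the z-scores carry no units, which is the point behind Exercise 17(a).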
3 Boxplots

   [Boxplots of three samples labelled A, B, and C.]
   Figure 4: 3 boxplots.

20. Consider the boxplots in Figure 4, representing 3 different samples.
   (a) What is the median of sample C?
   (b) What is the range (Maximum - Minimum) of sample C?
   (c) How many outliers are there in the entire plot (all samples combined)?
   (d) What is the 25th percentile of sample C?

21. Qu et al. (2011) investigated physical characteristics of the lizard Phrynocephalus frontalis. In one part of the study, the researchers compared the tail lengths of males and females of this species. Figure 5 illustrates the distributions of tail length for 44 female and 22 male lizards that were captured in the wild.

   [Boxplots of Tail Length (mm) for two groups: Females, Males.]
   Figure 5: Boxplots of tail lengths of 44 female and 22 male lizards of the species P. frontalis.

   (a) What is the 75th percentile of tail lengths for the sample of male lizards?
   (b) What is the 75th percentile of tail lengths for the sample of female lizards?
   (c) Summarize the main differences and similarities between males and females for these samples of tail lengths.