STAT200 2023 GE2

School

University of Delaware *

Course

200

Subject

Statistics

Date

Jan 9, 2024

Pages

7

Uploaded by SuperHumanMouse356

STAT 200 Guided Exercise 2

Be sure to:
• Submit your answers in a Word or PDF file to Canvas at the place you downloaded the file. You can paste Excel/JMP output into a Word file.
• Submit only one file for the assignment.
• It is OK to do problems by hand. However, you will need to scan or take a picture of your work.
• Guided Exercises are not graded, but we check the work.

Key Topics
• Measures of Central Tendency
• Stem & Leaf Plot and describing distributions
• Using Excel to graph data

1. Let's finish up the Academy Award winners for best actor and actress since 1996, the data set given in Assignment 1, now that we have command of both central tendency and variability. Each year the Academy of Motion Picture Arts and Sciences gives an award for the best actor and actress in a motion picture. We have recorded the name and age of each winner since 1996. The data for males and females are given below (the sample size is n = 27 for each group). The sum of the ages and the sum of the squared ages are also given.

YEAR  ACTOR                   MALE AGE  ACTRESS            FEMALE AGE
1996  Geoffrey Rush           45        Frances McDormand  39
1997  Jack Nicholson          60        Helen Hunt         34
1998  Roberto Benigni         46        Gwyneth Paltrow    26
1999  Kevin Spacey            40        Hilary Swank       25
2000  Russell Crowe           36        Julia Roberts      33
2001  Denzel Washington       47        Halle Berry        35
2002  Adrien Brody            29        Nicole Kidman      35
2003  Sean Penn               43        Charlize Theron    28
2004  Jamie Foxx              37        Hilary Swank       30
2005  Philip Seymour Hoffman  38        Reese Witherspoon  29
2006  Forest Whitaker         45        Helen Mirren       61
2007  Daniel Day-Lewis        50        Marion Cotillard   32
2008  Sean Penn               48        Kate Winslet       33
2009  Jeff Bridges            60        Sandra Bullock     45
2010  Colin Firth             50        Natalie Portman    29
2011  Jean Dujardin           39        Meryl Streep       62
2012  Daniel Day-Lewis        55        Jennifer Lawrence  22
2013  Matthew McConaughey     44        Cate Blanchett     44
2014  Eddie Redmayne          32        Julianne Moore     54
2015  Leonardo DiCaprio       41        Brie Larson        26
2016  Casey Affleck           41        Emma Stone         28
2017  Gary Oldman             59        Frances McDormand  60
2018  Rami Malek              37        Olivia Colman      45
2019  Joaquin Phoenix         45        Renee Zellweger    50
2020  Anthony Hopkins         83        Frances McDormand  63
2021  Will Smith              53        Jessica Chastain   44
2022  Brendan Fraser          54        Michelle Yeoh      60

      Sum X                   1257      Sum X              1072
      Sum X-squared           61635     Sum X-squared      47012
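The sums listed under the table are enough to get the mean, variance, standard deviation, and coefficient of variation without retyping the 27 ages. The following is a minimal Python sketch, assuming the usual computational formula for the sample variance, s^2 = (Sum X-squared - (Sum X)^2 / n) / (n - 1). The function names summary_from_sums and z_score are illustrative only and are not part of the assignment or of Basic Stats.xlsx.

```python
import math


def summary_from_sums(n, sum_x, sum_x2):
    """Return mean, sample variance, SD and CV (%) from n, Sum X and Sum X-squared."""
    mean = sum_x / n
    # Computational (shortcut) formula for the sample variance, dividing by n - 1.
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    sd = math.sqrt(variance)
    cv = 100 * sd / mean  # coefficient of variation, as a percentage
    return {"mean": mean, "variance": variance, "sd": sd, "cv": cv}


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Sums copied from the table above (n = 27 winners in each group).
males = summary_from_sums(27, 1257, 61635)
females = summary_from_sums(27, 1072, 47012)

for label, stats in (("Males", males), ("Females", females)):
    print(label, {name: round(value, 3) for name, value in stats.items()})

# z-score for the oldest male winner in the table (age 83).
print("z for age 83:", round(z_score(83, males["mean"], males["sd"]), 2))
```

The same z_score helper can be applied to any other age in the table by passing that group's mean and standard deviation.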
Here is the Stem and Leaf plot for each group to compare the distributions. Each stem is split into two rows (leaves 0-4, then leaves 5-9).

Male Actor Age                    Female Actor Age
Stem | Leaf         Count         Stem | Leaf           Count
   2 |                               2 | 2              1
   2 | 9            1                2 | 5 6 6 8 8 9 9  7
   3 | 2            1                3 | 0 2 3 3 4      5
   3 | 6 7 7 8 9    5                3 | 5 5 9          3
   4 | 0 1 1 3 4    5                4 | 4 4            2
   4 | 5 5 5 6 7 8  6                4 | 5 5            2
   5 | 0 0 3 4      4                5 | 0 4            2
   5 | 5 9          2                5 |
   6 | 0 0          2                6 | 0 0 1 2 3      5
   6 |                               6 |
   7 |                               7 |
   7 |                               7 |
   8 | 3            1                8 |
4|5 is 45 years old               4|5 is 45 years old

a. Calculate the measures of central tendency and variability for each group.

                              Males     Females
Mean                          46.556    39.7037
Median                        45        35
Mode                          45        26
Range                         54        41
Variance                      119.795   171.1396
Standard Deviation            10.945    13.0820
Coefficient of Variation (%)  23.510    32.9492

b. Briefly compare the two distributions with an emphasis on the measures of Central Tendency and Variability.

The median age is 10 years higher for males than for females (45 vs. 35), and the mean is about 7 years higher (46.6 vs. 39.7). The male distribution is roughly symmetric apart from the single value of 83, and it has the larger range (54 vs. 41). The female ages are nevertheless more variable overall, with a larger standard deviation (13.08 vs. 10.95) and coefficient of variation (32.9 vs. 23.5).

c. For both men and women there are a few possible outliers. For men there is one individual with a value of 83. For women there are winners aged 60, 61, 62, and 63. Calculate z-scores for these values and interpret their meaning.

Men
x = 83: z-score = 3.33. This value is an outlier, as it is more than 3 standard deviations above the mean.

Women
x = 60: z-score = 1.551. This is not an outlier; it is less than 2 standard deviations above the mean.
x = 61: z-score = 1.628. This is not an outlier.
x = 62: z-score = 1.704. This is not an outlier.
x = 63: z-score = 1.781. This is not an outlier.

d. The value of 83 for males is a large outlier. Sometimes we decide to remove an outlier from an analysis. That should never be done lightly. However, large outliers can have a large impact on summary statistics based on the mean. Let's remove the value of 83 from the data and see what happens. We want to see whether the outlier has much influence on the mean, median, standard deviation, and CV.

How do you calculate things without the outlier? If you are using the data in Basic Stats.xlsx, just delete the outlier. The file will immediately recalculate for the other 26 values. If you are using the sums of X and X-squared, here are the values you need: Sum(X) = 1174; Sum(X-squared) = 54746; N = 26.

          Full Data  Outlier Removed
Mean      46.556     45.154
Median    45         45
Std Dev   10.945     8.332
CV        23.510     18.452

2. Below are the data for infant mortality for 37 OECD countries in 2020. The Organization for Economic Co-operation and Development (OECD) is an international economic organization of 37 countries, founded in 1961 to stimulate economic progress and world trade. It is a forum of countries describing themselves as committed to democracy and the market economy, providing a platform to compare policy experiences, seek answers to common problems, identify good practices, and coordinate domestic and international policies of its members (Wikipedia). OECD's web site provided some data on infant mortality for 37 countries. Infant mortality (the rate of death of children under 1 year of age per 1,000 live births) is a measure of development. The table below has the data for 37 countries already sorted from smallest to largest (https://data.oecd.org/healthstat/infant-mortality-rates.htm). The actual data, a Histogram (from JMP), and the Stem and Leaf Plot for these data are given below. Use the stem and leaf values for some calculations, such as the min and max. For other calculations, Sum(X) and Sum(X-squared) are given.
COUNTRY  TIME  IMR
AUS      2020  3.2
AUT      2020  3.1
BEL      2020  3.3
CAN      2020  4.5
CZE      2020  2.3
DNK      2020  2.4
FIN      2020  1.8
FRA      2020  3.6
DEU      2020  3.1
GRC      2020  3.2
HUN      2020  3.4
ISL      2020  2.9
IRL      2020  3
ITA      2020  2.4
JPN      2020  1.8
KOR      2020  2.5
LUX      2020  4.5
MEX      2020  12.3
NLD      2020  3.8
NOR      2020  1.6
POL      2020  3.6
PRT      2020  2.4

Stem | Leaf                       Count
   1 | 4 6 8 8                    4
   2 | 2 3 4 4 4 4 4 5 6 8 9      11
   3 | 0 1 1 2 2 3 4 5 6 6 6 8 8  13
   4 | 5 5                        2
   5 | 1 4 6                      3
   6 |                            0
   7 | 9                          1
   8 | 5                          1
   9 |                            0
  10 |                            0
  11 |                            0
  12 | 3                          1
  14 |                            0
  15 |                            0
  16 | 8                          1
A value of 8|5 is an Infant Mortality of 8.5

Sum(X)          148.700
Sum(X-squared)  925.570
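To read the minimum, maximum, and median off the stem and leaf plot, each leaf can be expanded back into a value (stem + leaf/10), and the result can be checked against the sums given. This is a rough Python sketch only: the dictionary is transcribed by hand from the plot above, and the names STEM_LEAF and expand are made up for illustration.

```python
import statistics

# Stem -> leaves, transcribed from the plot above (a value of 8|5 is 8.5).
# Stems with no leaves are simply left out.
STEM_LEAF = {
    1: "4688",
    2: "23444445689",
    3: "0112234566688",
    4: "55",
    5: "146",
    7: "9",
    8: "5",
    12: "3",
    16: "8",
}


def expand(stem_leaf):
    """Turn stem/leaf pairs back into a sorted list of values (leaf unit = 0.1)."""
    return sorted(
        round(stem + int(leaf) / 10, 1)
        for stem, leaves in stem_leaf.items()
        for leaf in leaves
    )


values = expand(STEM_LEAF)

print("n      =", len(values))
print("min    =", min(values))
print("max    =", max(values))
print("median =", statistics.median(values))

# Cross-check against the sums printed under the plot.
print("Sum(X)         =", round(sum(values), 3))
print("Sum(X-squared) =", round(sum(v * v for v in values), 3))
```

If the two sums printed at the end do not match the values under the plot, a leaf was probably mistyped during transcription.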