HWK2_324_Soln.pdf
School: University of Wisconsin, Madison
Course: 324
Subject: Statistics
Date: Feb 20, 2024
Type: pdf
Pages: 9

Statistics 324 Homework #2 SOLUTIONS

*Submit your homework to Canvas by the due date and time. Email your lecturer if you have extenuating circumstances and need to request an extension.
*If an exercise asks you to use R, include a copy of the code and output. Please edit your code and output to be only the relevant portions.
*If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or use manual calculations on your exams, so practice accordingly.
*You must include an explanation and/or intermediate calculations for an exercise to be complete.
*Be sure to submit the HWK2 Autograde Quiz which will give you ~20 of your 40 accuracy points.
*50 points total: 40 points accuracy, and 10 points completion

Summarizing Data Numerically and Graphically (I)

Exercise 1: A company that manufactures toilets claims that its new pressure-assisted toilet reduces the average amount of water used by more than 0.5 gallons per flush when compared to its current model. The company selects 20 toilets of the current type and 19 of the new type and measures the amount of water used when each toilet is flushed once. The number of gallons measured for each flush is recorded below. The measurements are also given in flush.csv.

Current Model: 1.63, 1.25, 1.23, 1.49, 2.11, 1.48, 1.94, 1.72, 1.85, 1.54, 1.67, 1.76, 1.46, 1.32, 1.23, 1.67, 1.74, 1.63, 1.25, 1.56
New Model: 1.28, 1.19, 0.90, 1.24, 1.00, 0.80, 0.71, 1.03, 1.27, 1.14, 1.36, 0.91, 1.09, 1.36, 0.91, 0.91, 0.86, 0.93, 1.36

a. Use R to create histograms to display the sample data from each model (any kind of histogram that you want since sample sizes are very similar). Have identical x and y axis scales so the two groups' values are more easily compared. Include useful titles.
Curr <- c(1.63, 1.25, 1.23, 1.49, 2.11, 1.48, 1.94, 1.72, 1.85, 1.54, 1.67, 1.76, 1.46, 1.32, 1.23, 1.67, 1.74, 1.63, 1.25, 1.56)
New <- c(1.28, 1.19, 0.90, 1.24, 1.00, 0.80, 0.71, 1.03, 1.27, 1.14, 1.36, 0.91, 1.09, 1.36, 0.91, 0.91, 0.86, 0.93, 1.36)
length(Curr); length(New)
## [1] 20
## [1] 19
#This is where I put the data into long format to then create the csv file
#gallons=c(Curr, New)
#Model=c(rep("Current",20), rep("New",19))
#flush_data<-data.frame(Model, gallons)
#View(flush_data)
#write.csv(flush_data, "flush.csv", row.names=FALSE)
#or to get the data from the csv:
flush_df = read.csv("flush.csv", header=TRUE)
Curr_df = subset(flush_df, Model == "Current")
Curr_gall = Curr_df$gallons
New_df = subset(flush_df, Model == "New")
New_gall = New_df$gallons
mean(flush_df$gallons); sd(flush_df$gallons)
## [1] 1.327692
## [1] 0.3422561
par(mfrow=c(1, 2), mar=c(5.1, 4.1, 4.1, 2.1))
hist(Curr, breaks=seq(0.5, 2.5, 0.2), main="Current Model", xlab="gallons", ylim=c(0, 8))
hist(New, breaks=seq(0.5, 2.5, 0.2), main="New Model", xlab="gallons", ylim=c(0, 8))

[Figure: side-by-side histograms titled "Current Model" and "New Model"; x-axis gallons (0.5 to 2.5), y-axis Frequency (0 to 8)]

par(mfrow=c(1, 1))

b. Compare the shape of the gallons per flush from the two types of toilets observed in this experiment.

Both of the histograms are roughly symmetric. The Current Model has one primary peak around 1.6 gallons and the New Model has a primary peak around 1 gallon.

c. Compute the mean and median gallons flushed for the Current and New Model toilets using the built-in R functions. Compare the measures of center across the two groups and comment on how that relationship is evident in the histograms.
mean(Curr); median(Curr)
## [1] 1.5765
## [1] 1.595
mean(Curr_gall); median(Curr_gall) #making sure csv data gives same summaries
## [1] 1.5765
## [1] 1.595
mean(New); median(New)
## [1] 1.065789
## [1] 1.03
mean(New_gall); median(New_gall) #making sure csv data gives same summaries
## [1] 1.065789
## [1] 1.03

Current: mean: 1.5765, median: 1.595. New: mean: 1.065789, median: 1.03.
For both models, the mean and median are close to one another; this is consistent with the roughly symmetric shapes of the data. The centers of the current model data are higher than those of the new model data (by roughly 0.5 gallons), which can be seen in the shift to the right of the Current Model histogram.

d. Compute (using built-in R function) and compare the sample standard deviation of gallons flushed by the current and new model toilets. Comment on how the relative size of these values can be identified from the histograms.

sd(Curr)
## [1] 0.2456843
sd(Curr_gall)
## [1] 0.2456843
sd(New)
## [1] 0.2058941
sd(New_gall)
## [1] 0.2058941

Current SD: 0.2457
New SD: 0.2059
The standard deviation of the new model gallons flushed is smaller than that of the current model. This relationship could be predicted from the histograms since the histogram of the new model data is more tightly clustered around the center. Based on these samples, the values of gallons flushed look to be more predictable/consistent with the new model: there is less variability in the outcome.

e. Use R to create side-by-side boxplots of the two sets in R so they are easily comparable.

boxplot(Curr, New, names=c("Current", "New"), ylab="water flushed (gallons)", main="Toilet Water Flushed by Model", ylim=c(0.5, 2.3))
text(y=fivenum(Curr), labels=fivenum(Curr), x=0.6)
text(y=fivenum(New), labels=fivenum(New), x=2.4)
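
As a possible alternative for part e, the same side-by-side boxplots can be drawn from long-format data with the formula interface of `boxplot()`. This is only a sketch: the data frame name `flush_long` is made up for illustration, and the data are rebuilt in memory so the example does not depend on flush.csv being present.

```r
# Rebuild the long-format data in memory (same columns as flush.csv).
# flush_long is a hypothetical name used only for this sketch.
Curr <- c(1.63, 1.25, 1.23, 1.49, 2.11, 1.48, 1.94, 1.72, 1.85, 1.54,
          1.67, 1.76, 1.46, 1.32, 1.23, 1.67, 1.74, 1.63, 1.25, 1.56)
New <- c(1.28, 1.19, 0.90, 1.24, 1.00, 0.80, 0.71, 1.03, 1.27, 1.14,
         1.36, 0.91, 1.09, 1.36, 0.91, 0.91, 0.86, 0.93, 1.36)
flush_long <- data.frame(
  Model   = rep(c("Current", "New"), times = c(length(Curr), length(New))),
  gallons = c(Curr, New)
)

# Formula interface: one box per level of Model on a shared y-axis.
bp <- boxplot(gallons ~ Model, data = flush_long,
              ylab = "water flushed (gallons)",
              main = "Toilet Water Flushed by Model",
              ylim = c(0.5, 2.3))

# bp$stats stores the five plotted values per group, one column per model.
bp$stats
```

Because both groups are drawn in a single call, they automatically share one axis, which is the property the exercise asks for.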

[Figure: side-by-side boxplots, "Toilet Water Flushed by Model"; y-axis water flushed (gallons), 0.5 to 2.0; labeled five-number summaries: Current 1.23, 1.39, 1.595, 1.73, 2.11; New 0.71, 0.91, 1.03, 1.255, 1.36]

par(mfrow=c(1, 2), mar=c(4.5, 4.5, 2.5, 1.5))
boxplot(Curr, ylim=c(0.5, 2.5), xlab="Current", ylab="gallons")
text(y=fivenum(Curr), labels=fivenum(Curr), x=0.7)
boxplot(New, ylim=c(0.5, 2.5), xlab="New", ylab="gallons")
text(y=fivenum(New), labels=fivenum(New), x=0.7)

[Figure: separate boxplots for Current and New; y-axis gallons (0.5 to 2.5); labeled five-number summaries: Current 1.23, 1.39, 1.595, 1.73, 2.11; New 0.71, 0.91, 1.03, 1.255, 1.36]

par(mfrow=c(1, 1))

f. Explain why there are no values shown as a dot on the Current Model flush boxplot.

To
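
One way to check part f numerically is sketched below. It assumes the default rule used by `boxplot()` (`range = 1.5`), under which a point is drawn as a dot only if it lies more than 1.5 times the hinge spread beyond a hinge. The names `hinges`, `spread`, `lower_fence` and `upper_fence` are illustrative and do not come from the original solution.

```r
Curr <- c(1.63, 1.25, 1.23, 1.49, 2.11, 1.48, 1.94, 1.72, 1.85, 1.54,
          1.67, 1.76, 1.46, 1.32, 1.23, 1.67, 1.74, 1.63, 1.25, 1.56)

# fivenum() returns min, lower hinge, median, upper hinge, max.
hinges <- fivenum(Curr)
spread <- hinges[4] - hinges[2]            # distance between the hinges

# Fences under the default 1.5 * spread rule (an assumption of this sketch).
lower_fence <- hinges[2] - 1.5 * spread    # roughly 0.88
upper_fence <- hinges[4] + 1.5 * spread    # roughly 2.24

c(lower_fence, upper_fence)
range(Curr)                                # smallest and largest observations

# boxplot.stats() applies the same rule; $out lists the points drawn as dots.
boxplot.stats(Curr)$out
```

If every observation falls between the two fences, `boxplot.stats(Curr)$out` is empty and the whiskers simply extend to the minimum and maximum.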