HWK_9_Soln.pdf

School: University of Wisconsin, Madison
Course: 371
Subject: Statistics
Date: May 16, 2024
Type: pdf
Pages: 5

Uploaded by GrandEchidnaMaster871

Stat 371 Homework 9
Due April 20 11:59pm

YOUR NAME GOES HERE

*Submit your homework to Canvas by the due date and time. Email your instructor if you have extenuating circumstances and need to request an extension.
*If an exercise asks you to use R, include a copy of the code and output. Please edit your code and output to be only the relevant portions.
*If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or manual calculations on your exams, so practice accordingly.
*You must include an explanation and/or intermediate calculations for an exercise to be complete.
*Be sure to submit the HWK 9 Auto grade Quiz which will give you ~20 of your 40 accuracy points.
*50 points total: 40 points accuracy, and 10 points completion

Hypothesis Testing in paired or 2+ independent samples

Exercise 1

Scientists want to know if a specific bean plant variety shows evidence of having a higher mean carbohydrate concentration in its shoot than in its root. Six bean plants had their carbohydrate concentration (in percent by weight) measured both in the shoot and in the root. The following results were obtained. A graph of the data is provided.

Plant   Shoot   Root
1       4.42    3.76
2       5.81    5.40
3       4.65    3.91
4       4.77    4.29
5       5.25    4.69
6       4.75    3.93

a. Run the following code to construct a graphical summary of this data. Explain why this graphical summary is more appropriate than two individual histograms of the shoot and root samples.

Because the data is paired, it is useful to see each pair of measurements linked in the graphical summary. We can then visually evaluate the within-plant differences, which two separate histograms would hide.

    # Paired measurements: position i in each vector is plant i
    Shoot = c(4.42, 5.81, 4.65, 4.77, 5.25, 4.75)
    Root = c(3.76, 5.40, 3.91, 4.29, 4.69, 3.93)
    Plant = 1:6
    # Stack into long format: one row per (plant, location) measurement
    AllMeasures <- c(Shoot, Root)
    Location <- as.factor(c(rep("Shoot", 6), rep("Root", 6)))
    Plant <- as.factor(rep(Plant, 2))
    Plant_data <- data.frame(AllMeasures, Location, Plant)
    require(ggplot2)
    ## Loading required package: ggplot2
    # One point per measurement; lines join the two measurements of each plant
    ggplot(data = Plant_data, aes(x = Location, y = AllMeasures, color = Plant, group = Plant)) +
      geom_point() +
      geom_line()

[Figure: AllMeasures (roughly 4.0 to 5.5) plotted against Location (Root, Shoot), with one colored line per Plant (1 to 6).]

b. Explain why we need to consider the sample of differences (Shoot-Root) instead of the two samples of data separately when making inference. Based on the sampling choice and question of interest, what inference strategies could we consider?

We have matched-pair sampling. Each plant is observed twice, so each Shoot value is matched with a Root value from the same plant, and the two samples are not independent of each other. If we consider the sample of differences instead, we have 6 independent observations. Since we have matched-pair data and a question about the mean difference, we could consider the matched-pair t test, matched-pair bootstrap, Wilcoxon Signed Rank test, or Sign Test.

c. Construct the relevant histogram and qqnorm plots to check the normality assumptions of the matched-pair t test. Explain whether or not the normality assumption of the matched-pair t test seems to be well met.

The histogram of the sample differences does not give strong evidence against the assumption of a normal population. Similarly, the qqnorm plot shows the sample differences in a roughly linear pattern, which is consistent with the sample being drawn from a normal distribution.

    par(mfrow = c(1, 2))       # two plots side by side
    diffs = Shoot - Root       # within-plant differences
    hist(diffs)
    mean(diffs)
    ## [1] 0.6116667
    qqnorm(diffs); qqline(diffs)
[Figure: two panels. Left: "Histogram of diffs" (x-axis: diffs, y-axis: Frequency). Right: "Normal Q-Q Plot" (x-axis: Theoretical Quantiles, y-axis: Sample Quantiles).]

    par(mfrow = c(1, 1))       # restore the single-plot layout

d. Does this data give us strong evidence that the mean carbohydrate concentration in the shoot is higher than that in the root? Conduct a matched-pair t test of the hypotheses: H_0: μ_S − μ_R ≤ 0 (or H_0: μ_{S−R} ≤ 0) vs H_A: μ_S − μ_R > 0 (or H_A: μ_{S−R} > 0). Compute the test statistic, degrees of freedom, and p value “by hand”. Check your answers using t.test(). Draw a conclusion in the context of the question at a 1% significance level.

Test Statistic   Degrees of Freedom   p-value
9.557162         5                    0.0001061668

Since the p-value is below 0.01, we have strong enough evidence at the 1% level to reject the null. Evidence suggests that the mean carbohydrate concentration is higher in the shoot than in the root of bean plants.

    # Standard error of the mean difference
    (sd(diffs)/sqrt(length(diffs)))
    ## [1] 0.06400087
    # Test statistic: (mean difference - null value) / standard error
    (t_obs = (mean(diffs) - 0)/(sd(diffs)/sqrt(length(diffs)))) #9.557162
    ## [1] 9.557162
    # One-sided (upper-tail) p-value with n - 1 = 5 degrees of freedom
    (1 - pt(t_obs, df = length(diffs) - 1)) #0.0001061668
    ## [1] 0.0001061668
    t.test(Shoot, Root, mu = 0, paired = TRUE, alternative = "greater")
    ##
    ##  Paired t-test
    ##
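As a cross-check on the part d conclusion, the rank-based alternatives named in part b could be run on the same differences. The sketch below is illustrative and not part of the original solution. It assumes the Sign Test is carried out as an exact binomial test on the count of positive differences, and the names wilcoxon, sign_test, n_pos, and n_nonzero are introduced here only for illustration.

```r
# Paired measurements copied from the exercise's data table
Shoot <- c(4.42, 5.81, 4.65, 4.77, 5.25, 4.75)
Root  <- c(3.76, 5.40, 3.91, 4.29, 4.69, 3.93)
diffs <- Shoot - Root

# Wilcoxon signed rank test on the differences (one-sided, upper tail)
wilcoxon <- wilcox.test(diffs, mu = 0, alternative = "greater")

# Sign test, written as an exact binomial test (an assumption of this sketch):
# under the null, each nonzero difference is positive with probability 0.5
n_pos     <- sum(diffs > 0)
n_nonzero <- sum(diffs != 0)
sign_test <- binom.test(n_pos, n_nonzero, p = 0.5, alternative = "greater")

wilcoxon$p.value
sign_test$p.value
```

All six differences are positive, so both tests return a one-sided p-value of 1/64 = 0.015625. That is the smallest p-value either test can produce with six pairs, so neither one can reject at the 1% level for a sample of this size, whereas the matched-pair t test can when its normality assumption is reasonable.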