STAT3220FA23Unit4.docx
School: University of Virginia
Course: 3220
Subject: Statistics
Date: Apr 3, 2024
Type: docx
Pages: 31

STAT 3220 Unit 4.3-4.4

4.3 One Factor ANOVA

A sample of 44 healthy male college students participated in an experiment. Each student was asked to memorize a list of 40 words (20 on a green list and 20 on a red list). The students were then randomly assigned to one of 4 treatments:

Group A: received 2 alcoholic drinks
Group AC: received 2 alcoholic drinks with caffeine powder dissolved in their drinks
Group AR: received 2 alcoholic drinks and a monetary reward for correct responses on the task
Group P: were told they received 2 alcoholic drinks, but instead just received a carbonated drink

After consuming their drinks and resting for 25 minutes, the students performed the word completion task. The response variable "task score" represents the difference between the proportion of correct responses on the green list and the proportion of incorrect responses on the red list.

Step 1: Collect the Data and Determine Experimental Design

Consider the design of the data collection: for the One and Two Factor ANOVA we are covering the analysis of completely randomized designs. We want to ensure our data meet that criterion and define the other components of the experimental design.

    drinkersdata <- read.delim("https://raw.githubusercontent.com/kvaranyak4/STAT3220/main/DRINKERS.txt", header = TRUE)
    names(drinkersdata)

    [1] "GROUP" "SCORE"

    head(drinkersdata)

      GROUP SCORE
    1    AR  0.51
    2    AR  0.58
    3    AR  0.52
    4    AR  0.47
    5    AR  0.61
    6    AR  0.00

    table(drinkersdata$GROUP)

     A AC AR  P
    11 11 11 11
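As a quick check on the design, the group counts printed by table() above can be rebuilt and tested for balance. This is only a minimal sketch: the names group, counts, n_total, k, and is_balanced are illustrative and not part of the original notes, and the group labels are reconstructed from the printed counts rather than read from DRINKERS.txt.

```r
# Group labels rebuilt from the table() output above: 11 students per group.
# (Assumption: reconstructed for illustration, not read from the data file.)
group <- factor(rep(c("A", "AC", "AR", "P"), each = 11))

counts  <- table(group)   # experimental units per treatment
n_total <- sum(counts)    # total sample size
k       <- nlevels(group) # number of treatments (levels of the factor)

# A design is balanced when every treatment has the same number of units.
is_balanced <- length(unique(as.vector(counts))) == 1

print(counts)
cat("n =", n_total, " treatments =", k, " balanced =", is_balanced, "\n")
```

With the real data, the same check could be run on table(drinkersdata$GROUP) directly.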

Experimental design

This is a completely randomized design because subjects are randomly assigned to treatments. There is one factor, group, with four levels: AR, AC, A, and P. Therefore, there are four treatments: AR, AC, A, and P. It is balanced because each treatment has the same number of experimental units. The response variable is the task score.

Step 2: Hypothesize Relationship (Exploratory Data Analysis)

We will look at side-by-side box plots of the treatment groups and compare their centers. We can also compare the 5 number summaries (with the means) for a more detailed interpretation. (We could also determine the approximate distribution of each treatment group. In the next section we will see that we want the treatments to have normally distributed data.)

    boxplot(SCORE ~ GROUP, data = drinkersdata)
    tapply(drinkersdata$SCORE, drinkersdata$GROUP, summary)

    $A
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    -0.35000 -0.05000  0.16000  0.06364  0.19000  0.31000

    $AC
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.0200  0.1750  0.2200  0.2655  0.3750  0.5000

    $AR
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0.000   0.400   0.500   0.440   0.525   0.610

    $P
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       0.12    0.33    0.43    0.40    0.47    0.62

It appears there are differences in the mean task scores for the four groups, so we will perform a statistical analysis to compare the means through the ANOVA procedure.

Step 3: Perform the ANOVA

Null Hypothesis $H_0: \mu_A = \mu_{AC} = \mu_{AR} = \mu_P$

Alternative Hypothesis $H_a$: at least two means are different

    drinkanova1 <- aov(SCORE ~ GROUP, data = drinkersdata)
    summary(drinkanova1)

                Df Sum Sq Mean Sq F value   Pr(>F)
    GROUP        3 0.9506  0.3169   10.29 3.76e-05 ***
    Residuals   40 1.2317  0.0308
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Test Statistic

F test: F = MST/MSE = 0.317/0.0308 = 10.29

Distribution of Test Statistic: F with numerator degrees of freedom $\nu_1 = 4 - 1 = 3$ and denominator degrees of freedom $\nu_2 = 44 - 4 = 40$.

CONCLUSION: With a p-value this small (3.76e-05), we reject the null hypothesis. There is evidence that at least two population treatment means are different from each other. Based on the sample means, it appears the group with just alcohol has a much lower mean (0.064) than the rest, while the group with a monetary reward has the highest mean (0.44), only slightly above the placebo group (0.40). We will perform the post-hoc test to compare pairwise means in the next section.

Contextual Conclusion

The question of interest is: Does coffee or some other form of stimulation really allow a person suffering from alcohol intoxication to "sober up"? What is your conclusion? Include any relevant inferences and confidence intervals. Use alpha = 0.05.

    TukeyHSD(drinkanova1, conf.level = 0.95)

      Tukey multiple comparisons of means
        95% family-wise confidence level

    Fit: aov(formula = SCORE ~ GROUP, data = drinkersdata)

    $GROUP
                diff          lwr       upr     p adj
    AC-A   0.2018182  0.001256169 0.4023802 0.0480809
    AR-A   0.3763636  0.175801623 0.5769256 0.0000618
    P-A    0.3363636  0.135801623 0.5369256 0.0003282
    AR-AC  0.1745455 -0.026016558 0.3751075 0.1075832
    P-AC   0.1345455 -0.066016558 0.3351075 0.2892292
    P-AR  -0.0400000 -0.240562013 0.1605620 0.9501001

    plot(TukeyHSD(drinkanova1, conf.level = 0.95))

Note: If 0 is captured in any interval, it would imply the two means are not significantly different (we will examine this precise pairwise analysis in the next section).

We will look at the confidence intervals for the difference in the mean scores of the alcohol group compared to the others. Note that all three of these confidence intervals lie entirely above 0. This implies that the alcohol group underperforms compared to every other group.

    Pair    Lower     Upper
    AC-A    0.00126   0.4024
    AR-A    0.1758    0.5769
    P-A     0.1358    0.5369

We are 95% confident that the true mean score for the group that also had caffeine is between 0.00126 and 0.40 higher than that of the group with just alcohol. There is a statistically significant difference between the group with just alcohol and the group with caffeine, but since the lower bound of the interval is barely above 0, it may not be practically significant.

Next steps

Since the main effect is significant, we go to post hoc analysis (Tukey's test in this class). We will also check the assumptions (next section).

4.3 Two Factor ANOVA

The chemical element antimony is sometimes added to tin–lead solder to replace the more expensive tin and to reduce the cost of soldering. A factorial experiment was conducted to determine how antimony affects the strength of the tin–lead solder joint (Journal of Materials Science, May 1986). Tin–lead solder specimens were prepared using one of four possible cooling methods (water-quenched, WQ; oil-quenched, OQ; air-blown, AB; and furnace-cooled, FC) and with one of four possible amounts of antimony (0%, 3%, 5%, and 10%) added to the composition. Three solder joints were randomly assigned to each of the 4 × 4 = 16 treatments and the shear strength of each was measured.

Step 1: Collect the Data and Determine Experimental Design

Consider the design of the data collection: for the One and Two Factor ANOVA we are covering the analysis of completely randomized designs. We want to ensure our data meet that criterion and define the other components of the experimental design.

    tindata <- read.delim("https://raw.githubusercontent.com/kvaranyak4/STAT3220/main/TINLEAD.txt", header = TRUE)
    names(tindata)

    [1] "ANTIMONY" "METHOD"   "STRENGTH"

    head(tindata)

      ANTIMONY METHOD STRENGTH
    1        0     WQ     17.6
    2        0     WQ     19.5
    3        0     WQ     18.3
    4        0     OQ     20.0
    5        0     OQ     24.3
    6        0     OQ     21.9

    table(tindata$ANTIMONY, tindata$METHOD)

         AB FC OQ WQ
      0   3  3  3  3
      3   3  3  3  3
      5   3  3  3  3
      10  3  3  3  3

Experimental design

This is a completely randomized design because experimental units are randomly assigned to treatments. There are two factors: cooling method, with four levels (AB, FC, OQ, and WQ), and antimony, with four levels (0%, 3%, 5%, and 10%). Therefore, there are 4 × 4 = 16 treatments. It is balanced because each treatment has the same number of experimental units. The response variable is the shear strength.

Step 2: Hypothesize Relationship (Exploratory Data Analysis)

For two factor designs, we will examine the interaction plot first.

    # We want to begin with examining the interaction
    interaction.plot(tindata$METHOD, tindata$ANTIMONY, tindata$STRENGTH,
                     fun = mean, trace.label = "Antimony Level",
                     xlab = "Cooling Method", ylab = "Mean Shear Strength",
                     main = "Interaction Plot Cooling Method X Antimony")
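
The interaction plot above draws one point per treatment: the mean strength of each antimony-by-method cell. The following sketch shows how those cell means could be computed with tapply(). It uses only the six rows displayed by head(tindata), so it covers just two of the 16 cells; the names tin_head and cell_means are illustrative and do not appear in the original notes.

```r
# The six rows shown by head(tindata): 0% antimony, methods WQ and OQ.
# (Assumption: typed in by hand for illustration, not read from TINLEAD.txt.)
tin_head <- data.frame(
  ANTIMONY = rep(0, 6),
  METHOD   = rep(c("WQ", "OQ"), each = 3),
  STRENGTH = c(17.6, 19.5, 18.3, 20.0, 24.3, 21.9)
)

# Mean strength in each antimony-by-method cell. These cell means are what
# interaction.plot() plots when fun = mean.
cell_means <- tapply(tin_head$STRENGTH,
                     list(tin_head$ANTIMONY, tin_head$METHOD),
                     mean)

print(round(cell_means, 2))
```

Applied to the full tindata in the same way, the call would return a 4 × 4 table of cell means, one row per antimony level and one column per cooling method. Roughly parallel lines in the interaction plot correspond to cell means that change by similar amounts across methods at every antimony level.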