Group1-BAN100ZAA_ANOVA Assignment

School

Seneca College *

Course

100

Subject

Statistics

Date

May 24, 2024

Type

pdf

Pages

17

Uploaded by MateSalamander4103

BAN 100 Statistics for Analytics
Group 1 - Assignment 2 - ANOVA

1. Performing One-Way ANOVA (10 points)

Montana Gourmet Garlic is a company that grows garlic using organic methods. It specializes in hardneck varieties. Knowing a little about experimental methods, the owners design an experiment to test whether growth of the garlic is affected by the type of fertilizer used. They limit the experimentation to a Rocambole variety named Spanish Roja, and test three organic fertilizers and one chemical fertilizer (as a control). They blind themselves to the fertilizer by using containers with numbers 1 through 4. (In other words, they design the experiment in such a way that they do not know which fertilizer is in which container.) One acre of farmland is set aside for the experiment. It is divided into 32 beds. They randomly assign fertilizers to beds. At harvest, they calculate the average weight of garlic bulbs in each of the beds. The data are in the DATALIB.Garlic data set.

These are the variables in the data set:

Fertilizer    The type of fertilizer used (1 through 4)
BulbWt        The average garlic bulb weight (in pounds) in the bed
BedID         A bed identification number

Analysis of Variance with Garlic Data

Consider an experiment to study four types of fertilizer, labeled 1, 2, 3, and 4. One fertilizer is chemical and the rest are organic. You want to see whether the average weights of garlic bulbs are significantly different for plants in beds using different fertilizers.

Solution:

a. Test the hypothesis that the means are equal.

Using one-way analysis of variance (ANOVA), we test whether the mean garlic bulb weight differs among the four fertilizers. The null hypothesis (H0) is that the four fertilizer means are all equal. The alternative hypothesis (Ha) is that at least one mean differs from the others.
Code:
libname mylib '/home/u63735261/my_shared_file_links/u63661134/DATALIB';
proc means data=mylib.Garlic printalltypes maxdec=3;
    var BulbWt;
    class Fertilizer;
    title 'Descriptive Statistics of Garlic Weight';
run;
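The hypotheses in part a, and the F-statistic that the ANOVA uses to test them, can be written out as below. The symbols are notation introduced here rather than taken from the assignment: \mu_i is the mean bulb weight for fertilizer i, n_i is the number of beds for fertilizer i, k = 4 is the number of fertilizers, and N is the total number of beds. The degrees of freedom (3, 28) assume that all 32 beds contribute one observation.

```latex
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4
\qquad
H_a:\ \text{at least one } \mu_i \text{ differs from the others}

F = \frac{\mathrm{MSB}}{\mathrm{MSW}}
  = \frac{\sum_{i=1}^{k} n_i \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2 / (k-1)}
         {\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2 / (N-k)}
\;\sim\; F_{k-1,\,N-k} = F_{3,\,28} \quad \text{under } H_0
```

A large F means the variation between fertilizer means is large relative to the variation among beds that received the same fertilizer.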
b. Be sure to check that the assumptions of the analysis method that you choose are met.

Before relying on the results of the ANOVA, we verify that the assumptions underlying the analysis are met:

Normality: The residuals within each group should be approximately normally distributed. This can be checked visually with Q-Q plots or formally with a test such as the Shapiro-Wilk test.

Homogeneity of variance: The variance of the response (BulbWt) should be approximately equal across the fertilizer groups. This can be checked visually by plotting the residuals (differences between observed and predicted values) against the fitted values, or formally with Levene's test for equality of variances.

Independence: The observations should not influence each other. This is usually achieved through a randomized experimental design. In this experiment the fertilizers were randomly assigned to beds, which supports this assumption and reduces the risk of bias from external factors.

c. What conclusions can you reach at this point in your analysis?

After performing the one-way ANOVA, we examine the p-value of the F-statistic. If the p-value falls below the selected significance level (usually 0.05), we reject the null hypothesis and conclude that at least one fertilizer mean differs from the others. If the p-value exceeds 0.05, we fail to reject the null hypothesis: the data do not provide sufficient evidence of a difference among the means.
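The assumption checks described in part b could be run in SAS roughly as follows. This is a minimal sketch, not the group's submitted code. It assumes the libref mylib from the earlier libname statement is still assigned; the output data set work.garlic_resid and the variables resid and pred are hypothetical names introduced for illustration.

```sas
/* Assumes the libref mylib from the earlier libname statement is still assigned. */
/* Fit the one-way model and save residuals and fitted values.                    */
/* work.garlic_resid, resid, and pred are hypothetical names.                     */
proc glm data=mylib.Garlic plots=none;
    class Fertilizer;
    model BulbWt = Fertilizer;
    means Fertilizer / hovtest=levene;   /* Levene's test for equal variances */
    output out=work.garlic_resid r=resid p=pred;
run;
quit;

/* Normality: Shapiro-Wilk test and Q-Q plot of the residuals */
proc univariate data=work.garlic_resid normal;
    var resid;
    qqplot resid / normal(mu=est sigma=est);
run;

/* Equal variance: residuals plotted against fitted values */
proc sgplot data=work.garlic_resid;
    scatter x=pred y=resid;
    refline 0 / axis=y;
run;
```

A Shapiro-Wilk or Levene p-value above 0.05 gives no evidence against the corresponding assumption. Independence cannot be tested this way and rests on the randomized assignment of fertilizers to beds.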
d. Perform a post hoc test to look at the individual differences among means. Conduct pairwise comparisons with an experiment-wise error rate of α=0.05. (Use the Tukey adjustment.) Which types of fertilizer are significantly different?

Post hoc Tukey adjustment

If the ANOVA indicates significant differences among the groups, we conduct pairwise comparisons using Tukey's HSD (Honestly Significant Difference) test to identify which groups differ from each other. The output provides a confidence interval for each difference between two means. A pair is significantly different when its confidence interval does not include zero. By examining the Tukey results, we can identify which types of fertilizer lead to significantly different garlic bulb weights.

Code:
/* Tukey adjustment */
ods graphics;
/* plots=none is a PROC GLM statement option and suppresses the ODS graphs */
proc glm data=mylib.garlic plots=none;
    class Fertilizer;
    model BulbWt = Fertilizer;
    /* Levene's test for equal variances and Welch's variance-weighted ANOVA */
    means Fertilizer / hovtest=levene welch;
    /* All pairwise differences with Tukey-adjusted p-values and confidence limits */
    lsmeans Fertilizer / pdiff=all adjust=tukey cl alpha=.05;
    title "One-Way ANOVA with Fertilizer as Predictor";
run;
quit;
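The confidence intervals mentioned in part d have the standard Tukey-Kramer form shown below. This is the textbook formula, not something printed in the assignment. The notation is introduced here: MSE is the error mean square from the ANOVA table, n_i and n_j are the numbers of beds for fertilizers i and j, and q is the upper-α critical value of the studentized range distribution with k = 4 groups and N - k error degrees of freedom.

```latex
\left(\bar{y}_{i\cdot} - \bar{y}_{j\cdot}\right)
\;\pm\;
\frac{q_{\alpha;\,k,\,N-k}}{\sqrt{2}}
\sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}
```

Fertilizers i and j are declared significantly different at the experiment-wise level α = 0.05 when this interval excludes zero.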
e. Please provide the syntax and output, and interpret the output.

Interpretation

The ANOVA summary table reports the degrees of freedom, the F-statistic, and its p-value. A p-value below 0.05 indicates that at least one fertilizer mean differs from the others. The Tukey HSD output reports a confidence interval and an adjusted p-value for each pairwise comparison between the groups. Because the adjusted p-values already control the experiment-wise error rate at 0.05, each one can be compared directly with 0.05: two fertilizers differ significantly if the adjusted p-value for that pair is less than 0.05.
Results:
2. Performing Two-Way ANOVA (10 points)

Data were collected in an effort to determine whether different dose levels of a given drug have an effect on blood pressure for people with one of three types of heart disease. The data are in the DATALIB.Drug data set.

The data set contains the following variables:

DrugDose    dosage level of drug (1, 2, 3, 4), corresponding to (Placebo, 50 mg, 100 mg, 200 mg)
Disease     heart disease category
BloodP      change in diastolic blood pressure after 2 weeks treatment

Solution:

a. Examine the data with a vertical line plot. Put BloodP on the Y axis, DrugDose on the X axis, and then stratify by Disease. What information can you obtain from looking at the data?

On the graph, we plot:
- BloodP (change in diastolic blood pressure) on the vertical axis (Y-axis)
- DrugDose (dosage level of the drug) on the horizontal axis (X-axis), ranging from placebo to higher doses
- Disease (type of heart disease) as the grouping variable, with one line per disease category

By analysing this data, we can determine how drug dosage and disease type impact blood pressure changes. A vertical line plot organized by disease type and drug dosage lets us see how the mean blood pressure change varies with dosage across the different heart conditions. The graph allows us to spot potential trends, patterns, or unusual data points that might go unnoticed when looking only at numerical summaries.

Code:
LIBNAME mydata '/home/u63735261/my_shared_file_links/u63661134/DATALIB';
PROC SGPLOT DATA=mydata.drug;
    VLINE DrugDose / GROUP=Disease RESPONSE=BloodP STAT=MEAN MARKERS;
    XAXIS LABEL="Drug Dose";
    YAXIS LABEL="Change in Diastolic Blood Pressure";
RUN;
b. Test the hypothesis that the means are equal, making sure to include an interaction term if the graphical analyses that you performed indicate that would be advisable. What conclusions can you reach at this point?

The plotted data show how drug dosage, type of heart disease, and changes in blood pressure are related. For example, we can see whether certain dosage levels have a bigger effect on blood pressure for specific types of heart disease, or whether the pattern is consistent across all types. If the lines for the disease categories are not parallel (for instance, if they cross or diverge as the dose increases), the effect of dose depends on the disease type and the model should include an interaction term.

A two-way analysis of variance (ANOVA) then tests whether mean blood pressure change differs across drug dosage levels, across disease types, and whether the two factors interact. The p-values and effect sizes from the ANOVA indicate the statistical importance of each factor and of their interaction. If, for instance, we find significant main effects of drug dosage and disease type as well as a significant interaction between them, we conclude that both factors alter blood pressure changes and that their combined effect differs from what we would expect if their effects were simply additive.

c. Which level of disease is not significant within the interaction?

When the effect of the drug depends on both the dose and the type of disease, we need to determine within which disease levels the dose effect is present. This is done by testing the effect of DrugDose separately within each level of Disease. A disease level whose test has a p-value above 0.05 shows no significant dose effect. This analysis identifies the patient groups that respond to specific drug dosages, which supports more precise treatment decisions.
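The two-way ANOVA with interaction from part b, and the per-disease tests needed for part c, could be requested in SAS along the following lines. This is a minimal sketch rather than the group's submitted code. It assumes the libref mydata from the earlier LIBNAME statement is still assigned, and the title text is a hypothetical label.

```sas
/* Assumes the libref mydata from the earlier LIBNAME statement is still assigned. */
proc glm data=mydata.drug plots=none;
    class DrugDose Disease;
    /* Main effects plus the DrugDose-by-Disease interaction */
    model BloodP = DrugDose Disease DrugDose*Disease;
    /* Test the DrugDose effect separately within each Disease level */
    lsmeans DrugDose*Disease / slice=Disease;
    title "Two-Way ANOVA of BloodP with Interaction";   /* hypothetical title */
run;
quit;
```

In the sliced output, each row is an F-test of the dose effect within one disease category. The category whose row has a p-value above 0.05 is the level that is not significant within the interaction.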
d. Are there any variables redundant? How do you determine the model is optimal? Please provide rationales.

Redundant variables in a model can obscure important relationships or complicate the analysis without adding information. By examining how much each term contributes to explaining changes in blood pressure (the dependent variable), we can identify and possibly remove unnecessary terms, which simplifies the model and makes it easier to interpret.

An optimal model balances simplicity and explanatory power: it includes only the factors and interactions needed to explain the variation in the data adequately. Techniques such as stepwise selection, information criteria such as AIC and BIC, and cross-validation help identify the terms that contribute most while avoiding redundancy and overfitting.

e. Please provide the syntax and output, and interpret the output.

Interpretation

To make the analysis reproducible and understandable, we document how it was carried out and interpret its output. This includes listing the code used for the two-way ANOVA, interpreting the ANOVA tables, and explaining the effect sizes, confidence intervals, and post hoc tests (if necessary). These explanations communicate the importance of the findings and their potential impact on clinical practice or future research.
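For part d, one standard way to judge whether the interaction term is redundant is a partial F-test that compares the full model with the reduced model that omits it. The formulation below is a textbook sketch, not output from this assignment. The notation is introduced here: \alpha_i is the effect of dose level i (a = 4 levels), \beta_j is the effect of disease category j (b = 3 categories), SSE_F and SSE_R are the error sums of squares of the full and reduced models, and N is the total number of observations, which the assignment does not state. The degrees of freedom assume every dose-by-disease combination is observed.

```latex
\text{Full model:}\quad
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

\text{Reduced model:}\quad
y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}

F = \frac{\left(\mathrm{SSE}_R - \mathrm{SSE}_F\right) / \left[(a-1)(b-1)\right]}
         {\mathrm{SSE}_F / (N - ab)}
\;\sim\; F_{(a-1)(b-1),\,N-ab} = F_{6,\,N-12}
\quad \text{under } H_0:\ (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
```

If this test is not significant, the interaction can be dropped and the simpler additive model retained. If it is significant, the interaction and both main effects stay in the model.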
Result:
Submitted by:

     Name                                           Student ID
1    Gelasia Mendonca (gmendonca2@myseneca.ca)      104624234
2    Sam Oswald (sosavarimuthu@myseneca.ca)         138928239
3    Abhishek Vipul Shah (avshah6@myseneca.ca)      138939236