HW2 LR assignment

School

New York University

Course

MISC

Subject

Statistics

Date

Apr 3, 2024

HW2 2024-02-10

Part 1: The cheddar cheese study

In a study of cheddar cheese from the LaTrobe Valley of Victoria, Australia, samples of cheese were analyzed for their chemical composition and were subjected to taste tests. Overall taste scores were obtained by combining the scores from several tasters. The cheddar dataset has 30 observations on the following 4 variables.

taste: a subjective taste score
Acetic: concentration of acetic acid (log scale)
H2S: concentration of hydrogen sulfide (log scale)
Lactic: concentration of lactic acid

Use the following statement to access the data: data("cheddar", package = "faraway")

Question 1.1: Show descriptive statistics for each of the variables

rm(list = ls())
# read data here
data("cheddar", package = "faraway")
# Note: using summary() for descriptive statistics is sufficient for this part
# Enter your code below
summary(cheddar)

##      taste           Acetic           H2S             Lactic
##  Min.   : 0.70   Min.   :4.477   Min.   : 2.996   Min.   :0.860
##  1st Qu.:13.55   1st Qu.:5.237   1st Qu.: 3.978   1st Qu.:1.250
##  Median :20.95   Median :5.425   Median : 5.329   Median :1.450
##  Mean   :24.53   Mean   :5.498   Mean   : 5.942   Mean   :1.442
##  3rd Qu.:36.70   3rd Qu.:5.883   3rd Qu.: 7.575   3rd Qu.:1.667
##  Max.   :57.20   Max.   :6.458   Max.   :10.199   Max.   :2.010

Question 1.2: Fit a regression model with taste as the response and the three chemical contents as predictors. Identify the predictors that are statistically significant at the 5% level.

# Enter your code below
lmod <- lm(taste ~ Acetic + H2S + Lactic, data = cheddar)
summary(lmod)

##
## Call:
## lm(formula = taste ~ Acetic + H2S + Lactic, data = cheddar)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -17.390  -6.612  -1.009   4.908  25.449
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -28.8768    19.7354  -1.463  0.15540
## Acetic        0.3277     4.4598   0.073  0.94198
## H2S           3.9118     1.2484   3.133  0.00425 **
## Lactic       19.6705     8.6291   2.280  0.03108 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.13 on 26 degrees of freedom
## Multiple R-squared:  0.6518, Adjusted R-squared:  0.6116
## F-statistic: 16.22 on 3 and 26 DF,  p-value: 3.81e-06

At the 5% level, H2S (p = 0.00425) and Lactic (p = 0.03108) are statistically significant predictors of taste in the cheddar dataset. Acetic (p = 0.94198) is not significant once the other two predictors are in the model.

Question 1.3: Calculate the p-values of the three predictors using the anova function.

# Enter your code below
# Each reduced model drops one predictor; comparing it with lmod tests that predictor.
model_NoAce <- lm(taste ~ H2S + Lactic, data = cheddar)
anova(model_NoAce, lmod)

## Analysis of Variance Table
##
## Model 1: taste ~ H2S + Lactic
## Model 2: taste ~ Acetic + H2S + Lactic
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     27 2669.0
## 2     26 2668.4  1   0.55427 0.0054  0.942

model_NoLac <- lm(taste ~ Acetic + H2S, data = cheddar)
anova(model_NoLac, lmod)

## Analysis of Variance Table
##
## Model 1: taste ~ Acetic + H2S
## Model 2: taste ~ Acetic + H2S + Lactic
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)
## 1     27 3201.7
## 2     26 2668.4  1    533.32 5.1964 0.03108 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

model_NoH2S <- lm(taste ~ Acetic + Lactic, data = cheddar)
anova(model_NoH2S, lmod)

## Analysis of Variance Table
##
## Model 1: taste ~ Acetic + Lactic
## Model 2: taste ~ Acetic + H2S + Lactic
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)
## 1     27 3676.1
## 2     26 2668.4  1    1007.7 9.8182 0.004247 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Question 1.4: Use the anova function to calculate the significance of the full model.

# Enter your code below
# The intercept-only model is the baseline for the overall F-test.
nullmod <- lm(taste ~ 1, data = cheddar)
anova(nullmod, lmod)

## Analysis of Variance Table
##
## Model 1: taste ~ 1
## Model 2: taste ~ Acetic + H2S + Lactic
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)
## 1     29 7662.9
## 2     26 2668.4  3    4994.5 16.221 3.81e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Part 2: Study of teenage gambling in Britain

The teengamb dataset contains a survey that was conducted to study teenage gambling in Britain. The dataset has 47 observations and 5 variables:

sex: 0 = male, 1 = female
status: socioeconomic status score based on parents' occupation
income: income in pounds per week
verbal: verbal score in words out of 12 correctly defined
gamble: expenditure on gambling in pounds per year

Use the following statement to access the data: data("teengamb", package = "faraway")
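
As a cross-check on the Part 1 model comparisons, each F statistic can be recomputed by hand from the residual sums of squares in the ANOVA tables. This is a worked calculation from the reported figures, not additional R output. The symbols RSS_ω (reduced model), RSS_Ω (full model), q (number of predictors dropped) and df_Ω (residual degrees of freedom of the full model) are notation assumed here for illustration.

```latex
\[
F = \frac{(\mathrm{RSS}_{\omega} - \mathrm{RSS}_{\Omega})/q}{\mathrm{RSS}_{\Omega}/df_{\Omega}},
\qquad
\frac{\mathrm{RSS}_{\Omega}}{df_{\Omega}} = \frac{2668.4}{26} \approx 102.63
\]
\begin{align*}
\text{H2S (Question 1.3):}    \quad F &= \frac{(3676.1 - 2668.4)/1}{102.63} \approx 9.82, & t^2 &= 3.133^2 \approx 9.82 \\
\text{Lactic (Question 1.3):} \quad F &= \frac{(3201.7 - 2668.4)/1}{102.63} \approx 5.20, & t^2 &= 2.280^2 \approx 5.20 \\
\text{Full model (Question 1.4):} \quad F &= \frac{(7662.9 - 2668.4)/3}{102.63} \approx 16.22
\end{align*}
```

When a single predictor is dropped (q = 1), F equals the square of that predictor's t value, which is why the anova p-values in Question 1.3 agree with the Pr(>|t|) column from summary(lmod) in Question 1.2. The full-model F of about 16.22 on 3 and 26 degrees of freedom likewise matches the last line of the summary output.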