1.
Complete the F table.
Make a decision to retain or reject the null hypothesis that the multiple regression equation can be used to significantly predict health.
1.
Answer to Problem 28CAP
The completed F table is,
Source of variation | SS | df | MS | F |
Regression | 126.22 | 2 | 63.11 | 14.48 |
Residual (error) | 21.78 | 5 | 4.36 | |
Total | 148.00 | 7 | | |
The decision is to reject the null hypothesis.
The regression equation significantly predicts variance in the criterion variable (Y), health [BMI].
Explanation of Solution
Calculation:
The given information is that the researcher tested whether ‘daily intake of fat (in grams)’ and ‘amount of exercise (in minutes)’ can predict health, measured using a body mass index (BMI) scale.
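The formula for the test statistic is $F = \dfrac{MS_{\text{regression}}}{MS_{\text{residual}}}$, where each mean square is the corresponding sum of squares divided by its degrees of freedom.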
Decision rules:
- If the test statistic value is greater than the critical value, then reject the null hypothesis
- If the test statistic value is smaller than the critical value, then retain the null hypothesis
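Null hypothesis: The regression equation does not significantly predict variance in health (Y).
Alternative hypothesis: The regression equation significantly predicts variance in health (Y).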
Software procedure:
Step by step procedure to obtain test statistic value using SPSS software is given as,
- Choose Variable view.
- Under Name, enter the variable names Fat, Exercise, and Health.
- Choose Data view, enter the data.
- Choose Analyze>Regression>Linear.
- In Dependent, enter the Health column.
- In Independent(s), enter the Fat and Exercise columns.
- Click OK.
Output using SPSS software is given below:
The table of F is,
Source of variation | SS | df | MS | F |
Regression | 126.22 | 2 | 63.11 | 14.48 |
Residual (error) | 21.78 | 5 | 4.36 | |
Total | 148.00 | 7 | | |
Table 1
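The arithmetic behind Table 1 can be checked with a short Python sketch. The function name `f_table` is hypothetical; the sums of squares and degrees of freedom are the values from the table. Because the table reports rounded values, the ratio computed here can differ from the reported 14.48 in the second decimal place.

```python
def f_table(ss_regression, ss_residual, df_regression, df_residual):
    """Return (MS regression, MS residual, F) for a regression F table."""
    ms_regression = ss_regression / df_regression  # mean square = SS / df
    ms_residual = ss_residual / df_residual
    return ms_regression, ms_residual, ms_regression / ms_residual


# Values taken from Table 1
ms_reg, ms_res, f_obs = f_table(126.22, 21.78, 2, 5)
print(f"MS regression = {ms_reg:.2f}")
print(f"MS residual   = {ms_res:.2f}")
print(f"F             = {f_obs:.2f}")
```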
Critical value:
The considered significance level is $\alpha = 0.05$.
The degrees of freedom for regression are 2, the degrees of freedom for residual are 5.
From the Appendix C: Table C.3-Critical values for F distribution:
- Locate the value 2 in degrees of freedom numerator row.
- Locate the value 5 in degrees of freedom denominator row.
- Locate the 0.05 level of significance (value in lightface type) in combined row.
- The intersecting value that corresponds to the (2, 5) with level of significance 0.05 is 5.79.
Thus, the critical value for df = (2, 5) at the 0.05 level of significance is 5.79.
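As a cross-check on the table lookup, the critical value can also be computed directly. This is a minimal sketch that relies on a special case: when the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$, which can be inverted by hand. The function names are hypothetical, and the shortcut does not apply to other numerator degrees of freedom.

```python
def f_critical_df1_2(alpha, df_denominator):
    """Critical F value for numerator df = 2 (closed form, this case only)."""
    return (df_denominator / 2) * (alpha ** (-2 / df_denominator) - 1)


def f_p_value_df1_2(f_obs, df_denominator):
    """Upper-tail p value for an observed F with numerator df = 2."""
    return (1 + 2 * f_obs / df_denominator) ** (-df_denominator / 2)


critical = f_critical_df1_2(0.05, 5)   # df = (2, 5), alpha = 0.05
p_value = f_p_value_df1_2(14.48, 5)    # observed F from the F table
decision = "reject" if 14.48 > critical else "retain"
print(f"critical F = {critical:.2f}, p = {p_value:.4f}, decision: {decision} H0")
```

The computed critical value rounds to the tabled 5.79, and the p value falls below 0.05, which agrees with the decision rule based on the critical value.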
Conclusion:
The value of test statistic is 14.48.
The critical value is 5.79.
The test statistic value is greater than the critical value.
The test statistic value falls in the critical region.
Hence the null hypothesis is rejected.
Thus, the regression equation significantly predicts variance in criterion variable (Y) health [BMI].
2.
Determine which predictor variable or variables, when added to the multiple regression equation, significantly contributed to predictions in Y (health).
2.
Answer to Problem 28CAP
The predictor variable daily intake of fat significantly contributed to predictions in Y (health) when added to the multiple regression equation.
Explanation of Solution
Calculation:
The given information is that a sample of 8 scores is recorded. The predictor variables are ‘daily intake of fat (in grams)’ and ‘amount of exercise (in minutes)’.
Relative contribution of fat
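Null hypothesis: Adding fat to the regression equation does not significantly improve predictions in Y (health).
Alternative hypothesis: Adding fat to the regression equation significantly improves predictions in Y (health).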
The predictor variable fat is tested. The other predictor variable that is not tested is ‘Exercise’. Calculate the correlation between ‘Exercise’ and ‘Health[BMI]’.
Software procedure:
Step by step procedure to obtain correlation using SPSS software is given as,
- Choose Variable view.
- Under Name, enter the variable names Health and Exercise.
- Choose Data view, enter the data.
- Choose Analyze>Correlate>Bivariate.
- In Variables, enter Health and Exercise.
- Click OK.
Output using SPSS software is given below:
The correlation between ‘Exercise’ and ‘Health [BMI]’ is –0.789.
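The value of $r^2$ is $(-0.789)^2 = 0.6225$.
The formula for the total sum of squares is $SS_Y = \sum Y^2 - \dfrac{(\sum Y)^2}{n}$.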
The calculation of sums of squares is,
Health [BMI] (Y) | $Y^2$ |
32 | 1,024 |
34 | 1,156 |
23 | 529 |
33 | 1,089 |
28 | 784 |
27 | 729 |
25 | 625 |
22 | 484 |
Table 2
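The sums from Table 2 can be verified in a few lines of Python. This sketch assumes the computational formula for the total sum of squares, $SS_Y = \sum Y^2 - (\sum Y)^2 / n$; the variable names are illustrative.

```python
# Health (BMI) scores from Table 2
health = [32, 34, 23, 33, 28, 27, 25, 22]

n = len(health)
sum_y = sum(health)                      # sum of Y
sum_y_sq = sum(y ** 2 for y in health)   # sum of Y squared
ss_y = sum_y_sq - sum_y ** 2 / n         # total sum of squares

print(f"n = {n}, sum Y = {sum_y}, sum Y^2 = {sum_y_sq}, SS_Y = {ss_y:.2f}")
```

The result matches the Total row of the F table (148.00).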
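Substitute $\sum Y = 224$, $\sum Y^2 = 6{,}420$, and $n = 8$: $SS_Y = 6{,}420 - \dfrac{(224)^2}{8} = 6{,}420 - 6{,}272 = 148$.
The contribution of exercise alone is $r^2 \times SS_Y = 0.6225 \times 148 = 92.13$.
From the F table, the value of $SS_{\text{regression}}$ with both predictors is 126.22.
The contribution of fat, when added to the equation, is $126.22 - 92.13 = 34.09$.
Reproduce the F table by replacing $SS_{\text{regression}}$ with 34.09.
Since there would be only one predictor variable, the degrees of freedom for regression would be 1.
The mean sum of squares for regression is $MS_{\text{regression}} = 34.09 / 1 = 34.09$.
The F statistic value is $F = 34.09 / 4.36 = 7.82$.
The F table for the change with fat added is,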
Source of variation | SS | df | MS | F |
Regression | 34.09 | 1 | 34.09 | 7.82 |
Residual (error) | 21.78 | 5 | 4.36 | |
Total | 148.00 | 7 | | |
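The change in F for fat can be reproduced with a small sketch. The helper name `change_f` is hypothetical. It assumes the approach used in this solution: the variance predicted by the untested predictor alone ($r^2 \times SS_Y$) is subtracted from the regression sum of squares for the full equation, and the remainder is tested with 1 degree of freedom against the residual mean square.

```python
def change_f(r_other, ss_y, ss_regression_full, ms_residual):
    """Return (SS change, F change) for the predictor being added.

    r_other is the correlation between the OTHER (untested) predictor and Y.
    """
    ss_other_alone = r_other ** 2 * ss_y          # variance predicted by the other predictor
    ss_change = ss_regression_full - ss_other_alone
    ms_change = ss_change / 1                     # one predictor added, so df = 1
    return ss_change, ms_change / ms_residual


# Testing fat: the other predictor is exercise, r = -0.789 with health
ss_fat, f_fat = change_f(-0.789, 148.0, 126.22, 4.36)
print(f"SS change = {ss_fat:.2f}, F change = {f_fat:.2f}")
```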
Critical value:
The considered significance level is $\alpha = 0.05$.
The degrees of freedom for regression are 1, the degrees of freedom for residual are 5.
From the Appendix C: Table C.3-Critical values for F distribution:
- Locate the value 1 in degrees of freedom numerator row.
- Locate the value 5 in degrees of freedom denominator row.
- Locate the 0.05 level of significance (value in lightface type) in combined row.
- The intersecting value that corresponds to the (1, 5) with level of significance 0.05 is 6.61.
Thus, the critical value for df = (1, 5) at the 0.05 level of significance is 6.61.
Conclusion:
The value of test statistic is 7.82.
The critical value is 6.61.
The test statistic value is greater than the critical value.
The test statistic value falls in the critical region.
Hence the null hypothesis is rejected.
Thus, adding fat to the regression equation significantly contributed to predictions in Y (health).
Relative contribution of exercise
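Null hypothesis: Adding exercise to the regression equation does not significantly improve predictions in Y (health).
Alternative hypothesis: Adding exercise to the regression equation significantly improves predictions in Y (health).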
The predictor variable exercise is tested. The other predictor variable that is not tested is ‘fat’. Calculate the correlation between ‘fat’ and ‘Health [BMI]’.
Software procedure:
Step by step procedure to obtain correlation using SPSS software is given as,
- Choose Variable view.
- Under Name, enter the variable names Health and Fat.
- Choose Data view, enter the data.
- Choose Analyze>Correlate>Bivariate.
- In Variables, enter Health and Fat.
- Click OK.
Output using SPSS software is given below:
The correlation between ‘fat’ and ‘Health [BMI]’ is 0.923.
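The value of $r^2$ is $(0.923)^2 = 0.8519$.
The value for $SS_Y$ is 148, so the contribution of fat alone is $r^2 \times SS_Y = 0.8519 \times 148 = 126.08$.
From the F table, the value of $SS_{\text{regression}}$ with both predictors is 126.22.
The contribution of exercise, when added to the equation, is $126.22 - 126.08 = 0.14$.
Reproduce the F table by replacing $SS_{\text{regression}}$ with 0.14.
Since there would be only one predictor variable, the degrees of freedom for regression would be 1.
The mean sum of squares for regression is $MS_{\text{regression}} = 0.14 / 1 = 0.14$.
The F statistic value is $F = 0.14 / 4.36 = 0.03$.
The F table for the change with exercise added is,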
Source of variation | SS | df | MS | F |
Regression | 0.14 | 1 | 0.14 | 0.03 |
Residual (error) | 21.78 | 5 | 4.36 | |
Total | 148.00 | 7 | | |
The critical value for the F table with df = (1, 5) at the 0.05 level of significance is 6.61.
Conclusion:
The value of test statistic is 0.03.
The critical value is 6.61.
The test statistic value is less than the critical value.
The test statistic value does not fall in the critical region.
Hence the null hypothesis is retained.
Thus, adding exercise to the regression equation did not significantly contribute to predictions in Y (health).
Hence, the predictor variable daily intake of fat significantly contributed to predictions in Y (health) when added to the multiple regression equation.
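Both relative-contribution tests can be summarized side by side. This is a sketch under the same assumptions as the solution: $SS_Y = 148$, a full-model regression sum of squares of 126.22, a residual mean square of 4.36, and a critical value of 6.61 for df = (1, 5). The dictionary layout and names are illustrative. Because the squared correlations are not rounded before use, the sum of squares for exercise comes out near 0.13 rather than the tabled 0.14; the F value and the decision are unaffected.

```python
SS_Y = 148.0           # total sum of squares
SS_REGRESSION = 126.22  # regression SS with both predictors
MS_RESIDUAL = 4.36      # residual mean square from the F table
CRITICAL_F = 6.61       # tabled critical value, df = (1, 5), alpha = 0.05

# For each tested predictor: correlation of the OTHER predictor with health
r_other = {"fat": -0.789, "exercise": 0.923}

results = {}
for predictor, r in r_other.items():
    ss_change = SS_REGRESSION - r ** 2 * SS_Y   # SS added by the tested predictor
    f_change = ss_change / MS_RESIDUAL          # df = 1, so MS change equals SS change
    results[predictor] = (f_change, f_change > CRITICAL_F)

for predictor, (f_change, significant) in results.items():
    decision = "reject H0" if significant else "retain H0"
    print(f"{predictor}: F change = {f_change:.2f} -> {decision}")
```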