Statistical Techniques in Business and Economics, 16th Edition
ISBN: 9780078020520
Author: Douglas A. Lind, William G Marchal, Samuel A. Wathen
Publisher: McGraw-Hill Education

Question
Chapter 14, Problem 34DE

a.

To determine

Find the multiple regression equation.

Explain each of the variables.

Explain whether it is surprising that the regression coefficient for ERA is negative.

Explain whether the number of wins is affected by whether the team plays in the National or the American League.

a.

Expert Solution

Answer to Problem 34DE

The multiple regression equation is as follows:

$\hat{y} = 82.5 + 269x_1 - 0.0463x_2 + 0.059x_3 - 19.47x_4 + 0.0402x_5 + 0.61x_6$

Explanation of Solution

Let $y$ be the dependent variable and $x_1, x_2, x_3, x_4, x_5, x_6$ be the independent variables.

Here, $y$ is the number of games won. The team batting average (BA), number of stolen bases (SB), number of errors committed (Errors), team earned run average (ERA), number of home runs (HR), and league (American or National) are denoted by $x_1$, $x_2$, $x_3$, $x_4$, $x_5$, and $x_6$, respectively.

Step by step procedure to obtain the regression equation using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Wins.
  • Under Continuous predictors, enter the columns of BA, SB, Errors, ERA, HR, and League.
  • Click OK.
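
The fit that the menu steps above request can be sketched outside MINITAB. The following is a minimal illustration of ordinary least squares through the normal equations, written in plain Python. The function name `fit_ols` and the toy data are assumptions made for illustration only; they are not the baseball data and not MINITAB's internal method.

```python
def fit_ols(rows, y):
    """Return least-squares coefficients [b0, b1, ..., bk] for y regressed on rows."""
    # Prepend a 1 to every row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Build the augmented normal-equations matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot_row = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        if abs(pivot) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        A[col] = [v / pivot for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    # The last column now holds the solution.
    return [A[i][p] for i in range(p)]


# Toy data generated from y = 1 + 2*a + 3*b (illustrative only, not the baseball data).
toy_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
toy_y = [1 + 2 * a + 3 * b for a, b in toy_rows]
coefficients = fit_ols(toy_rows, toy_y)
print([round(c, 6) for c in coefficients])
```

With the real data, each row would hold one team's BA, SB, Errors, ERA, HR, and League values, and `y` would hold that team's wins.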

Output using MINITAB software is given below:

[Figure: MINITAB regression output for Wins on BA, SB, Errors, ERA, HR, and League]

From the above output, the multiple regression equation is as follows:

$\hat{y} = 82.5 + 269x_1 - 0.0463x_2 + 0.059x_3 - 19.47x_4 + 0.0402x_5 + 0.61x_6$

For each additional point (0.001) in the team batting average, the number of wins increases by 0.269, holding the other variables constant. For each additional stolen base, the number of wins decreases by 0.0463. For each additional error committed by the team, the number of wins increases by 0.059. For each one-run increase in ERA, the number of wins decreases by 19.47. Each additional home run increases the number of wins by 0.0402. The variable League is coded 0 for the National League and 1 for the American League, so a team in the American League is predicted to win 0.61 more games than an otherwise identical National League team.
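
The coefficient interpretations above amount to multiplying a coefficient by the change in its variable while the other variables stay fixed. A small sketch of that arithmetic follows; the coefficients are copied from the fitted equation, while the helper name `change_in_wins` and the example changes (10 stolen bases, half a run of ERA) are hypothetical choices for illustration.

```python
# Coefficients copied from the fitted equation in Part (a).
COEFFICIENTS = {
    "BA": 269.0,
    "SB": -0.0463,
    "Errors": 0.059,
    "ERA": -19.47,
    "HR": 0.0402,
    "League": 0.61,  # 0 = National, 1 = American
}


def change_in_wins(variable, delta):
    """Predicted change in wins when one predictor moves by `delta`
    and all other predictors are held fixed."""
    return COEFFICIENTS[variable] * delta


ba_point = change_in_wins("BA", 0.001)    # one batting-average point
ten_steals = change_in_wins("SB", 10)     # hypothetical: ten more stolen bases
half_run_era = change_in_wins("ERA", 0.5) # hypothetical: ERA rises by half a run
print(round(ba_point, 3), round(ten_steals, 3), round(half_run_era, 3))
```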

The negative regression coefficient for ERA indicates that as the team earned run average decreases, the number of wins increases, and vice versa. A lower ERA means the team allows fewer earned runs, so a negative relationship between ERA and the number of wins is expected. This is not surprising.

Consider the level of significance α = 0.05. The p-value corresponding to League is 0.839, which is greater than 0.05. Hence, there is no significant relationship between the dependent variable Wins and the independent variable League. Therefore, the number of wins is not affected by whether the team plays in the National or the American League.

b.

To determine

Find the coefficient of determination (R-square).

b.

Expert Solution

Answer to Problem 34DE

The coefficient of determination is 72.25%.

Explanation of Solution

From the output in Part (a), the coefficient of determination (R-square) is 72.25%.

c.

To determine

Make the correlation matrix.

Identify the independent variables that have strong or weak correlations with the dependent variable.

Explain whether there is any problem of multicollinearity.

c.

Expert Solution

Explanation of Solution

Multicollinearity:

Multicollinearity occurs in a multiple regression model when two or more independent variables are highly correlated.

Multicollinearity inflates the standard errors, so the partial regression coefficients cannot be estimated precisely. It also makes it difficult to assess the relative importance of the individual independent variables.

Step by step procedure to obtain the correlation matrix using MINITAB software is given below:

  • Choose Stat > Basic Statistics > Correlation.
  • Select the columns of Wins, BA, SB, Errors, ERA, HR, and League under Variables.
  • Click OK.
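
A correlation matrix like the one requested above can be sketched in plain Python using the Pearson correlation coefficient. The function names and the three toy columns `u`, `v`, and `w` are hypothetical and serve only to show the computation; they are not the baseball data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every pair of column names to its Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Toy columns (illustrative only).
toy_columns = {
    "u": [1, 2, 3, 4, 5],
    "v": [2, 4, 6, 8, 10],  # exact multiple of u
    "w": [5, 3, 4, 1, 2],
}
matrix = correlation_matrix(toy_columns)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

Reading the row for the dependent variable shows which predictors are strongly related to it, and reading the remaining entries shows whether any pair of predictors is highly correlated.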

The output obtained using Minitab is as follows:

[Figure: MINITAB correlation matrix output]

From the output, ERA has a strong correlation with the dependent variable, the number of wins, whereas batting average, SB, Errors, and HR have weak correlations with the number of wins. The correlation between League and the number of wins is –0.028, so League does not appear to have any linear relationship with the number of wins.

There are no high correlations among the independent variables. Hence, multicollinearity does not appear to be a problem.

d.

To determine

Conduct a global test on the set of the independent variables and interpret.

d.

Expert Solution

Explanation of Solution

The null and alternative hypotheses are stated below:

Null hypothesis:

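$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = 0$

That is, all the regression coefficients are equal to zero.

Alternative hypothesis:

$H_1: \beta_i \neq 0$ for at least one $i = 1, 2, \ldots, 6$

That is, at least one of the regression coefficients is not equal to zero.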

From Part (a), the F test statistic value is 9.98 and the p-value is 0.000.
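
As a consistency check, the global F statistic can be recovered from the coefficient of determination with F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below assumes n = 30 teams, which is not stated in the solution; k = 6 predictors and R² = 0.7225 come from Parts (a) and (b). The function name is hypothetical.

```python
def global_f_statistic(r_squared, n, k):
    """F statistic for the global test, computed from R-square.

    n is the number of observations and k the number of independent variables.
    """
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


# n = 30 is an assumption (one observation per team); R-square and k are from the solution.
f_value = global_f_statistic(0.7225, n=30, k=6)
print(round(f_value, 2))
```

Under that assumption the result agrees with the reported F value of 9.98.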

Decision rule:

  • If p-value ≤ α, then reject the null hypothesis.
  • Otherwise, fail to reject the null hypothesis.

Conclusion:

Consider the significance level as 0.05.

Here, the p-value(=0.000)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that not all of the regression coefficients are equal to zero; at least one independent variable has a significant relationship with the number of wins.

e.

To determine

Perform individual tests of each independent variable.

Explain whether any of the independent variables will be deleted.

e.

Expert Solution

Explanation of Solution

For independent variable $x_1$:

Let $\beta_1$ be the population regression coefficient of the independent variable $x_1$.

The null and alternative hypotheses are stated as follows:

Null hypothesis:

$H_0: \beta_1 = 0$

That is, there is no significant relationship between $y$ and $x_1$.

Alternative hypothesis:

$H_1: \beta_1 \neq 0$

That is, there is a significant relationship between $y$ and $x_1$.

For a test of an individual regression coefficient, the t test statistic is defined as

$t = \dfrac{b_i}{s_{b_i}}$, where $b_i$ is the $i$th regression coefficient and $s_{b_i}$ is the standard error of the $i$th regression coefficient.

From Part (a), the t statistic corresponding to $x_1$ is 2.38 and the p-value is 0.026.

Consider the level of significance α = 0.05.

Conclusion:

Here, the p-value is less than the level of significance.

That is, p-value (= 0.026) < α (= 0.05).

Therefore, reject the null hypothesis.

Thus, it can be concluded that there is a significant relationship between $y$ and $x_1$.

For independent variable $x_2$:

Let $\beta_2$ be the population regression coefficient of the independent variable $x_2$.

The null and alternative hypotheses are stated as follows:

Null hypothesis:

$H_0: \beta_2 = 0$

That is, there is no significant relationship between $y$ and $x_2$.

Alternative hypothesis:

$H_1: \beta_2 \neq 0$

That is, there is a significant relationship between $y$ and $x_2$.

From the output in Part (a), the t statistic corresponding to $x_2$ is –0.86 and the p-value is 0.40.

Conclusion:

Here, the p-value is greater than the level of significance.

That is, p-value (= 0.40) > α (= 0.05).

Hence, by the decision rule, fail to reject the null hypothesis.

Thus, it can be concluded that there is no significant relationship between $y$ and $x_2$.

For independent variable $x_3$:

Let $\beta_3$ be the population regression coefficient of the independent variable $x_3$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_3 = 0$

That is, there is no significant relationship between $y$ and $x_3$.

Alternative hypothesis:

$H_1: \beta_3 \neq 0$

That is, there is a significant relationship between $y$ and $x_3$.

From the output in Part (a), the t statistic corresponding to $x_3$ is 0.52 and the p-value is 0.609.

Conclusion:

Here, the p-value is greater than the level of significance.

That is, p-value (= 0.609) > α (= 0.05).

By the decision rule, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between $y$ and $x_3$.

For independent variable $x_4$:

Let $\beta_4$ be the population regression coefficient of the independent variable $x_4$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_4 = 0$

That is, there is no significant relationship between $y$ and $x_4$.

Alternative hypothesis:

$H_1: \beta_4 \neq 0$

That is, there is a significant relationship between $y$ and $x_4$.

From the output in Part (a), the t statistic corresponding to $x_4$ is –6.87 and the p-value is 0.000.

Conclusion:

Here, the p-value is less than the level of significance.

That is, p-value (= 0.000) < α (= 0.05).

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is a significant relationship between $y$ and $x_4$.

For independent variable $x_5$:

Let $\beta_5$ be the population regression coefficient of the independent variable $x_5$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_5 = 0$

That is, there is no significant relationship between $y$ and $x_5$.

Alternative hypothesis:

$H_1: \beta_5 \neq 0$

That is, there is a significant relationship between $y$ and $x_5$.

From the output in Part (a), the t statistic corresponding to $x_5$ is 0.84 and the p-value is 0.411.

Conclusion:

Here, the p-value is greater than the level of significance.

That is, p-value (= 0.411) > α (= 0.05).

By the decision rule, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between $y$ and $x_5$.

For independent variable $x_6$:

Let $\beta_6$ be the population regression coefficient of the independent variable $x_6$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_6 = 0$

That is, there is no significant relationship between $y$ and $x_6$.

Alternative hypothesis:

$H_1: \beta_6 \neq 0$

That is, there is a significant relationship between $y$ and $x_6$.

From the output in Part (a), the t statistic corresponding to $x_6$ is 0.21 and the p-value is 0.839.

Conclusion:

Here, the p-value is greater than the level of significance.

That is, p-value (= 0.839) > α (= 0.05).

By the decision rule, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between $y$ and $x_6$.

Except for team batting average and team ERA, none of the independent variables is significant. Therefore, the insignificant variables can be dropped from the model.
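
The six individual tests apply the same decision rule to six p-values, which can be summarized in a few lines. The p-values below are the ones reported in this part; the names `reject_null`, `significant`, and `dropped` are hypothetical. A p-value printed as 0.000 is a rounded value, so it is treated here as smaller than α.

```python
ALPHA = 0.05

# p-values reported for the individual t tests in Part (e).
P_VALUES = {
    "BA": 0.026,
    "SB": 0.40,
    "Errors": 0.609,
    "ERA": 0.000,  # reported rounded to three decimals
    "HR": 0.411,
    "League": 0.839,
}


def reject_null(p_value, alpha=ALPHA):
    """Decision rule: reject H0 when the p-value does not exceed alpha."""
    return p_value <= alpha


significant = [name for name, p in P_VALUES.items() if reject_null(p)]
dropped = [name for name, p in P_VALUES.items() if not reject_null(p)]
print("keep:", significant)
print("drop:", dropped)
```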

f.

To determine

Perform the regression analysis until only significant regression coefficients remain in the analysis and identify those variables.

f.

Expert Solution

Explanation of Solution

Step by step procedure to obtain the regression equation using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Wins.
  • Under Continuous predictors, enter the columns of BA and ERA.
  • Click OK.

Output using MINITAB software is given below:

[Figure: MINITAB regression output for Wins on BA and ERA]

From the above output, the reduced regression model is $\hat{y} = 84.3 + 296x_1 - 19.65x_4$.

The p-values for both independent variables are less than 0.05. Therefore, team BA and ERA each have a significant effect on the number of wins.
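
To show how the reduced model would be used, the sketch below evaluates it for one team. The coefficients come from the reduced equation above; the input values (a batting average of 0.260 and an ERA of 4.00) are hypothetical and do not describe any team in the data.

```python
def predict_wins(batting_average, era):
    """Predicted wins from the reduced model: 84.3 + 296*BA - 19.65*ERA."""
    return 84.3 + 296.0 * batting_average - 19.65 * era


# Hypothetical team used only to illustrate the calculation.
example_prediction = predict_wins(batting_average=0.260, era=4.00)
print(round(example_prediction, 2))
```

A residual for a real team would be its actual number of wins minus the value returned by `predict_wins`, which is the quantity examined in Parts (g) and (h).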

g.

To determine

Provide a histogram for the regression developed in Part (f).

Explain whether the residuals follow a normal distribution.

g.

Expert Solution

Explanation of Solution

Step by step procedure to obtain the histogram using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Wins.
  • Under Continuous predictors, enter the columns of BA and ERA.
  • Choose Graphs.
  • Under Residual plots, select Histogram of residuals.
  • Click OK.

The output obtained using Minitab is as follows:

[Figure: MINITAB histogram of residuals]

Assumption of normality from histogram:

  • The majority of the observations lie in the middle, centered on a mean of 0.
  • There are lower frequencies in the tails of the distribution.

According to the histogram, most of the residuals are centered near 0 and there are fewer observations in the tails. Thus, the distribution can be considered roughly symmetric.

Hence, the residuals are approximately normally distributed.

h.

To determine

Plot the residual plot for the regression developed in Part (f).

h.

Expert Solution

Explanation of Solution

Step by step procedure to obtain the residual plot using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Wins.
  • Under Continuous predictors, enter the columns of BA and ERA.
  • Choose Graphs.
  • Under Residual plots, select Residuals versus fits.
  • Click OK.

The output obtained using Minitab is as follows:

[Figure: MINITAB plot of residuals versus fitted values]

Assumptions for residual analysis of the regression model:

  • The plot of the residuals versus the fitted values should fall roughly in a horizontal band that is symmetric about the horizontal line at 0.
  • In a normal probability plot, the residuals should fall roughly along a straight line.
  • There should not be any observable pattern.

According to the residual plot, the points are randomly scattered and there is no particular pattern. The residuals appear haphazard and random.
