This exercise requires the use of a statistical software package. The cotton aphid poses a threat to cotton crops. The accompanying data on
appeared in the article “Estimation of the Economic Threshold of Infestation for Cotton Aphid” (Mesopotamia Journal of Agriculture [1982]: 71–75). Use the data to find the estimated regression equation and assess the utility of the multiple regression model.
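Since the exercise calls for statistical software, the following is a minimal sketch of the workflow in plain Python, not the textbook's solution. The data table is not reproduced on this page, so the numbers and the predictor names `x1` and `x2` below are made-up placeholders, and `fit_ols` is a hypothetical helper written for illustration. It fits the model by least squares and computes R² and the model utility statistic F = (R²/k) / ((1 − R²)/(n − (k + 1))).

```python
def fit_ols(X, y):
    """Least-squares fit of y on the columns of X; an intercept is added here.

    Returns (coefficients, R^2, model-utility F statistic).
    """
    n = len(y)
    rows = [[1.0] + list(r) for r in X]          # design matrix with intercept
    p = len(rows[0])                             # number of coefficients (k + 1)
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(p):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    coef = [m[i][p] / m[i][i] for i in range(p)]

    fitted = [sum(b * v for b, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / n
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    k = p - 1                                    # number of predictors
    r2 = 1 - ss_resid / ss_total
    f_stat = (r2 / k) / ((1 - r2) / (n - (k + 1)))
    return coef, r2, f_stat


# PLACEHOLDER data (made up for illustration, NOT the article's data).
X = [(20.0, 55.0), (22.0, 60.0), (25.0, 48.0),
     (27.0, 70.0), (30.0, 65.0), (33.0, 52.0)]   # (x1, x2) pairs
y = [40.0, 52.0, 61.0, 70.0, 95.0, 99.0]

coef, r2, f_stat = fit_ols(X, y)
print("estimated coefficients (intercept, x1, x2):", coef)
print("R^2:", r2)
print("model utility F:", f_stat)
```

To assess utility, the F value would be compared with the F distribution on k and n − (k + 1) degrees of freedom; a statistics package reports the corresponding P-value directly.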
Chapter 14 Solutions
Bundle: Introduction to Statistics and Data Analysis, 5th + WebAssign Printed Access Card: Peck/Olsen/Devore. 5th Edition, Single-Term
- Would I use the regression line to predict Y from X? And what is the pattern of the scatterplot?
- We have data on the lung capacity of persons and we wish to build a multiple linear regression model that predicts Lung Capacity from the predictors Age and Smoking Status. Age is a numeric variable, whereas Smoke is a categorical variable (0 if non-smoker, 1 if smoker). Here is the partial result from STATISTICA (N = 725):

  | | b* | Std.Err. of b* | b | Std.Err. of b |
  |---|---|---|---|---|
  | Intercept | | | 1.085725 | 0.182989 |
  | Age | 0.835543 | 0.021631 | 0.555396 | 0.014378 |
  | Smoke | -0.075120 | 0.021631 | -0.648588 | 0.186761 |

  Which of the following statements is absolutely false?
  A. The expected lung capacity of a smoker is expected to be 0.648588 lower than that of a non-smoker.
  B. The predictor variables Age and Smoker both contribute significantly to the model.
  C. For every one year that a person gets older, the lung capacity is expected to increase by 0.555396 units, holding smoker status constant.
  D. For every one unit increase in smoker status, lung capacity is expected to decrease by 0.648588 units, holding age constant.
- 4b) The data show the systolic and diastolic blood pressure of certain patients. Find the linear regression equation, using the first variable x (systolic) as the independent variable. Find the best predicted diastolic blood pressure for a patient with a systolic blood pressure (x) reading of 140. What is the correlation coefficient, r? Using a significance level of α = 0.05, is there a significant linear relationship between systolic and diastolic blood pressure?
  Blood pressure data:
  Systolic: 112 125 115 136 143 116 123 124
  Diastolic: 70 89 65 90 97 64 …
- The quality of the orange juice produced by a certain manufacturer is constantly monitored. Data collected on the sweetness index of an orange juice sample and the amount of water-soluble pectin for 24 production runs at a juice manufacturing plant are shown in the accompanying table. Suppose a manufacturer wants to use simple linear regression to predict the sweetness (y) from the amount of pectin (x). Find and interpret the coefficient of determination, r², and the coefficient of correlation, r. Find and interpret the coefficient of determination, r². Select the correct choice below and fill in the answer box within your choice. (Round to three decimal places as needed.)
  A. The coefficient of determination, r², is ____. Sample variations in the amount of water-soluble pectin explain 100r²% of the sample variation in the sweetness index using the least squares line.
  B. The coefficient of determination, r², is …
- Suppose a study wants to predict the market price of a certain species of turtle (Y) based on the independent variables indicated in the table. Based on the table, what is the equation of the multiple linear regression? (Round to two decimal places.)
  Market Price = 0.07 - 0.40*weight + 1.51*length + 1.41*width + 0.80*age
  Market Price = - 0.40*weight + 1.51*length + 1.41*width + 0.80*age
  Market Price = 0.07 + 0.40*weight + 1.51*length + 1.41*width + 0.80*age
  Market Price = 0.07 - 0.40 + weight + 1.51 + length + 1.41 + width + 0.80 + age
- The least-squares regression equation is y = 620.6x + 16,624, where y is the median income and x is the percentage of adults 25 years and older with at least a bachelor's degree in the region. The scatter diagram indicates a linear relation between the two variables with a correlation coefficient of 0.7004. Predict the median income of a region in which 30% of adults 25 years and older have at least a bachelor's degree.
- An oil exploration company wants to develop a statistical model to predict the cost of drilling a new well. One of the many variables thought to be an important predictor of the cost is the number of feet in depth that must be drilled to create the well. Consequently, the company decided to fit the simple linear regression model, where y = cost of drilling the new well (in $thousands) and x = number of feet drilled to create the well. Using data collected for a sample of n = 83 wells, the following results were obtained: ŷ = 10.5 + 16.20x. Give a practical interpretation of the estimate of the slope of the least squares line.
- The following result perspective in RapidMiner shows a multiple linear regression model. Based on the diagram, the model for our dependent variable Y is
  Predicted Y = (Insulation * 0.420) + (Temperature * 0.071) + (Avg_Age * 0.065) + (Home_Size * 0.311) + 7.589

  | Attribute | Coefficient | Std. Error | Std. Coefficient | Tolerance | t-Stat |
  |---|---|---|---|---|---|
  | Insulation | 3.323 | 0.420 | 0.164 | 0.431 | 7.906 |
  | Temperature | -0.869 | 0.071 | -0.262 | 0.405 | -12.222 |
  | Avg_Age | 1.968 | 0.065 | 0.527 | 0.491 | 30.217 |
  | Home_Size | 3.173 | 0.311 | 0.131 | 0.914 | 10.210 |
  | (Intercept) | 134.511 | 7.589 | ? | ? | 17.725 |

  True or False?
- A simple linear regression that describes the effect of individuals' cigarette smoking on health is given by Health = α + β * cigarettes + u, where Health is a measure of health on a scale of 1 to 5, where 1 means excellent health and 5 means poor health. So the bigger the number, the worse the health. cigarettes is the average number of cigarettes smoked per day; the unobservable u is an individual's health consciousness. Note that a health-conscious person tends to live a healthy life in general. What will happen to β if cigarettes is measured weekly rather than daily?
- 10) A regression was run to determine if there is a relationship between hours of TV watched per day (x) and number of situps a person can do (y). The results of the regression were: y = ax + b, a = -0.767, b = 31.009, r² = 0.609961, r = -0.781. Use this to predict the number of situps a person who watches 7.5 hours of TV can do (to one decimal place).
- A 10-year study conducted by the American Heart Association provided data on how age, blood pressure, and smoking relate to the risk of strokes (Dataset "Stroke"). Risk is interpreted as the probability (times 100) that a person will have a stroke over the next 10-year period. For the smoker variable, 1 indicates a smoker and 0 indicates a nonsmoker.
  a. Develop an estimated regression equation that can be used to predict the risk of stroke given the age and blood-pressure level.
  b. Consider adding two independent variables to the model developed in part (a), one for the interaction between age and blood-pressure level and the other for whether the person is a smoker. Develop an estimated regression equation using these four independent variables.
  c. At a 0.05 level of significance, test to see whether the addition of the interaction term and the smoker variable contributes significantly to the estimated regression equation developed in part (a).
  d. Refer to the model developed in part…
- Find the (a) explained variation, (b) unexplained variation, and (c) indicated prediction interval. In each case, there is sufficient evidence to support a claim of a linear correlation, so it is reasonable to use the regression equation when making predictions. Altitude and Temperature: Listed below are altitudes (thousands of feet) and outside air temperatures (°F) recorded by the author during Delta Flight 1053 from New Orleans to Atlanta. For the prediction interval, use a 95% confidence level with the altitude of 6327 ft (or 6.327 thousand feet).
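As a worked illustration of predicting from a reported regression line, the TV-and-situps question above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the coefficients quoted in that question; the function name `predict_situps` is a hypothetical label chosen here.

```python
# Coefficients as quoted in the situps question: y = a*x + b
a = -0.767   # slope: change in situps per additional hour of TV per day
b = 31.009   # intercept
r = -0.781   # reported correlation coefficient
r_squared_reported = 0.609961


def predict_situps(hours_tv):
    """Point prediction from the reported least-squares line."""
    return a * hours_tv + b


# -0.767 * 7.5 + 31.009 = 25.2565, which rounds to 25.3
print(round(predict_situps(7.5), 1))   # 25.3

# Consistency check: squaring r should reproduce the reported r^2,
# and the sign of r should match the sign of the slope.
print(round(r ** 2, 6) == r_squared_reported)
print((r < 0) == (a < 0))
```

The same substitution pattern applies to any fitted line on this page, provided the x value lies within the range of the data used to fit it.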