# University GPA: correlation matrix and dummy regression questions

Question

3. The University Secretary wants to determine how the university grade point average (GPA, maximum 4.0) of a sample of students from the University depends on the following variables:

- HS: the student's high school GPA
- A: age of the student
- AS: achievement test score
- S: average number of lectures skipped each week
- M: gender (M=1 if the student is male, 0 otherwise)
- PC: computer ownership (PC=1 if the student owns a computer, 0 otherwise)
- D, B: means of transport to campus, out of drive, bicycle or walk (D=1 if the student drives, 0 otherwise; B=1 if the student bicycles, 0 otherwise)
- F, HR, MR: subject major, out of finance, human resource, marketing and accounting (F=1 if the student majors in finance, 0 otherwise; HR=1 if the student majors in human resource, 0 otherwise; MR=1 if the student majors in marketing, 0 otherwise)

Use the correlation matrix and dummy regression output to answer the questions.
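
| | GPA | HS | A | AS | S | M | PC | D | B | F | HR | MR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPA | 1.00 | | | | | | | | | | | |
| HS | 0.41 | 1.00 | | | | | | | | | | |
| A | -0.02 | -0.26 | 1.00 | | | | | | | | | |
| AS | 0.21 | 0.35 | -0.08 | 1.00 | | | | | | | | |
| S | -0.26 | -0.09 | -0.08 | 0.12 | 1.00 | | | | | | | |
| M | -0.08 | -0.21 | 0.04 | 0.18 | 0.20 | 1.00 | | | | | | |
| PC | 0.22 | 0.04 | -0.09 | 0.04 | -0.21 | -0.07 | 1.00 | | | | | |
| D | -0.11 | -0.19 | 0.27 | -0.20 | 0.26 | -0.08 | 0.02 | 1.00 | | | | |
| B | 0.08 | 0.14 | -0.05 | 0.16 | -0.13 | 0.13 | -0.10 | -0.38 | 1.00 | | | |
| F | 0.08 | 0.12 | -0.22 | 0.18 | 0.06 | 0.04 | 0.08 | -0.08 | -0.11 | 1.00 | | |
| HR | 0.08 | 0.17 | -0.49 | 0.08 | 0.06 | 0.05 | -0.04 | -0.11 | 0.07 | -0.12 | 1.00 | |
| MR | -0.10 | -0.19 | 0.37 | -0.11 | -0.05 | 0.02 | 0.05 | 0.08 | 0.01 | -0.15 | -0.79 | 1.00 |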

a) Which 2 pairs of variables involving the regressand (GPA) are most strongly correlated?
b) Which 3 pairs of regressors are most multicollinear?
c) Identify the 3 pairs of variables that are most strongly correlated.
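
Questions a) to c) all come down to ranking entries of the matrix by absolute value. The following is a minimal Python sketch of that ranking, not a prescribed method: the numbers are copied from the correlation matrix above, and the helper name `top_pairs` is a hypothetical choice made for this illustration.

```python
# Variable names in the order used by the correlation matrix.
names = ["GPA", "HS", "A", "AS", "S", "M", "PC", "D", "B", "F", "HR", "MR"]

# Lower triangle of the correlation matrix, row by row, as given in the question.
lower = [
    [1.00],
    [0.41, 1.00],
    [-0.02, -0.26, 1.00],
    [0.21, 0.35, -0.08, 1.00],
    [-0.26, -0.09, -0.08, 0.12, 1.00],
    [-0.08, -0.21, 0.04, 0.18, 0.20, 1.00],
    [0.22, 0.04, -0.09, 0.04, -0.21, -0.07, 1.00],
    [-0.11, -0.19, 0.27, -0.20, 0.26, -0.08, 0.02, 1.00],
    [0.08, 0.14, -0.05, 0.16, -0.13, 0.13, -0.10, -0.38, 1.00],
    [0.08, 0.12, -0.22, 0.18, 0.06, 0.04, 0.08, -0.08, -0.11, 1.00],
    [0.08, 0.17, -0.49, 0.08, 0.06, 0.05, -0.04, -0.11, 0.07, -0.12, 1.00],
    [-0.10, -0.19, 0.37, -0.11, -0.05, 0.02, 0.05, 0.08, 0.01, -0.15, -0.79, 1.00],
]

# Map every off-diagonal pair (row variable, column variable) to its correlation.
pairs = {
    (names[i], names[j]): lower[i][j]
    for i in range(len(names))
    for j in range(i)
}


def top_pairs(candidates, k):
    """Return the k pairs with the largest absolute correlation (hypothetical helper)."""
    ranked = sorted(candidates.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Pairs that involve the regressand (GPA) versus pairs among regressors only.
with_gpa = {pair: r for pair, r in pairs.items() if "GPA" in pair}
among_regressors = {pair: r for pair, r in pairs.items() if "GPA" not in pair}

top_with_gpa = top_pairs(with_gpa, 2)
top_regressors = top_pairs(among_regressors, 3)

print("Most correlated with GPA:", top_with_gpa)
print("Most correlated regressor pairs:", top_regressors)
```

Ranking by absolute value matters here because a strong negative correlation signals as much linear association, and as much potential multicollinearity, as a strong positive one.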

The estimated equation by OLS is:
Residual degrees of freedom (df) = 129, TSS = 19.41, ESS = 14.03.
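
The summary figures above are enough to compute the usual goodness-of-fit measures. The sketch below makes two assumptions that the question does not state: that the regression includes an intercept plus all eleven listed regressors, and that "ESS" may mean either the explained or the error (residual) sum of squares, since textbooks differ. It computes both readings rather than picking one. The function name `fit_stats` is hypothetical.

```python
tss = 19.41      # total sum of squares, from the question
ess = 14.03      # "ESS" as given; its meaning depends on the textbook convention
df_resid = 129   # residual degrees of freedom, from the question
k = 11           # assumed number of regressors: HS, A, AS, S, M, PC, D, B, F, HR, MR


def fit_stats(r2, k, df_resid):
    """Return sample size, R^2, adjusted R^2 and the overall F statistic."""
    n = df_resid + k + 1                      # assumes an intercept is included
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid
    f_stat = (r2 / k) / ((1 - r2) / df_resid)
    return {"n": n, "r2": r2, "adj_r2": adj_r2, "f": f_stat}


# Reading 1: ESS is the explained sum of squares, so R^2 = ESS / TSS.
stats_if_explained = fit_stats(ess / tss, k, df_resid)

# Reading 2: ESS is the error (residual) sum of squares, so R^2 = 1 - ESS / TSS.
stats_if_residual = fit_stats(1 - ess / tss, k, df_resid)

for label, stats in [("explained", stats_if_explained), ("residual", stats_if_residual)]:
    print(label, {name: round(value, 4) for name, value in stats.items()})
```

Whichever reading applies, the overall F statistic is compared against an F distribution with 11 and 129 degrees of freedom under the assumptions above.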

Values in parentheses (under the regression equation) are standard errors and those in square brackets are the variance inflation factors (VIFs).
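
For reference, the standard errors and VIFs are conventionally used as follows. The notation is an assumption, since the question does not define symbols: $\hat{\beta}_j$ is the estimated coefficient on regressor $j$, and $R_j^2$ is the $R^2$ from regressing regressor $j$ on all the other regressors.

```latex
t_j = \frac{\hat{\beta}_j}{\operatorname{se}(\hat{\beta}_j)},
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

A $t_j$ that is large in absolute value, relative to the critical value for 129 degrees of freedom, indicates a coefficient statistically different from zero. A larger $\mathrm{VIF}_j$ indicates that regressor $j$ is more nearly a linear combination of the other regressors.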

d) Determine the goodness of fit of the regression model.
e) Determine whether the coefficient of high school GPA is statistically different from zero.
f) Specify the whole regression model and identify 2 relevant error terms.
g) Interpret the estimated coefficient for a student who bicycles to campus.
h) Using only the variance inflation factor (VIF), which one of the variables in the pairs identified as multicollinear may be dropped from the regression, and why?
i) Suppose that two University students, A and B, are both aged 20, have the same achievement test score, the same average number of lectures skipped and the same gender, both own a PC, both drive, and both major in the same subject, but Student A’s high school GPA is 2.5 points higher. What is the predicted difference in college GPA between these two students? What is driving this difference?
j) Interpret the coefficient of the age of a student.
k) What is the predicted difference between a 19-year-old male student who bicycles to campus, owns a computer, has a high school GPA of 3.5, an achievement score of 27, skipped 1 lecture, and majors in HR, and a 21-year-old female student who walks to campus, has no computer, majors in accounting, but has the same high school GPA of 3.5, an achievement score of 27 and skipped 1 lecture? What is causing the difference?
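
As a sketch of how questions f), i) and k) can be set up, the model and the two predicted differences can be written symbolically. The coefficient labels $\beta_0, \dots, \beta_{11}$ and the disturbance $u_i$ are notation assumed for this illustration. Walking and accounting are taken as the base categories because no dummy is defined for them. The numerical answers require the estimated coefficients from the regression output.

```latex
\begin{aligned}
GPA_i &= \beta_0 + \beta_1 HS_i + \beta_2 A_i + \beta_3 AS_i + \beta_4 S_i
        + \beta_5 M_i + \beta_6 PC_i \\
      &\quad + \beta_7 D_i + \beta_8 B_i + \beta_9 F_i
        + \beta_{10} HR_i + \beta_{11} MR_i + u_i \\[6pt]
\text{(i)}\quad \widehat{GPA}_{A} - \widehat{GPA}_{B}
      &= 2.5\,\hat{\beta}_1 \\[4pt]
\text{(k)}\quad \widehat{GPA}_{\text{male}} - \widehat{GPA}_{\text{female}}
      &= (19 - 21)\,\hat{\beta}_2 + \hat{\beta}_5 + \hat{\beta}_6
        + \hat{\beta}_8 + \hat{\beta}_{10}
\end{aligned}
```

In (i) every regressor except HS is identical for the two students, so all other terms cancel. In (k) the terms in HS, AS and S cancel, leaving only age, gender, computer ownership, bicycling and the HR major.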