PROJECT PART C: Regression and Correlation Analysis
Math-533 Applied Managerial Statistics
Prof. Jeffrey Frakes
December 12, 2014
Jared D Stock
1. Generate a scatterplot for income ($1,000) versus credit balance ($), including the graph of the best fit line. Interpret.
This scatterplot shows income plotted against credit balance. Income tends to increase as credit balance increases, which indicates a positive relationship between the two variables. Because of this positive relationship, the best fit line (the linear regression line) fits the data quite well. The speculation can be strongly made that the
According to the fitted regression model, a customer with a $10,000 credit balance has a predicted income of $115,748.27.
In an attempt to improve the model, we fit a multiple regression model that predicts income from credit balance, years, and size.
11. Using MINITAB, run the multiple regression analysis using the variables credit balance, years, and size to predict income. State the equation for this multiple regression model.
Regression Analysis: Income($1000) versus Credit Balance($), Size, Years
The regression equation is
Income($1000) = - 13.2 + 0.0108 Credit Balance($) + 0.615 Size + 1.21 Years
Predictor                Coef    SE Coef      T      P
Constant              -13.186      3.608  -3.65  0.001
Credit Balance($)   0.0107922  0.0008184  13.19  0.000
Size                   0.6151     0.4178   1.47  0.148
Years                  1.2097     0.2322   5.21  0.000

S = 5.26121   R-Sq = 86.5%   R-Sq(adj) = 85.6%

Analysis of Variance
Source            DF      SS      MS      F      P
Regression         3  8171.7  2723.9  98.41  0.000
Residual Error    46  1273.3    27.7
Total             49  9445.0

Source               DF  Seq SS
Credit Balance($)     1  6052.7
Size                  1  1368.0
Years                 1   750.9
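As a sanity check on the MINITAB output above, the summary statistics can be recomputed from the sums of squares and degrees of freedom in the Analysis of Variance table. The following is a minimal Python sketch. The function name anova_summary is an illustrative choice and is not part of MINITAB.

```python
import math

def anova_summary(ss_regression, ss_error, df_regression, df_error):
    """Recompute R-Sq, R-Sq(adj), F, and S from ANOVA table entries.

    anova_summary is a hypothetical helper written for this report.
    """
    ss_total = ss_regression + ss_error
    df_total = df_regression + df_error
    ms_regression = ss_regression / df_regression  # mean square, regression
    ms_error = ss_error / df_error                 # mean square, residual error
    return {
        "r_sq": ss_regression / ss_total,
        "r_sq_adj": 1 - ms_error / (ss_total / df_total),
        "f": ms_regression / ms_error,
        "s": math.sqrt(ms_error),
    }

# Values taken from the Analysis of Variance table above.
summary = anova_summary(8171.7, 1273.3, 3, 46)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Because the table entries are rounded, the recomputed values may differ from the MINITAB values in the last digit.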
The fitted regression equation (income in $1,000):
Income = -13.186 + 0.0107922 * Credit Balance + 0.6151 * Size + 1.2097 * Years.
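To illustrate how the fitted equation is used, the sketch below evaluates it in Python with the coefficients reported by MINITAB. The customer values (a $4,000 credit balance, a household of 3, and 10 years) are hypothetical inputs chosen only for illustration and are not taken from the data set. The function name predict_income is likewise an assumption.

```python
# Coefficients from the MINITAB output (income is in $1,000).
INTERCEPT = -13.186
B_CREDIT_BALANCE = 0.0107922
B_SIZE = 0.6151
B_YEARS = 1.2097

def predict_income(credit_balance, size, years):
    """Return predicted income in $1,000 from the fitted equation."""
    return (INTERCEPT
            + B_CREDIT_BALANCE * credit_balance
            + B_SIZE * size
            + B_YEARS * years)

# Hypothetical customer, for illustration only.
predicted = predict_income(credit_balance=4000, size=3, years=10)
print(f"Predicted income: {predicted:.2f} ($1,000)")
```

For these hypothetical inputs the equation gives about 43.93, that is, a predicted income of roughly $43,930.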
12.
22) Which statement is true of a regression line that is superimposed on the scatter plot?
4) Use exponential regression (or find a constant ratio) to determine an equation for the data.
This report has been created in the framework of a student group project and the Georgia Institute of Technology does not officially sanction its content.
There is a positive linear relationship between the Income and Credit Balance variables: as income increases, credit balance also increases.
The last pair of variables I examined is Income and Size, shown in a scatter plot. Households of size 7 or 8 have the highest incomes, at $69,000 or more. The scatter plot shows a positive linear relationship.
Since the P-value (0.386) is greater than the significance level (0.05), we fail to reject the null hypothesis. The p-value is the probability of observing a result at least as extreme as the sample result if the null hypothesis is true; the significance level is the probability of rejecting a true null hypothesis.
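The decision rule applied here can be written as a short Python sketch. The function name decide and the default significance level of 0.05 are assumptions made for illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# P-value reported in the text.
print(decide(0.386))
```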
Problem 2.4: Consider the sample from a bank database shown in Figure 2.14; it was selected randomly from a larger database to be the training set. Personal Loan indicates whether a solicitation for a personal loan was accepted and is the response variable. A campaign is planned for a similar solicitation in the future, and the bank is looking for a model that will identify likely responders. Examine the data carefully and indicate what your next step would be.
AJ DAVIS is a department store chain, which has many credit customers. A sample of 50 credit customers is selected with data collected on location, income, credit balance, number of people and years lived in the house
the table below corresponds to a salary of $125,000. A copy of this data set can be found in the
credit balance is $5,678 and smallest at $1,864, resulting in a range of $3,814. The standard deviation is $924.11. This indicates that the credit balances of AJ Davis customers in the sample vary relatively little around the mean of $3,964.06. The median credit balance is $4,090.00; the median often provides a clearer picture of the overall
Pivot the budget line and derive two other points on the consumer’s demand for X.
One study looked at the effects that student debt has on college graduates and their chances of marrying after college. The researcher collected data by interviewing and surveying college graduates. The participants were men and women who received their bachelor's degrees in 1993. There were 9,410 participants, all of whom were single adults after college. The researcher found that men and women differ when it comes to marriage and loan debt (Bozick, 2014). Men who are high in debt are less likely to marry after college than those who are lower in debt. Women are more willing to marry while carrying large sums of student loan debt, which shows that men and women differ with respect to marriage (Bozick, 2014). The research of Dwyer, Hodson, and McCloud (2013) focuses on how debt affects men and women differently and how they take different paths. Some men and women may go to college after high school, and some may go straight into the job market. The researchers discovered that debt affects men and women differently when they try to pay it off. Men find it easier to pay off their debt because employers are more willing to hire them than women, which puts women at a disadvantage. Bozick (2014) also looked at the differences between men and women, but by looking at debt and marriage rates. He found that there is also a difference in how debt affects men and women. His findings were that a man is less likely to marry if he has high debt, whereas a woman with high debt is more willing to marry. Both researchers concluded that men and women differ when it comes to
This dataset contains customers' default payments in Taiwan. It has 30,000 observations and 24 features, all of which are real numbers. The response variable is a binary variable, default payment (Yes = 1, No = 0). The other 23 features are explanatory variables, including amount of the given credit (X1), history of past payment (X6-X11), amount of bill statement (X12-X17), amount of previous payment (X18-X23), and some demographic data. In predictions, we did not include some of the demographic variables
2. Develop estimated regression equations, first using annual income as the independent variable and then using household size as the independent variable. Which variable is the better predictor of the annual credit card charges? Discuss your findings.
This case study included information on a sample of fifty credit card accounts. This information, shown in table one, included household size, annual income, and the amount charged to the account. Scatter plots of the data were produced. Figure one shows household size versus amount charged. The graph shows a moderately strong positive linear relationship. The r squared is 0.56, so there is a correlation between household size and amount charged, but the amounts charged span a range for each household size.
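In a simple regression with one predictor, the correlation coefficient is the square root of r squared, with the sign of the slope. The Python sketch below applies this to the reported r squared of 0.56. The function name correlation_from_r_squared is an illustrative assumption, and the slope is taken to be positive because the text describes a positive relationship.

```python
import math

def correlation_from_r_squared(r_squared, slope=1.0):
    """Return r for a one-predictor regression, signed like the slope."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    return math.copysign(math.sqrt(r_squared), slope)

# r squared reported in the text; positive slope assumed.
r = correlation_from_r_squared(0.56)
print(f"r = {r:.3f}")
```

This gives r of about 0.748, which is consistent with describing the relationship as moderately strong.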