## What is Regression Analysis?

Regression analysis is a statistical method that estimates the relationship between a dependent variable and one or more independent variables. In simple terms, the dependent variable is called the outcome variable and the independent variables are called predictors. Regression analysis is one of the methods used to find trends in data: each predictor variable provides information about the associated dependent variable for a particular outcome.

## Types of Regression Analysis

There are three major types of regression analysis:

- Linear regression
- Multiple linear regression
- Non-linear regression

## How Does Regression Analysis Work?

Regression analysis works by estimating the change in the dependent variable that is associated with a change in one independent variable while all other independent variables are held constant. In this way, we can learn the effect of each independent variable separately.
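The idea of holding the other variables constant can be illustrated with a small sketch. The two-predictor model and its coefficients below are hypothetical values chosen only for illustration; they do not come from any data in this article.

```python
def predict(x1: float, x2: float,
            b0: float = 1.0, b1: float = 2.0, b2: float = 3.0) -> float:
    """Hypothetical fitted model: y = b0 + b1*x1 + b2*x2."""
    return b0 + b1 * x1 + b2 * x2


# Increase x1 by one unit while x2 stays fixed at 10.
change_from_x1 = predict(5, 10) - predict(4, 10)

# Increase x2 by one unit while x1 stays fixed at 4.
change_from_x2 = predict(4, 11) - predict(4, 10)

print(change_from_x1)  # equals b1
print(change_from_x2)  # equals b2
```

In this sketch, the change in the prediction equals the coefficient of the variable that was moved, which is how each coefficient is read as the effect of its own predictor.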

## Formula

$Y = mX + b$

where,

- Y is the dependent variable
- m is the slope
- X is the independent variable
- b is the constant (the intercept)
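As a quick illustration of the formula, the sketch below evaluates the line for one value of X. The slope and intercept used here are hypothetical numbers, not values estimated from data.

```python
def predict_line(x: float, m: float, b: float) -> float:
    """Return Y = m*X + b for a single value of X."""
    return m * x + b


# Hypothetical slope (m = 2) and intercept (b = 1), for illustration only.
y_value = predict_line(3.0, m=2.0, b=1.0)
print(y_value)  # 2*3 + 1 = 7.0
```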

### Linear Regression

Linear regression is a model that finds the relationship between two variables by fitting a straight-line equation to the data. One variable is independent, and the other is dependent.

For example,

Every year, five randomly selected students take a math aptitude test before they begin their statistics course.

- Find the regression equation.
- If a student scores 80 marks on the aptitude test, what percentage is that student expected to score in the statistics exam?
- Is the regression equation a good fit?

**Solution**

Step 1:

Find the regression equation from the data below.

| Student roll no. | Test score (X) | Statistics percentage (Y) | Deviation of X | Deviation of Y |
|---|---|---|---|---|
| 1 | 92 | 82 | 15 | 6 |
| 2 | 82 | 92 | 5 | 16 |
| 3 | 79 | 70 | 2 | -6 |
| 4 | 70 | 65 | -7 | -11 |
| 5 | 60 | 70 | -17 | -6 |
| Sum | 383 | 379 | | |
| Mean (rounded) | 77 | 76 | | |

From the above table, we have calculated the deviations of X and Y using the formulas $(X - \overline{x})$ and $(Y - \overline{y})$, where $\overline{x} = 77$ and $\overline{y} = 76$ are the means rounded to the nearest whole number.
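The deviations in the table can be reproduced with a short sketch. The variable names are illustrative, and the means are rounded to whole numbers to match the table.

```python
test_scores = [92, 82, 79, 70, 60]    # X: aptitude test scores
stat_percent = [82, 92, 70, 65, 70]   # Y: statistics percentages

# Means rounded to whole numbers, as in the table (76.6 and 75.8 before rounding).
mean_x = round(sum(test_scores) / len(test_scores))
mean_y = round(sum(stat_percent) / len(stat_percent))

# Deviation of each observation from its (rounded) mean.
dev_x = [x - mean_x for x in test_scores]
dev_y = [y - mean_y for y in stat_percent]

print(mean_x, mean_y)  # 77 76
print(dev_x)           # [15, 5, 2, -7, -17]
print(dev_y)           # [6, 16, -6, -11, -6]
```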

Step 2:

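Next, find the square of each deviation. The squared deviations of X sum to $\sum (X - \overline{x})^2 = 592$, and the squared deviations of Y sum to $\sum (Y - \overline{y})^2 = 485$.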

Step 3:

Now find the product of the deviations of X and Y for each student. These products sum to $\sum (X - \overline{x})(Y - \overline{y}) = 337$.
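Steps 2 and 3 can be checked with the sketch below, which starts from the deviations listed in the table. The variable names are illustrative.

```python
dev_x = [15, 5, 2, -7, -17]    # deviations of X from the table
dev_y = [6, 16, -6, -11, -6]   # deviations of Y from the table

# Step 2: sums of squared deviations.
sum_sq_x = sum(dx * dx for dx in dev_x)
sum_sq_y = sum(dy * dy for dy in dev_y)

# Step 3: sum of the products of paired deviations.
sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

print(sum_sq_x)  # 592
print(sum_sq_y)  # 485
print(sum_xy)    # 337
```

These three sums are the only quantities needed for the slope and for the coefficient of determination in this example.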

The linear regression equation has the form $\hat{y} = b_0 + b_1 x \quad (1)$, where $\hat{y}$ is the predicted value of Y. To do a regression analysis, we need the values of $b_0$ and $b_1$, so we solve for these two coefficients:

$b_1 = \frac{\sum \left[ \left( X - \overline{x} \right) \left( Y - \overline{y} \right) \right]}{\sum \left[ \left( X - \overline{x} \right)^2 \right]}$

$b_1 = \frac{337}{592}$

$b_1 = 0.569$

Now that we know $b_1$, we use the fact that the regression line in equation (1) passes through the point of means $(\overline{x}, \overline{y})$, which gives

$\begin{array}{l} b_0 = \overline{y} - b_1 \times \overline{x} \\ b_0 = 76 - 0.569 \times 77 \\ b_0 = 32.187 \end{array}$
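The slope and intercept can be recomputed from the sums with a brief sketch. It rounds the slope to three decimals and uses the rounded means 77 and 76, as the worked solution does; the variable names are illustrative.

```python
sum_xy = 337             # sum of products of deviations
sum_sq_x = 592           # sum of squared deviations of X
mean_x, mean_y = 77, 76  # rounded means used in the worked solution

# Slope: sum of products divided by the sum of squared X deviations.
b1 = round(sum_xy / sum_sq_x, 3)

# Intercept: the fitted line passes through the point of means.
b0 = mean_y - b1 * mean_x

print(b1)            # 0.569
print(round(b0, 3))  # 32.187
```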

The regression equation is $\hat{y} = 32.187 + 0.569x$.

If a student scores 80 marks, then his or her predicted statistics percentage is $\begin{array}{l}\hat{y} = b_0 + b_1 x \\ \hat{y} = 32.187 + 0.569 \times 80 \\ \hat{y} = 77.707\end{array}$
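The prediction step can be written as a small function. The function name is hypothetical, and the default coefficients are the values obtained in the worked solution.

```python
def predict_statistics(aptitude_score: float,
                       b0: float = 32.187, b1: float = 0.569) -> float:
    """Predicted statistics percentage from the fitted line b0 + b1*x."""
    return b0 + b1 * aptitude_score


predicted = predict_statistics(80)
print(round(predicted, 3))  # 77.707
```

The same function can be called with any other aptitude score, although predictions far outside the observed range of 60 to 92 should be treated with caution.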

To check whether the regression is a good fit, we find the coefficient of determination.

$R^2 = \left\{ \left( 1/N \right) \times \sum \left[ \left( X_i - \overline{X} \right) \times \left( Y_i - \overline{Y} \right) \right] / \left( \sigma_x \times \sigma_y \right) \right\}^2$

Now we calculate the standard deviations of x and y.

$\sigma_x = \sqrt{\sum \left[ \left( X_i - \overline{X} \right)^2 \right] / N}$

$\sigma_x = \sqrt{592/5}$

$\sigma_x = 10.88$

$\sigma_y = \sqrt{\sum \left[ \left( Y_i - \overline{Y} \right)^2 \right] / N}$

$\sigma_y = \sqrt{485/5}$

$\sigma_y = 9.84$

$\begin{array}{l} R^2 = \left[ \left( 1/5 \right) \times 337 / \left( 10.88 \times 9.84 \right) \right]^2 \\ R^2 = \left[ 67.4 / 107.059 \right]^2 \\ R^2 = 0.39 \end{array}$
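The coefficient of determination can be verified with the sketch below, which uses the same sums as the worked solution but does not round the intermediate values. Without intermediate rounding the result is about 0.396, which the worked solution reports as 0.39. The variable names are illustrative.

```python
import math

n = 5
sum_xy = 337     # sum of products of deviations
sum_sq_x = 592   # sum of squared deviations of X
sum_sq_y = 485   # sum of squared deviations of Y

# Population standard deviations (divide by N, as in the formulas above).
sigma_x = math.sqrt(sum_sq_x / n)
sigma_y = math.sqrt(sum_sq_y / n)

# Correlation coefficient, then its square.
r = (sum_xy / n) / (sigma_x * sigma_y)
r_squared = r ** 2

print(round(sigma_x, 2))    # 10.88
print(round(sigma_y, 2))    # 9.85
print(round(r_squared, 3))  # 0.396
```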

The coefficient of determination is 0.39, which means that about 39% of the variation in statistics percentage can be explained by its relationship to the math aptitude scores. This could be considered a good fit in the sense that it would improve a tutor's ability to predict student performance in the statistics class.

## Multiple Regression

Multiple regression is similar to linear regression, but it requires two or more predictors. For example, a student's IQ level could be assessed using an aptitude test, exam marks, and general knowledge.

**Formula**

$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + c$
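A minimal sketch of how the coefficients of a multiple regression can be estimated is shown below. It solves the least-squares normal equations in plain Python, which is an assumption made to keep the example self-contained; in practice a statistical package would normally be used. The data are synthetic, generated without noise from hypothetical coefficients $b_0 = 1$, $b_1 = 2$, $b_2 = 3$, so the fit should recover those values.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bn] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic predictors (x1, x2) and noise-free outcomes, for illustration only.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
outcomes = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]

coefficients = fit_multiple_regression(predictors, outcomes)
print([round(c, 6) for c in coefficients])  # close to [1.0, 2.0, 3.0]
```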

## Non-Linear Regression

Non-linear regression relates two variables, as simple linear regression does, but in a curved (non-linear) form. The curve is a function of the X variable, which we use to predict the value of Y. Non-linear regression is more complicated because the fitted function is found through iterations. Examples of such methods include the Gauss-Newton method and other iterative methods. Non-linear regression can be used, for example, to predict population growth over time.

**Formula**

$Y = a_0 + b_1 x_1^2$
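The particular curve above is linear in its coefficients $a_0$ and $b_1$, so it can be fitted without iteration by substituting $z = x_1^2$ and applying the simple linear regression formulas to $z$. The sketch below takes that shortcut; it is not an implementation of an iterative method such as Gauss-Newton, which is needed for models that are non-linear in their parameters. The data are synthetic, generated from hypothetical values $a_0 = 2$ and $b_1 = 0.5$.

```python
def fit_quadratic_term(xs, ys):
    """Fit Y = a0 + b1 * x**2 by ordinary least squares on z = x**2."""
    zs = [x * x for x in xs]
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    # Same slope and intercept formulas as simple linear regression.
    b1 = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
          / sum((z - mean_z) ** 2 for z in zs))
    a0 = mean_y - b1 * mean_z
    return a0, b1


# Synthetic, noise-free data for illustration only.
xs = [0, 1, 2, 3, 4]
ys = [2 + 0.5 * x * x for x in xs]

a0, b1 = fit_quadratic_term(xs, ys)
print(round(a0, 6), round(b1, 6))  # 2.0 0.5
```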

## How to Select the Best Regression Model?

Testing a regression model by hand can be difficult, so we use software to find the best fit for the data. Statistical software that can be used includes SPSS, Minitab, Statistica, R, and Python.

There are two criteria to check when choosing the best regression model for the data.

- Adjusted $R^2$ and predicted $R^2$ values

If the $R^2$ value is high (nearly 60%), then the regression model is a good fit for the data. For some problems involving human behavior, a low $R^2$ value may still be a good fit, depending on the data.

- P-value

In regression, the lower the p-value, the more statistically significant the result.
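The adjusted $R^2$ mentioned above can be computed from the ordinary $R^2$ with the standard formula $1 - (1 - R^2)(n - 1)/(n - p - 1)$, where $n$ is the number of observations and $p$ is the number of predictors. The sketch below applies it to the student example ($R^2 = 0.39$, $n = 5$, $p = 1$); the function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof


adj = adjusted_r_squared(0.39, n_obs=5, n_predictors=1)
print(round(adj, 3))  # 0.187
```

With only five observations the adjustment is large, which is one reason to be cautious about judging fit from $R^2$ alone on a small sample.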

## Advantages of Regression Analysis

Regression analysis is usually used to understand the impact of one variable on another. It is particularly useful for small-scale industries that want to predict their future share and identify which factors influence the company's shares. Important applications of regression analysis include predictive analysis, operational efficiency, and decision support.

## Disadvantages of Regression Analysis

The results are easily affected by outliers, and linear regression is limited to linear relationships. The calculation and analysis can involve lengthy and complicated steps. Regression analysis cannot be used when the phenomenon under study is qualitative.

## Applications of Regression Analysis

With the following applications, regression analysis can improve the efficiency of small-scale industries:

- Predictive analysis
- Operational efficiency
- Supporting decisions
- Correcting errors
- New insights

## Difference Between Correlation and Regression

- Correlation measures the degree of relationship between the variables. Regression describes the nature of the relationship between the variables so that one can be predicted from the other.

- In correlation, there is no cause and effect between the variables. In regression, a cause-and-effect relationship between the variables is assumed.

- Correlation cannot be used to predict anything, but regression is a predictive tool.

- Coefficients are symmetrical in correlation, but in regression they are asymmetrical.

- The correlation coefficient is independent of changes in origin and scale, but regression coefficients are independent of a change in origin only and depend on a change in scale.
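The symmetry point can be illustrated with the sums from the student example (337, 592 and 485). In the sketch below, the correlation coefficient is the same whichever variable is treated as dependent, while the regression slope of Y on X differs from the slope of X on Y. The variable names are illustrative.

```python
import math

sum_xy = 337     # sum of products of deviations
sum_sq_x = 592   # sum of squared deviations of X
sum_sq_y = 485   # sum of squared deviations of Y

# Correlation treats X and Y symmetrically.
r = sum_xy / math.sqrt(sum_sq_x * sum_sq_y)

# Regression slopes depend on which variable is being predicted.
slope_y_on_x = sum_xy / sum_sq_x
slope_x_on_y = sum_xy / sum_sq_y

print(round(r, 3))             # 0.629
print(round(slope_y_on_x, 3))  # 0.569
print(round(slope_x_on_y, 3))  # 0.695
```

The product of the two slopes equals the square of the correlation coefficient, which ties the two measures together.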

## Context and Applications

This topic is significant in professional exams for both undergraduate and graduate courses, especially for:

- BBA
- B.COM
- M.COM
