## What is Correlation?

Correlation describes a statistical relationship between two variables. It indicates the degree to which the variables move in relation to each other. When two sets of data are related to each other, there is a correlation between them.

## Correlation Coefficient

Correlation is measured by the correlation coefficient, which ranges from −1 to 1. The coefficient shows the strength and direction of the relationship, but it cannot indicate causation. That is, it does not show whether one variable is caused by the other. As a rough rule of thumb, a coefficient between −0.8 and 0.8 is often not treated as a strong correlation, although whether a correlation is statistically significant also depends on the sample size. For example, the correlation coefficient can be used to assess the relationship between the price of oil and the stock price of an oil-producing company.

## Types of Correlation

### Positive and Negative Correlation

Positive correlation: When there is a positive correlation between variables, it means that when one variable moves up or down, the other tends to move in the same direction. A correlation coefficient of exactly 1 indicates a perfect positive correlation.

Examples of positive correlation include:

• Price and supply of items
• Height and weight
• Income and spending on luxury items

Negative correlation: Under a negative correlation, the two variables move in opposite directions. If one goes up, the other tends to go down. A correlation coefficient of exactly −1 indicates a perfect negative correlation.

Examples of negative correlation include:

• Price and demand of items
• Temperature and sale of winter items

### Zero Correlation

If variables are found to have zero correlation, there is no linear relationship between the two variables. A non-linear relationship may still exist.

### Linear and Non-linear Correlation

Linear correlation: When the ratio of change between two variables is constant, so that the points fall along a straight line, it is known as linear correlation.

Non-linear correlation: When the ratio of change between two variables is not constant, so that the points follow a curve, it is known as non-linear correlation.

### Simple and Multiple Correlation

Simple correlation: The study of the relationship between two variables only is known as simple correlation. One variable is often treated as independent and the other as dependent, although the correlation coefficient itself is the same whichever way the roles are assigned.

Multiple correlation: The study of the relationship between more than two variables is known as multiple correlation.

## Methods of Estimating Correlation

There are three methods of estimating correlation:

• Scatter diagram
• Karl Pearson’s coefficient of correlation
• Spearman rank correlation

## Scatter Diagram

A scatter plot, or scatter diagram, shows the direction and degree of correlation in graphic form. One variable is displayed on the x-axis and the other variable is shown on the y-axis. Each pair of observations is plotted as a point, and the resulting cluster of points forms the scatter diagram. A line can be drawn through the points to show the relationship, based on their direction and how closely they lie to each other.

## Karl Pearson’s Coefficient of Correlation

Karl Pearson’s correlation coefficient is used to measure the linear relationship between two variables. This method does not differentiate between dependent and independent variables. It is used to determine the direction (positive or negative) and strength of the relationship between the two variables.

The nearer the coefficient is to 1 or −1, the stronger the relationship; the nearer it is to 0, the weaker the linear relationship. It is a quantitative method of calculation that assigns an exact numerical value to the correlation. It is also known as the product moment correlation.

• If the value is 1, there is a perfect positive correlation between the two variables. This means that when one variable increases, the other increases as well.
• If the value is −1, there is a perfect negative correlation between the two variables. When one variable increases, the other decreases.
• If the value is 0, there is no linear relationship between the two variables.
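The last point is easy to misread: a coefficient of 0 rules out a linear relationship, not every relationship. The sketch below is illustrative only. The `pearson` helper and the data are assumptions made for this example, using the product moment formula given in the equation section of this article.

```python
import math

def pearson(xs, ys):
    """Product moment correlation computed from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return numerator / denominator

# Hypothetical data, symmetric about zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]  # points lie exactly on a straight line
curved = [x ** 2 for x in xs]     # y is fully determined by x, but not linearly

r_linear = pearson(xs, linear)
r_curved = pearson(xs, curved)
print(r_linear, r_curved)
```

Here `r_linear` comes out as 1 and `r_curved` as 0, even though `curved` depends entirely on `xs`. A scatter diagram would reveal the curve that the coefficient misses.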

The value of the correlation coefficient tells us about the strength of the relationship between the two variables. For example, a coefficient of 0.4 indicates a weak relationship. As a rough rule of thumb, if the coefficient is between −0.8 and 0.8, the correlation is often not considered strong.

There are three methods of finding the correlation coefficient:

• Actual mean method
• Assumed mean method
• Step deviation method
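The three methods differ only in how the deviations are taken, and they should agree on the result because the coefficient is unaffected by shifting the data or rescaling it by a positive factor. The following is a minimal sketch under stated assumptions: the function names, the data, the assumed means `a` and `b`, and the step sizes `h` and `k` are all hypothetical choices made for illustration.

```python
import math

def r_from_deviations(dx, dy):
    """Coefficient from deviations taken about any origin (not only the mean)."""
    n = len(dx)
    numerator = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    denominator = math.sqrt(
        (n * sum(p * p for p in dx) - sum(dx) ** 2)
        * (n * sum(q * q for q in dy) - sum(dy) ** 2)
    )
    return numerator / denominator

def r_actual_mean(xs, ys):
    """Actual mean method: deviations about the true means."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    return r_from_deviations([x - mean_x for x in xs], [y - mean_y for y in ys])

def r_assumed_mean(xs, ys, a, b):
    """Assumed mean method: deviations about convenient guesses a and b."""
    return r_from_deviations([x - a for x in xs], [y - b for y in ys])

def r_step_deviation(xs, ys, a, b, h, k):
    """Step deviation method: deviations also divided by common factors h and k."""
    return r_from_deviations([(x - a) / h for x in xs], [(y - b) / k for y in ys])

# Hypothetical observations.
xs = [10, 20, 30, 40, 50]
ys = [12, 25, 31, 48, 52]

r1 = r_actual_mean(xs, ys)
r2 = r_assumed_mean(xs, ys, a=30, b=30)
r3 = r_step_deviation(xs, ys, a=30, b=30, h=10, k=2)
print(r1, r2, r3)
```

All three values match up to floating-point rounding. The assumed mean and step deviation methods exist mainly to simplify hand calculation.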

### Pearson Correlation Coefficient Equation

Correlation Coefficient = $\frac{\text{Covariance}(x,y)}{\text{Standard deviation of } x \times \text{Standard deviation of } y}$

Correlation Coefficient = $\frac{\sum \left[\left(a-\bar{a}\right)\left(b-\bar{b}\right)\right]}{\sqrt{\sum {\left(a-\bar{a}\right)}^{2}\sum {\left(b-\bar{b}\right)}^{2}}}$

Where the sums run over all paired observations and:

$\bar{a}$ = average of observations of variable a

$\bar{b}$ = average of observations of variable b

### Excel Formula to Calculate Coefficient

=CORREL(array1, array2)

Here, array1 and array2 are the two ranges of cells that hold the paired values.

### How to Calculate Covariance

Covariance measures how two random variables vary together. Unlike the correlation coefficient, its value depends on the units of the variables.

Formula:

Covariance (x,y) = $\frac{\sum \left[\left(x-\bar{x}\right)\left(y-\bar{y}\right)\right]}{N-1}$

Where:

$\bar{x}$ = Mean of x

$\bar{y}$ = Mean of y

N = Number of paired data values.
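As a worked illustration of the covariance formula and of how it feeds into the correlation coefficient, the sketch below uses a small made-up data set. The helper names `sample_covariance` and `sample_std` are assumptions for this example. The standard deviations use the same N − 1 divisor as the covariance, so the divisors cancel in the ratio.

```python
import math

def sample_covariance(xs, ys):
    """Sum of products of deviations about the means, divided by N - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_std(values):
    """Sample standard deviation: square root of a variable's covariance with itself."""
    return math.sqrt(sample_covariance(values, values))

# Hypothetical data: mean of x is 3, mean of y is 4.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)              # products of deviations sum to 6; 6 / 4 = 1.5
r = cov_xy / (sample_std(x) * sample_std(y))  # 1.5 / sqrt(2.5 * 1.5), about 0.775
print(cov_xy, round(r, 4))
```

The covariance of 1.5 carries the units of x times y, while the coefficient of roughly 0.775 is unit-free.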

### Properties of Correlation Coefficient

• It has no unit.
• If the value is negative, it denotes an inverse relationship.
• If the value is positive, variables move in the same direction.
• If the value is zero, the variables have no linear relationship.
• The value of the correlation coefficient lies between −1 and 1.
• If the value is +1, it indicates a perfect positive relationship.
• If the value is −1, it indicates a perfect negative relationship.
• If the absolute value of the correlation coefficient is close to 1, it indicates a strong relationship.
• An absolute value close to 0 indicates a weak relationship.

### Merits and Demerits of Coefficient of Correlation

Merits

• It helps in finding the degree (or strength) of the relationship between two variables.
• It helps in finding the direction of the relationship between two variables (positive or negative).

Demerits

• The coefficient can be strongly affected by extreme values (or outliers).
• It assumes a linear relationship between the two variables, so it can understate a strong non-linear relationship.
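The sensitivity to outliers can be seen in a small experiment. The sketch below is illustrative only: the `pearson` helper and the data are made up for this example, with a single extreme point appended to an otherwise perfectly negative relationship.

```python
import math

def pearson(xs, ys):
    """Product moment correlation computed from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return numerator / denominator

# Hypothetical data: y falls by one unit each time x rises by one unit.
xs = [1, 2, 3, 4, 5]
ys = [5, 4, 3, 2, 1]
r_without_outlier = pearson(xs, ys)

# Append one extreme observation at (20, 20).
r_with_outlier = pearson(xs + [20], ys + [20])
print(r_without_outlier, round(r_with_outlier, 2))
```

Without the extreme point the coefficient is −1. With it, the coefficient rises to about 0.92, so one observation reverses the apparent direction of the relationship.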

## Spearman Rank Correlation

The Spearman rank correlation measures the monotonic relationship between two variables, which may or may not be linear. It is named after its developer, Charles E. Spearman, and is also known as the rank order coefficient of correlation. Because it works on ranks rather than raw values, it can be used for qualitative attributes that can be ranked, such as beauty, honesty, wisdom, or quality.
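One common way to compute the Spearman rank correlation is to replace each value by its rank and then apply the product moment formula to the ranks. The sketch below follows that approach. The helper names and data are assumptions for illustration, and tied values are given the average of the ranks they span.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Product moment correlation computed from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return numerator / denominator

def spearman(xs, ys):
    """Spearman rank correlation: product moment correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y = x cubed, which is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]

rho = spearman(xs, ys)
r = pearson(xs, ys)
print(rho, round(r, 3))
```

Because the ordering of `ys` matches the ordering of `xs` exactly, `rho` is 1, while `r` is about 0.943 since the points do not lie on a straight line.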

### Similarities between Karl Pearson’s Coefficient and Spearman Rank Correlation

• The value under both methods will range between −1 and 1.
• If the value is −1, it is a perfect negative correlation. For the rank correlation, it implies that the highest-ranked X is paired with the lowest-ranked Y, the second highest with the second lowest, and so on.
• If the value is +1, it is a perfect positive correlation. For the rank correlation, it implies that each X and its paired Y have the same rank.

### Differences between Karl Pearson’s Coefficient and Spearman Rank Correlation

• The rank correlation method is less precise than Karl Pearson’s coefficient, as it is based on ranks rather than on the numerical values of the data. For the same reason, it gives less weight to extreme values.
• The rank correlation method is most practical when the data set is small and can be ranked, whereas Karl Pearson’s coefficient requires numerical data.

## Formulas

• Correlation Coefficient = $\frac{\sum \left[\left(a-\bar{a}\right)\left(b-\bar{b}\right)\right]}{\sqrt{\sum {\left(a-\bar{a}\right)}^{2}\sum {\left(b-\bar{b}\right)}^{2}}}$

Where

$\bar{a}$ = average of observations of variable a

$\bar{b}$ = average of observations of variable b

• Covariance (x,y) = $\frac{\sum \left[\left(x-\bar{x}\right)\left(y-\bar{y}\right)\right]}{N-1}$

Where

$\bar{x}$ = Mean of x

$\bar{y}$ = Mean of y

N = Number of paired data values

## Context and Applications

Correlation is used in many real-life applications, including in:

• Finance and investment decisions
• Statistical and other research to find a relationship between variables
• Giving a numerical value to the relationship between two variables
• Understanding economic behavior
• Supporting decisions with estimates based on observed correlations between variables
