What is a Contingency Table?

A contingency table is a tabular summary of the relationship between two or more categorical variables. It plays the same role for categorical data that a scatterplot plays for quantitative data: where a scatterplot shows how two numerical variables vary together, a contingency table shows how often each combination of categories occurs. In other words, it is a frequency distribution table that displays two variables at the same time.

Binary Variable

A binary variable has only two possible values. When both variables are binary, the contingency table has 2 rows and 2 columns, giving four cells, and is also called a four-fold table. Because every combination of categories has its own cell, a contingency table makes probabilities, and conditional probabilities in particular, easy to estimate.

Three probability distributions (joint, marginal, and conditional) can be described using a contingency table.

  • The joint distribution gives the proportion of subjects that fall in both a category of X and a category of Y. It is found by dividing each cell of the contingency table by the grand total. The joint probabilities sum to 1.
  • The marginal distributions describe a single variable, X or Y, on its own. They are calculated from the row totals and column totals of the contingency table, each divided by the grand total. Each marginal distribution sums to 1.
  • The conditional distributions describe one variable at a fixed level of the other variable. They are calculated by dividing the cells of a row (or column) by that row's (or column's) total. Each conditional distribution sums to 1.

Suppose you are at a painting exhibition that features work from the Early Renaissance, Late Renaissance, and Baroque periods. The paintings depict fruits, flowers, or a mix of both. You want to know how many paintings from each period show fruits, flowers, or a mix of both. Here is what you get.

                    Fruits   Flowers   Mix of Both
Early Renaissance     11        5           1
Late Renaissance       8        6           8
Baroque                3       10          12

The values inside the table are called joint frequencies because each one applies to both categorical variables at once: a specific period (Early Renaissance, Late Renaissance, or Baroque) and a specific subject (fruits, flowers, or a mix of both).

As shown below, marginal frequencies are the sums of each row and each column of the table. The bottom right corner of the table (64) shows how many paintings you have seen in total.

                    Fruits   Flowers   Mix of Both   Sum (1)
Early Renaissance     11        5           1           17
Late Renaissance       8        6           8           22
Baroque                3       10          12           25
Sum (2)               22       21          21           64
  • Sum (1) contains the total number of observations (paintings) for each period (each row).
  • Sum (2) contains the total number of observations (paintings) for each subject (each column).
  • The total number of paintings, 64, appears in the bottom right corner; it is the sum of the Sum (1) column and also the sum of the Sum (2) row.
  • A contingency table is thus a matrix that shows the frequency distribution of the variables by means of joint frequencies and marginal frequencies.
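
The marginal frequencies of the exhibition example can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`counts`, `row_totals`, `col_totals`) are chosen here for illustration, and the numbers are the joint frequencies from the table.

```python
# Joint frequencies from the exhibition example:
# rows are periods, columns are subjects.
periods = ["Early Renaissance", "Late Renaissance", "Baroque"]
subjects = ["Fruits", "Flowers", "Mix of Both"]
counts = [
    [11, 5, 1],
    [8, 6, 8],
    [3, 10, 12],
]

# Sum (1): marginal frequency of each period (row totals).
row_totals = [sum(row) for row in counts]
# Sum (2): marginal frequency of each subject (column totals).
col_totals = [sum(col) for col in zip(*counts)]
# Both sets of marginal frequencies add up to the same grand total.
grand_total = sum(row_totals)

for period, total in zip(periods, row_totals):
    print(f"{period}: {total}")
for subject, total in zip(subjects, col_totals):
    print(f"{subject}: {total}")
print("Grand total:", grand_total)
```

Summing the row totals and summing the column totals must give the same number, which is a quick consistency check for any contingency table.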

Creating a Contingency Table

The table below, for example, shows computer sales at a fictional store. It gives the frequency of sales by two variables: the gender of the customer and the type of computer purchased. The counts are the numbers of PCs and Macs purchased by men and by women.

               PC    Mac   Row Totals
Male           66     40      106
Female         30     87      117
Column Total   96    127      223

These tables organize the data so that the results can be read at a glance. For example, 66 men purchased PCs, while 87 women purchased Macs. Additionally, the table includes 117 females, 106 males, 96 PC sales, 127 Mac sales, and a total of 223 observations.
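
In practice, a contingency table is usually tallied from individual records. The sketch below assumes a hypothetical list of (gender, computer) sales records, rebuilt here from the counts quoted above because the article only gives the aggregated table; the names `records`, `cell_counts`, `row_totals`, and `col_totals` are illustrative.

```python
from collections import Counter

# Hypothetical raw sales records, reconstructed from the cell counts
# in the text (66, 40, 30, 87); real data would come from a sales log.
records = (
    [("Male", "PC")] * 66
    + [("Male", "Mac")] * 40
    + [("Female", "PC")] * 30
    + [("Female", "Mac")] * 87
)

# Joint frequencies: one count per (gender, computer) combination.
cell_counts = Counter(records)
# Marginal frequencies: totals by gender (rows) and by computer type (columns).
row_totals = Counter(gender for gender, _ in records)
col_totals = Counter(computer for _, computer in records)
grand_total = len(records)

print(cell_counts[("Male", "PC")], cell_counts[("Female", "Mac")])
print(row_totals["Male"], row_totals["Female"])
print(col_totals["PC"], col_totals["Mac"])
print(grand_total)
```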

In general, every probability discussed here equals the following ratio:

Probability (Event) = Favorable Outcomes / Total Outcomes

It's simply a matter of deciding which table values go in the numerator and denominator when using a contingency table to calculate various types of probabilities.  

How can you Calculate Joint Probabilities?

A joint probability is the probability of two or more events happening at the same time. What is the joint probability of a female buying a Mac?

  • Since each cell shows the number of times two events happened together, contingency tables excel at highlighting joint probabilities. The cell count for the joint event is the numerator, and the grand total of outcomes is the denominator.
  • Consequently, divide a cell count by the grand total to obtain a joint probability from a contingency table.
  • The joint probability of a female buying a Mac in our example equals the value in that cell (87) divided by the grand total (223).
  • The probability of events "A" and "B" occurring together is denoted by P(A ∩ B).
  • In our case: P(Female ∩ Mac) = 87⁄223 = 0.390

The following is the procedure for estimating joint probabilities using a contingency table analysis:

  • The numerator is the number of occurrences for a particular combination of events that you are interested in.
  • The denominator is the total number of observations.

The values in parentheses in the table below are the joint probabilities for the respective cells. The sum of the probabilities for an entire table is always 1.

               PC           Mac          Row Totals
Male           66 (0.296)   40 (0.179)   106
Female         30 (0.135)   87 (0.390)   117
Column Total   96           127          223
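
The joint probabilities in parentheses can be reproduced by dividing every cell by the grand total. This is a minimal sketch; the dictionary layout and the names `counts` and `joint` are assumptions made for illustration.

```python
# Cell counts from the computer sales table, keyed by (gender, computer).
counts = {
    ("Male", "PC"): 66,
    ("Male", "Mac"): 40,
    ("Female", "PC"): 30,
    ("Female", "Mac"): 87,
}

# The denominator of every joint probability is the grand total.
grand_total = sum(counts.values())

# Joint probability of each cell: cell count divided by the grand total.
joint = {cell: n / grand_total for cell, n in counts.items()}

for (gender, computer), p in joint.items():
    print(f"P({gender} ∩ {computer}) = {p:.3f}")

# The joint probabilities of the whole table add up to 1.
print("Sum:", round(sum(joint.values()), 3))
```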

How to Calculate Marginal Probabilities?

A marginal probability is the probability of a single event, regardless of the outcome of the other variable. Joint probabilities (above) and conditional probabilities (below) both involve two events, so they do not have this independence from the other variable. The single events in our table are gender (male or female) and computer type (PC or Mac).

For the numerator, choose the specific event you are interested in and use the corresponding row or column total. For the denominator, use the grand total.

For instance, to calculate the probability of a customer buying a Mac without considering gender, divide the column total for Mac (127) by the grand total (223). Similarly, to find the probability that a buyer is female regardless of the type of computer, divide the row total for Female (117) by the grand total (223).

P(Mac) = 127⁄223 = 0.570

P(Female) = 117⁄223 = 0.525

The following is the procedure for estimating marginal probabilities using a contingency table:

  • The numerator is the row total or column total for the specific event you are interested in.
  • The denominator is the total number of observations.

The values in parentheses in the table below are the marginal probabilities for each event. The marginal probabilities for the columns (PC and Mac) sum to 1. The row marginal probabilities (Male and Female) also sum to 1.

               PC           Mac           Row Totals
Male           66           40            106 (0.475)
Female         30           87            117 (0.525)
Column Total   96 (0.430)   127 (0.570)   223
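
The marginal probabilities can be sketched in Python by first accumulating the row and column totals and then dividing each by the grand total. The names `row_totals`, `col_totals`, `p_gender`, and `p_computer` are illustrative assumptions, not part of any library.

```python
# Cell counts from the computer sales table, keyed by (gender, computer).
counts = {
    ("Male", "PC"): 66,
    ("Male", "Mac"): 40,
    ("Female", "PC"): 30,
    ("Female", "Mac"): 87,
}
grand_total = sum(counts.values())

# Accumulate the row totals (by gender) and column totals (by computer type).
row_totals = {}
col_totals = {}
for (gender, computer), n in counts.items():
    row_totals[gender] = row_totals.get(gender, 0) + n
    col_totals[computer] = col_totals.get(computer, 0) + n

# Marginal probability: row or column total divided by the grand total.
p_gender = {g: t / grand_total for g, t in row_totals.items()}
p_computer = {c: t / grand_total for c, t in col_totals.items()}

for label, p in {**p_gender, **p_computer}.items():
    print(f"P({label}) = {p:.3f}")
```

Each of the two marginal distributions (over gender and over computer type) sums to 1 on its own.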

How to Calculate Conditional Probabilities?

Unlike joint and marginal probabilities, conditional probabilities do not use the grand total in the denominator. Instead of conditioning the probability on the entire sample space, we condition it on a specific outcome. As a result, the denominator is the column or row total for the conditioning event (the “given” in the problem statement).

Let us see how likely it is that a customer buys a Mac, given that the customer is female.

In the numerator, we will use the female/Mac cell value (87), and in the denominator, we will use the female row total (117).

P(Mac|Female) = 87⁄117 = 0.744
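
The same value follows from the standard definition of conditional probability as the joint probability divided by the marginal probability of the condition. Written out with the counts from the table, the grand total cancels:

```latex
P(\text{Mac} \mid \text{Female})
  = \frac{P(\text{Female} \cap \text{Mac})}{P(\text{Female})}
  = \frac{87/223}{117/223}
  = \frac{87}{117}
  \approx 0.744
```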

Let us try another one. Given that the computer purchased is a PC, what is the probability that the buyer is male?

In the numerator, we will use the male/PC cell value (66) and the PC column sum in the denominator (96).

P(Male|PC) = 66⁄96 = 0.688

The following is the procedure for estimating conditional probability using a contingency table:

  • The numerator is the number of times the particular combination you are interested in has occurred. This number is found in a single cell.
  • The denominator is the number of times the “given” event has occurred. This value is the row total or column total that contains the cell used in the numerator.
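
The two steps above can be sketched as a small helper. The function name `conditional_probability` is hypothetical, and the sketch assumes that category labels are unique across the two variables (no label is used for both a gender and a computer type), so a label can be matched against either position of a cell key.

```python
# Cell counts from the computer sales table, keyed by (gender, computer).
counts = {
    ("Male", "PC"): 66,
    ("Male", "Mac"): 40,
    ("Female", "PC"): 30,
    ("Female", "Mac"): 87,
}


def conditional_probability(counts, event, given):
    """Return P(event | given) from a dictionary of cell counts.

    Assumes labels are unique across both variables (an illustrative
    simplification), so `given in cell` identifies a row or a column.
    """
    # Denominator: the row or column total for the "given" event.
    given_total = sum(n for cell, n in counts.items() if given in cell)
    # Numerator: the count of the cell where both events occur.
    both = sum(
        n for cell, n in counts.items() if given in cell and event in cell
    )
    return both / given_total


print(f"P(Mac|Female) = {conditional_probability(counts, 'Mac', 'Female'):.3f}")
print(f"P(Male|PC)    = {conditional_probability(counts, 'Male', 'PC'):.3f}")
```

Note that swapping the two arguments changes the denominator: P(Mac|Female) divides by the Female row total, while P(Female|Mac) would divide by the Mac column total.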

Context and Application

This topic is significant in the professional exams for both undergraduate and graduate courses, especially for: 

  • Bachelor of Science in Statistics
  • Master of Science in Statistics

Formula

The probability of an event is calculated as:

Probability (Event) = Favorable Outcomes / Total Outcomes

Practice Problem

Answer the following questions using the contingency table below.

             Brown eyes   Not brown eyes
Black hair       50             30
Red hair         70             80
  1. What is the probability that someone with brown eyes will have black hair?
  2. What is the probability that someone has black hair?
  3. Are brown eyes and black hair mutually exclusive or mutually inclusive?
  4. What is the probability that someone has brown eyes?

Answer:

Step 1): Calculate the total number of persons having black and red hair.

Add 50 and 30 to obtain the total number of persons having black hair.

50+30=80

Add 70 and 80 to obtain the total number of persons having red hair.

70+80=150

Step 2): Calculate the total number of persons having brown eyes and the total number of persons not having brown eyes.

Add 50 and 70 to obtain the total number of persons having brown eyes.

50+70=120

Add 30 and 80 to obtain the total number of persons not having brown eyes.

30+80=110

Step 3): Draw a table and fill it in as shown below.

             Brown eyes   Not brown eyes   Total
Black hair       50             30           80
Red hair         70             80          150
Total           120            110          230

Step 4): Calculate the probability that someone with brown eyes will have black hair.

The formula for the probability of an event E is P(E) = n(E) / n(S), where n(E) is the number of outcomes favorable to E and n(S) is the total number of possible outcomes.

Because the question is restricted to people with brown eyes, substitute 120 for n(S) and 50 for n(E) to determine the probability that someone with brown eyes will have black hair.

P(black hair | brown eyes) = 50⁄120

= 5⁄12

Step 5): Calculate the probability that someone has black hair.

Substitute 230 for n(S) and 80 for n(E) to determine the probability that someone has black hair.

P(black hair) = 80⁄230

= 8⁄23

Step 6): Determine whether brown eyes and black hair are mutually exclusive or mutually inclusive.

Mutually exclusive is a term that is used to describe two or more events that cannot happen simultaneously.

There are 50 persons with brown eyes and black hair, so it can be said that brown eyes and black hair are not mutually exclusive.

In other words, it can be said that brown eyes and black hair are mutually inclusive.

Step 7): Calculate the probability that someone has brown eyes.

Substitute 230 for n(S) and 120 for n(E) to determine the probability that someone has brown eyes.

P(brown eyes) = 120⁄230

= 12⁄23
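
The four answers can be double-checked with exact fractions. This is a minimal sketch using Python's standard `fractions` module; the variable names are illustrative, and the counts are those of the practice table.

```python
from fractions import Fraction

# Cell counts from the practice table, keyed by (hair, eyes).
counts = {
    ("Black hair", "Brown eyes"): 50,
    ("Black hair", "Not brown eyes"): 30,
    ("Red hair", "Brown eyes"): 70,
    ("Red hair", "Not brown eyes"): 80,
}

total = sum(counts.values())
# Row total for black hair and column total for brown eyes.
black_hair = sum(n for (hair, _), n in counts.items() if hair == "Black hair")
brown_eyes = sum(n for (_, eyes), n in counts.items() if eyes == "Brown eyes")
# Number of people with both black hair and brown eyes.
both = counts[("Black hair", "Brown eyes")]

p_black_given_brown = Fraction(both, brown_eyes)  # question 1 (conditional)
p_black = Fraction(black_hair, total)             # question 2 (marginal)
mutually_exclusive = both == 0                    # question 3
p_brown = Fraction(brown_eyes, total)             # question 4 (marginal)

print(p_black_given_brown, p_black, mutually_exclusive, p_brown)
```

`Fraction` reduces each ratio to lowest terms automatically, which matches the simplified answers in the steps above.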
