
How to Design a Weighing Scale?

Suppose we had to design a bathroom weighing scale. How would we decide what its range should be? Would we take the highest recorded human weight in history and use that as the upper limit? This may not be a great idea, as the sensitivity of the scale decreases if the range is too large. At the same time, if we set the upper limit too low, the scale may not be usable for a large percentage of the population!

So how do we decide on a reasonable upper limit, so that, say, 95% of the population can use the scale? Suppose, further, that we conducted a survey and found some statistical measures for the population of intended users. Would that be of use? Let us assume that the survey determined that the mean weight of the population is 74 kg and the standard deviation is 28 kg. Can we now determine what the upper limit of the weighing scale should be?

This is where statistics and normal distributions come to our rescue! Given the above information, and assuming that the weights are normally distributed, we can conclude that if we set the upper limit to about 120 kg, then we can expect it to be good enough for 95% of the population!

What is Normal Distribution?

The normal distribution is one of the most important continuous probability distributions of a random variable. Many real-world phenomena such as heights & weights of a population of organisms, errors in measurement, variations in stock prices, etc. can be modeled using normal distributions. The normal distribution is also known as a Gaussian distribution, named after the famous mathematician Carl Friedrich Gauss, who worked extensively on this probability distribution.

If we plot the data of any of these phenomena as a relative frequency distribution, the resulting graph will be a very distinct bell-shaped curve. This bell curve is characteristic of the normal distribution. A typical bell curve (the dark blue curve in the figure below) is centered around the mean and spreads symmetrically on either side.

We will learn more about the various aspects of the normal distribution in the next few sections.

Characteristics of a Normal Distribution

The normal distribution can be effectively described using two parameters – the mean and the standard deviation.

The mean represents the center of the bell curve and the graph is perfectly symmetric about the center. Also, the mean, the median, and the mode are all equal for a normal distribution.

The standard deviation gives a measure of how much the data is spread from the center. The higher the standard deviation, the more the data is spread out, and the flatter the bell curve looks. So, two different random variables that are both normally distributed and that have the same mean but different standard deviations will have bell curves that may be flatter or taller depending on the standard deviation.

The variance is another commonly used measure of the spread of the distribution and is equal to the square of the standard deviation.

The mean of the distribution is typically denoted by µ and the standard deviation is denoted by σ.

“Normal distribution curves for different standard deviations”

Technical Definition of a Normal Distribution

Formally, the probability distribution of a normally distributed random variable X with mean µ and variance σ² is written as $X \sim N (μ, σ^{2})$ . It can be expressed in terms of the probability density function, p(x), which is given by the following equation:

$p (x) = \frac{1}{σ \sqrt{2 π}} e^{\frac{- {(x - μ)}^{2}}{2 σ^{2}}}$
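As a rough illustration of the density formula, the following Python sketch evaluates p(x) directly and cross-checks it against `NormalDist.pdf` from Python's standard library. The helper name `normal_pdf` and the two standard deviations used are illustrative assumptions, not part of the original discussion. It also suggests why a larger standard deviation gives a flatter curve: doubling σ halves the height of the peak.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Evaluate p(x) for N(mu, sigma^2) using the density formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Peak height (the density at x = mu) for two illustrative standard deviations.
peak_sigma_1 = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_sigma_2 = normal_pdf(0.0, mu=0.0, sigma=2.0)

# Cross-check against the standard library implementation.
library_peak = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)

print(round(peak_sigma_1, 4), round(peak_sigma_2, 4), round(library_peak, 4))
```

The peak of the density is $\frac{1}{σ \sqrt{2 π}}$, so it scales inversely with σ while the total area under the curve stays equal to 1.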

The area under the probability density function can be used to find the probability that X is less than or equal to a value, say a. This is given by the cumulative distribution function, $P (X \leq a)$ :

$P (X \leq a) = \int_{- \infty}^{a} p (x) d x = \int_{- \infty}^{a} \frac{1}{σ \sqrt{2 π}} e^{\frac{- {(x - μ)}^{2}}{2 σ^{2}}} d x$

That is a scary-looking integral and unfortunately, it cannot be written in simpler terms using other common functions that we know! Then how do we calculate the probability?

Since it is a definite integral, there are methods to calculate the area under the curve for various values of a. However, this is a very tedious task and to avoid having to do the calculations repeatedly, we make use of a set of pre-computed tables of values for the probability. In the next section, we will look at how to use those tables by calculating Z-scores.
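To make the idea of "calculating the area under the curve" concrete, here is a minimal Python sketch that approximates the probability in two ways: by summing thin rectangles under the density (the midpoint rule), and by using the error function `math.erf`, which is how software usually evaluates this integral. The function names, the step count, and the cut-off at 10 standard deviations below the mean are assumptions made for illustration. The numbers reuse the weighing scale survey values (mean 74 kg, standard deviation 28 kg).

```python
import math

def normal_pdf(x, mu, sigma):
    """Density p(x) of N(mu, sigma^2)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def normal_cdf_numeric(a, mu, sigma, steps=100_000):
    """Approximate P(X <= a) with the midpoint rule.

    The lower limit -infinity is replaced by mu - 10*sigma, below which
    the density is negligibly small (an assumption of this sketch).
    """
    lower = mu - 10.0 * sigma
    if a <= lower:
        return 0.0
    width = (a - lower) / steps
    total = 0.0
    for i in range(steps):
        midpoint = lower + (i + 0.5) * width
        total += normal_pdf(midpoint, mu, sigma)
    return total * width

def normal_cdf_erf(a, mu, sigma):
    """P(X <= a) expressed through the error function."""
    return 0.5 * (1.0 + math.erf((a - mu) / (sigma * math.sqrt(2.0))))

# Probability that a weight is at most 120 kg, computed both ways.
p_numeric = normal_cdf_numeric(120.0, mu=74.0, sigma=28.0)
p_erf = normal_cdf_erf(120.0, mu=74.0, sigma=28.0)
print(round(p_numeric, 3), round(p_erf, 3))
```

Both approaches should agree closely and give a probability of roughly 0.95, which matches the claim made for the 120 kg upper limit.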

Standard Normal Distribution

The standard normal distribution is a special normal distribution whose mean is 0 and the standard deviation is 1. Let Z be a random variable such that $Z \sim N (0, 1)$ .

Suppose we had to find the probability that the value of Z is less than 2. It would be the area under the curve (region shaded green) as shown below:

“A standard normal curve representing the probability that the z-value is less than 2”

Instead of evaluating the integral, we can look up the value in standard Z-tables such as the one shown below to directly find the probability to be 0.9772.
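Where a printed table is not at hand, the same lookup can be sketched in Python using the standard library's `statistics.NormalDist` class (available from Python 3.8). The variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# P(Z <= 2), the value a Z-table would give for z = 2.
p_less_than_2 = standard_normal.cdf(2.0)
print(round(p_less_than_2, 4))  # 0.9772
```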

Z-Scores

The above tables can be used for any normal distribution after applying a suitable transformation. Given a random variable $X \sim N (μ, σ^{2})$ , we can transform a value x of X to the standard form by calculating its z-score, which is defined as follows.

$z = \frac{x - μ}{σ}$

The resulting transformed variable follows a standard normal distribution, for which the same tables are applicable. We can now look up the table for the corresponding z-score to find the probability.

For example, if a random variable X is normally distributed and has a mean of 20 and a standard deviation of 2, then the probability that X is less than 18 can be calculated as follows:

Calculate the z-score corresponding to the value of 18.

$z = \frac{18 - 20}{2} = - 1$

Look up the table to find the value of probability corresponding to z = -1. We find that the probability is given by $P (Z \leq - 1) = 0.1587 \approx 0.16$ .

So the probability that X is less than 18 is also roughly 0.16.
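The worked example above can be sketched in Python as a quick check. It computes the z-score by hand, looks up the probability on the standard normal distribution, and compares it with the probability computed directly on the original distribution. The variable names are illustrative assumptions, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

mu, sigma, x = 20.0, 2.0, 18.0

# Step 1: transform x to a z-score.
z = (x - mu) / sigma

# Step 2: look up P(Z <= z) on the standard normal distribution.
p_via_z = NormalDist().cdf(z)

# Cross-check: compute P(X <= 18) directly on N(20, 2^2).
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_via_z, 4), round(p_direct, 4))
```

Both routes give the same probability, which is the point of the transformation: one table serves every normal distribution.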

Back to Our Weighing Scale Design

Assuming the weights of the population are normally distributed with a mean of 74 kg and a standard deviation of 28 kg, we can represent them as $W \sim N (74, 28^{2})$ . To find the upper limit of the range of our weighing scale, we want to find the weight, $W_{\mathrm{upper}}$ , which is above 95% of the population, i.e., $P (W \leq W_{\mathrm{upper}}) = 0.95$ .

By doing a reverse lookup in the tables, we see that a probability of 0.95 corresponds to a z-score of approximately 1.65. We can calculate the value of $W_{\mathrm{upper}}$ by substituting it in the equation for the transformation:

$\begin{array}{l} z = \frac{W_{\mathrm{upper}} - μ}{σ} \\ \Rightarrow 1.65 = \frac{W_{\mathrm{upper}} - 74}{28} \\ \Rightarrow W_{\mathrm{upper}} = 1.65 \times 28 + 74 \approx 120 \end{array}$
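The reverse lookup can also be sketched in Python. The `inv_cdf` method of `statistics.NormalDist` returns the value below which a given fraction of the distribution lies, so it plays the role of reading the table backwards. The variable names are illustrative assumptions; the mean and standard deviation are the survey values used in this example.

```python
from statistics import NormalDist

# z-score below which 95% of a standard normal distribution lies.
z_95 = NormalDist().inv_cdf(0.95)

# Undo the z-score transformation: W_upper = z * sigma + mu.
mu, sigma = 74.0, 28.0
w_upper_from_z = z_95 * sigma + mu

# Equivalent one-step lookup on the weight distribution itself.
w_upper_direct = NormalDist(mu, sigma).inv_cdf(0.95)

print(round(z_95, 3), round(w_upper_from_z), round(w_upper_direct))
```

The z-score comes out slightly below the table value of 1.65, and the upper limit still rounds to 120 kg.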

Empirical Rule

Since it may not always be possible to have access to these tables, it is useful to remember some standard values by heart. In the graph below, the dotted lines represent the values that are in steps of one standard deviation away from the mean on either side of the mean:

Here are some points to remember:

The region shaded green lies within one standard deviation of the mean and has an area of about 0.68, i.e. 68% of the values lie within one standard deviation of the mean.
The regions shaded orange lie between 1 and 2 standard deviations away from the mean on either side and have an area of about 0.135 each.
So, the green and orange regions combined represent the values that are within 2 standard deviations of the mean and have a total area of about 0.95, i.e. 95% of the values lie within 2 standard deviations of the mean.
The regions shaded blue lie between 2 and 3 standard deviations away from the mean on either side and have an area of about 0.0235 each.
So, the green, orange, and blue regions combined represent the values that are within 3 standard deviations of the mean and have a total area of about 0.997, i.e. 99.7% of the values lie within 3 standard deviations of the mean.
Only about 0.3% of the values are more than 3 standard deviations from the mean.
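As a quick check of these rule-of-thumb figures, the sketch below computes the areas from the standard normal distribution using Python's `statistics.NormalDist`. The helper names `within` and `band` are illustrative assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist()

def within(k):
    """Area within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def band(lo, hi):
    """Area between lo and hi standard deviations on one side of the mean."""
    return standard_normal.cdf(hi) - standard_normal.cdf(lo)

for k in (1, 2, 3):
    print(k, round(within(k), 4))

print(round(band(1, 2), 4), round(band(2, 3), 4))
```

The computed areas are about 0.6827, 0.9545, and 0.9973, so the 68%, 95%, and 99.7% figures are rounded values. The band figures depend on that rounding as well: the area between 2 and 3 standard deviations on one side is closer to 0.0214 than to 0.0235, which comes from subtracting the rounded 0.95 from 0.997.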

Why is the Normal Distribution so Important?

Besides the fact that many natural processes can be modeled using normal distributions, normal distributions play a central role in statistics because of the central limit theorem and its many applications.

In our earlier weighing scale example, we assumed values for the population mean and standard deviation – but how do we do that in reality? It is almost impossible to find the descriptive statistical measures of a population. All that we can do is to collect data from a few samples and find the statistical measures of those samples.

Suppose we take random samples of size n from a population (not necessarily normally distributed) with mean $μ$ and finite standard deviation $σ$ . According to the central limit theorem, when n is sufficiently large, the means of the samples will be approximately normally distributed with mean $μ$ and standard deviation $\frac{σ}{\sqrt{n}}$ , which is known as the standard error.
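The statement above can be explored with a small simulation. The sketch below draws repeated samples from an exponential population, which is strongly skewed and clearly not normal, and compares the spread of the sample means with the predicted standard error. The choice of population, the sample size, the number of samples, and the seed are all assumptions made for illustration.

```python
import random
from statistics import fmean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 50               # size of each sample (assumed)
num_samples = 5000   # number of samples drawn (assumed)

# Exponential population with rate 1: mean 1 and standard deviation 1.
population_mean = 1.0
population_sd = 1.0

# Mean of each sample of size n.
sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
spread_of_means = stdev(sample_means)
predicted_standard_error = population_sd / n ** 0.5

print(round(mean_of_means, 3))
print(round(spread_of_means, 3), round(predicted_standard_error, 3))
```

The mean of the sample means should land close to the population mean of 1, and their standard deviation close to $\frac{1}{\sqrt{50}} \approx 0.141$. A histogram of `sample_means` would look roughly bell-shaped even though the population itself is skewed.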

The central limit theorem lays the foundation for a branch of statistics known as inferential statistics that deals with inferring properties about the population based on measurements from random samples.

Formulas

The probability density function of a normal distribution $X \sim N (μ, σ^{2})$ is given by:

$p (x) = \frac{1}{σ \sqrt{2 π}} e^{\frac{- {(x - μ)}^{2}}{2 σ^{2}}}$

The probability that a normal random variable is less than or equal to a, $P (X \leq a)$ , is given by the cumulative distribution function:

$P (X \leq a) = \int_{- \infty}^{a} p (x) d x = \int_{- \infty}^{a} \frac{1}{σ \sqrt{2 π}} e^{\frac{- {(x - μ)}^{2}}{2 σ^{2}}} d x$

Given a random variable $X \sim N (μ, σ^{2})$ , a value x is transformed to the standard form by calculating its z-score:

$z = \frac{x - μ}{σ}$

Context and Applications

This concept is applicable for pre-graduation, graduation, and post-graduation students for mathematics and statistics, as well as many engineering branches.
