Normal Distribution

Up to now, we have focused on distributions of discrete data. We will now direct our attention to continuous data. Where a discrete variable has a countable number of possible values (often finitely many), a continuous variable can assume any value in a given interval. Therefore, a continuous random variable can assume an infinite number of values.

For example, the SAT scores of all students who take the test within a given period of time form a distribution that is approximately normal. A normal distribution is a frequency distribution that often occurs when a data set contains a large number of values. Its graph is the symmetric, bell-shaped curve shown below, with the data spread evenly around a specific center.

The shape indicates that the frequencies in a normal distribution are concentrated around the center of the distribution, known as the mean (the number 0 in the distribution shown here). Only a small portion of the population occurs at the extreme values. In a normal distribution, small deviations from the mean are much more frequent than large ones, and negative and positive deviations occur with the same frequency. To the left of the mean are the values 1, 2, and 3 standard deviations below the mean (negative deviations). To the right are the values 1, 2, and 3 standard deviations above the mean (positive deviations).

Examples of continuous random variables include height, weight, and the time required to run a mile. (The shapes in the album from the previous lesson also apply to continuous distributions.)

We will use probability histograms and approximate each histogram with a smooth curve that displays the shape of the distribution without the lumpiness of the histogram. We will also assume that all of the data we use are approximately normally distributed.

image depicting normal distribution

Example 1

This is an example of a probability histogram of a continuous random variable with an approximate mean of 100 and standard deviation of 15. A normal curve with the same mean and standard deviation has been overlaid on the histogram.

normal distribution overlaid on histogram

What do you notice about these two graphs?

Solution: Answers may vary. Sample responses include: Both graphs peak at about 100. The normal curve passes through the midpoint of the top of each rectangle of the histogram.
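The second sample response can be checked numerically. The sketch below is only an illustration: it assumes the histogram is scaled so that the area of each bar equals a probability, and the bins of width 10 are hypothetical, since the lesson does not state the bin edges. It compares the height of an idealized bar with the height of the normal curve at the middle of that bar.

```python
from statistics import NormalDist

# Values taken from Example 1: mean 100, standard deviation 15.
curve = NormalDist(mu=100, sigma=15)

def bar_height(low, high):
    """Height of an idealized histogram bar over [low, high): area / width."""
    area = curve.cdf(high) - curve.cdf(low)
    return area / (high - low)

def curve_height_at_midpoint(low, high):
    """Height of the normal curve at the middle of the bin."""
    return curve.pdf((low + high) / 2)

# Hypothetical bins of width 10 (the lesson does not give the real bin edges).
for low, high in [(85, 95), (95, 105), (105, 115)]:
    print(f"{low}-{high}: bar {bar_height(low, high):.4f}, "
          f"curve {curve_height_at_midpoint(low, high):.4f}")
```

The two heights in each row should be close, which is why the smooth curve appears to pass through the middle of the top of each rectangle.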

Example 2

First, you must realize that normally distributed data may have any value for its mean and any positive value for its standard deviation. Below are two graphs, each showing a set of three normal curves.

normal distribution of data with massive outliers

In the first set above, all three curves have the same mean, 10, but different standard deviations. Approximate the standard deviation of each of the three curves.

Solution: $LaTeX: \text{tallest curve: } \sigma=1 \\ \text{middle curve: } \sigma=2 \\ \text{shortest curve: } \sigma=3$

normal distribution of data all with the same height

In this set, each curve has the same standard deviation. Determine the standard deviation and then each mean.

Solution: $LaTeX: \sigma=2 \text{ for all three curves} \\ \text{left peak: } \mu=6 \\ \text{middle peak: } \mu=8 \\ \text{right peak: } \mu=10$

Based on these two examples, make a conjecture regarding the effect changing the mean has on a normal distribution. Make a conjecture regarding the effect changing the standard deviation has on a normal distribution.

Solution: Changing the mean moves the curve left or right. Smaller values for the mean move the curve to the left, and larger values move it to the right. Standard deviation is a measure of variability. It affects the width, and therefore the height, of the curve. The narrower a curve is, the smaller its standard deviation and the taller its peak. Similarly, wider normal curves have larger standard deviations and lower peaks.
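These conjectures can be checked with a short calculation. The sketch below uses Python's standard `statistics.NormalDist` class and the means and standard deviations from the two sets of curves in Example 2. The helper name `peak` is only an illustration, not part of the lesson.

```python
from statistics import NormalDist

def peak(mean, sd):
    """Return (location, height) of the highest point of a normal curve."""
    curve = NormalDist(mu=mean, sigma=sd)
    # The normal curve is highest at its mean.
    return mean, curve.pdf(mean)

# First set: same mean, different standard deviations.
for sd in (1, 2, 3):
    location, height = peak(10, sd)
    print(f"mean=10, sd={sd}: peak at x={location}, height={height:.3f}")

# Second set: same standard deviation, different means.
for mean in (6, 8, 10):
    location, height = peak(mean, 2)
    print(f"mean={mean}, sd=2: peak at x={location}, height={height:.3f}")
```

In the first loop the peak stays at 10 while its height drops as the standard deviation grows. In the second loop the height stays the same and only the location of the peak moves.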

Spread of a Normal Curve

For a normal curve, the spread of the data is described in intervals that are each one standard deviation wide. In a normal distribution, about 68% of the data falls within 1 standard deviation of the mean, about 95% falls within 2 standard deviations, and about 99.7% falls within 3 standard deviations of the mean. This is known as the Empirical Rule. We can find the spread of a normal curve by using the formula $LaTeX: \mu\pm n\sigma$, where n is the number of standard deviations, μ is the population mean, and σ (sigma) is the standard deviation (denoted s in the diagram). The maximum point of the curve is at the mean (μ).

distribution of normal curve with mean at the apex

In a normal distribution, the total area under the normal curve and above the horizontal axis represents the total probability of the distribution, which is 1 (or 100%).
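The Empirical Rule percentages can be reproduced from the area under the curve. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library. The function name `fraction_within` is an assumption made for this example.

```python
from statistics import NormalDist

def fraction_within(n, mean=0.0, sd=1.0):
    """Fraction of a normal distribution within n standard deviations of the mean."""
    curve = NormalDist(mu=mean, sigma=sd)
    # Area under the curve between mean - n*sd and mean + n*sd.
    return curve.cdf(mean + n * sd) - curve.cdf(mean - n * sd)

for n in (1, 2, 3):
    print(f"within {n} standard deviation(s): {fraction_within(n):.4f}")
```

The results round to 68%, 95%, and 99.7%, and they do not depend on which mean and standard deviation are passed in.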

Example 1

Determine the spread of a normally distributed set of data with a mean of 75 and a standard deviation of 5.

Center = 75
Within 1 standard deviation $LaTeX: =75\pm1\sigma=70\:to\:80$
Within 2 standard deviations $LaTeX: =75\pm2\sigma=65\:to\:85$
Within 3 standard deviations $LaTeX: =75\pm3\sigma=60\:to\:90$
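The same intervals can be computed directly from the formula μ ± nσ. This is a small sketch with a hypothetical helper named `spread`, using the mean of 75 and standard deviation of 5 from this example.

```python
def spread(mean, sd, n):
    """Return the interval (mean - n*sd, mean + n*sd)."""
    return mean - n * sd, mean + n * sd

for n in (1, 2, 3):
    low, high = spread(75, 5, n)
    print(f"within {n} standard deviation(s): {low} to {high}")
# within 1 standard deviation(s): 70 to 80
# within 2 standard deviation(s): 65 to 85
# within 3 standard deviation(s): 60 to 90
```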

Normal Distributions - Application

This application shows an example that uses a normal distribution curve and examines skewness, bimodality, and symmetry.

Normal Distributions and Z-scores - Application 2

Watch the following videos exploring examples of normal distributions.

In the first 5:00 of video 1, an example with multiple parts is shown that uses a normal distribution curve and examines skewness, bimodality, and symmetry. From 5:01 to the end of the video, an example with three parts is shown introducing the empirical rule and finding probabilities for a normal distribution.

In the first 6:50 of video 2, an example is shown introducing the empirical rule and finding probabilities for a normal distribution. From 6:51 to the end of the video, examples are shown finding the probability under the curve, both from a given graph and with a calculator.

Normal Distributions Presentation

Probability

Probability is applied to intervals around the mean. If you know the mean and standard deviation, you can find the percent of data that lies within a given range of values. You can also find a range of values for a given probability; for the latter, you will need a z-score and z tables. A z-score is the number of standard deviations that a value falls from the mean: the observed value minus the mean, divided by the standard deviation. It is given by $LaTeX: z=\frac{x-\mu}{\sigma}$, where x is the observed value, μ is the mean, and σ is the standard deviation.

Z-scores to the right of the mean are positive and z-scores to the left are negative. Two values that are the same distance from the mean, one on each side, have z-scores with the same absolute value.
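As a quick illustration of the z-score definition, the sketch below computes z-scores for two values that are equally far from the mean. The function name `z_score` is an assumption for this example, and the mean of 75 and standard deviation of 5 are borrowed from the earlier spread example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(z_score(85, 75, 5))  # 2.0  (two standard deviations above the mean)
print(z_score(65, 75, 5))  # -2.0 (two standard deviations below the mean)
```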

For some problems, we need a z table to look up the probability that corresponds to a z-score, or the z-score that corresponds to a probability.
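A z table lists, for each z-score, the fraction of the standard normal distribution that lies below it. Where a table is not at hand, the same lookup can be sketched with `statistics.NormalDist` from the Python standard library. The function names below are illustrative, and this is a stand-in for the table rather than the method used in the videos.

```python
from statistics import NormalDist

# The standard normal distribution has mean 0 and standard deviation 1.
standard_normal = NormalDist()

def percent_below(z):
    """Fraction of the data below a given z-score (what a z table lists)."""
    return standard_normal.cdf(z)

def z_for_percent_below(p):
    """Reverse lookup: the z-score with fraction p of the data below it."""
    return standard_normal.inv_cdf(p)

print(f"{percent_below(-1.0):.4f}")        # fraction below z = -1
print(f"{percent_below(1.0):.4f}")         # fraction below z = 1
print(f"{z_for_percent_below(0.975):.2f}") # z-score with 97.5% below it
```

Subtracting two `percent_below` values gives the fraction of data between two z-scores, which is how the table is used for ranges.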

Z-scores and Probability (Videos 1 and 2)

Watch the following videos showing examples using z-scores.

In video 1, examples are shown that compute probability using z-scores.

In video 2, examples are shown that compute probability using z-scores, shaded to the left and shaded to the right.

Standard Normal Probabilities: Percentages Below the Indicated Value
Negative Z-Scores table
Positive Z-Scores table

Probability Using Z Scores Presentation

IMAGES CREATED BY GAVS