ID - Standard Deviation (Lesson)

Standard Deviation

Measures of dispersion deal with the spread of the data. The spread of a data set refers to the variability of the data.  If the data cluster around a single central value, the spread is smaller. The further the values are from the center of the data set, the greater the spread or variability of the set. 

We have learned that the range of a data set is the difference between the greatest data element and the least data element. When the data are listed in numerical order, three quartiles divide the set into four equal parts.

Another measure of dispersion, the interquartile range (IQR), is the difference between the third quartile ($Q_3$) and the first quartile ($Q_1$). The IQR measures the spread of the middle half of the data.
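The range and the IQR can also be computed in a few lines of Python. The sketch below is only one possible approach: the function name `quartiles` and the data values are illustrative, not from the lesson, and it assumes the "median of each half" convention for quartiles. Calculators and software may use slightly different quartile conventions and give slightly different answers.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd number of values, the middle value is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [2, 4, 4, 5, 7, 9, 10, 12]  # illustrative values, not from the lesson
q1, q2, q3 = quartiles(data)

data_range = max(data) - min(data)  # greatest element minus least element
iqr = q3 - q1                       # spread of the middle half of the data

print(data_range)  # 10
print(iqr)         # 5.5
```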


Variance and Standard Deviation

There are two other important measures of dispersion known as the variance and the standard deviation. These measures tell a lot about the spread of a data set. The variance of data looks at how far each value is from the mean. This is similar to the mean absolute deviation that we learned about in a previous lesson. To determine the variance, subtract the mean from each data point and square each difference. Then find an average of those squared differences, dividing by n - 1 instead of n.

The formula for variance is $\frac{\left(x_1-\overline x\right)^2+\left(x_2-\overline x\right)^2+...+\left(x_n-\overline x\right)^2}{n-1}$.

  • n is the number of data points 
  • $\overline x$ is the mean value
  • $x_1, x_2, x_3, ..., x_n$ are the data points

 

Another measure of dispersion, the standard deviation, is simply the square root of the variance. The greater the standard deviation, the more spread out the data are. This lesson uses the Greek letter σ ('sigma') as the symbol for standard deviation; many textbooks write s instead when the formula divides by n - 1.

The formula for standard deviation is $\sigma=\sqrt{\frac{\left(x_1-\overline x\right)^2+\left(x_2-\overline x\right)^2+...+\left(x_n-\overline x\right)^2}{n-1}}$.
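The two formulas translate almost directly into code. The following is a minimal Python sketch, assuming the n - 1 version of the formulas used in this lesson; the function names and the sample values are made up for illustration.

```python
import math

def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(data) / n
    squared_differences = [(x - mean) ** 2 for x in data]
    return sum(squared_differences) / (n - 1)

def sample_standard_deviation(data):
    """Square root of the variance."""
    return math.sqrt(sample_variance(data))

values = [1, 2, 3, 4, 5]  # illustrative values, not from the lesson
print(sample_variance(values))                        # 2.5
print(round(sample_standard_deviation(values), 2))    # 1.58
```

Here the mean is 3, the squared differences are 4, 1, 0, 1, and 4, and their sum, 10, is divided by 5 - 1 = 4.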

 

Example:

Find the variance and the standard deviation for the following data set of values: 10, 22, 38, 23, 38, 23, 21, 48, 92

First, find the mean:

$\overline x=\frac{10+22+38+23+38+23+21+48+92}{9}=35$

Next, subtract the mean from each data value. A table may help:

| Data Value | Subtract the mean from each value | Square the difference |
|---|---|---|
| 10 | 10 - 35 = -25 | $(-25)^2 = 625$ |
| 22 | 22 - 35 = -13 | $(-13)^2 = 169$ |
| 38 | 38 - 35 = 3 | $3^2 = 9$ |
| 23 | 23 - 35 = -12 | $(-12)^2 = 144$ |
| 38 | 38 - 35 = 3 | $3^2 = 9$ |
| 23 | 23 - 35 = -12 | $(-12)^2 = 144$ |
| 21 | 21 - 35 = -14 | $(-14)^2 = 196$ |
| 48 | 48 - 35 = 13 | $13^2 = 169$ |
| 92 | 92 - 35 = 57 | $57^2 = 3249$ |

Now, we add the squared values and divide by n - 1, which here is 9 - 1 = 8:

Variance = $\frac{625+169+9+144+9+144+196+169+3249}{8}=589.25$

There is one more step to find the standard deviation. We take the square root of the variance:

Standard Deviation = $\sqrt{589.25}\approx 24.27$
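As a cross-check, the same example can be run through Python's built-in `statistics` module. This is only a sketch of one way to verify the hand calculation; `statistics.variance` and `statistics.stdev` both divide by n - 1, which matches the formulas in this lesson.

```python
import statistics

data = [10, 22, 38, 23, 38, 23, 21, 48, 92]  # data set from the example

mean = statistics.mean(data)          # sum of the values divided by 9
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # square root of the sample variance

print(mean)               # 35
print(variance)           # 589.25
print(round(std_dev, 2))  # 24.27
```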

 

Many times, data sets can be very large or can be "messy" and contain decimals. This can make it pretty difficult to calculate the variance and standard deviation by hand. In these cases, we can use a calculator to calculate these for us. Every calculator differs in how this works. Find the model name and number of your calculator and search for the instructions on how to calculate variance and standard deviation for your specific calculator. 

Here is an example of how to use a Casio calculator to find the variance and the standard deviation of a data set. 

You can also use the desmos.com calculator to determine the variance and standard deviation of a data set. The function var will return the variance and the function stdev will return the standard deviation. Take a look at this image to see how this works.

[Image: desmosVarAndStd.jpg]

There are a lot of statistical functions that can be used along with the desmos calculator. You can see them all here: Desmos Stat Functions


Application: Measures of Central Tendency and Dispersion

Why do we want to know these statistical measures that we have learned? Data is everywhere and interpreting these measures tells us something about the data. This helps us understand what happened and prepare appropriately for the next situation! In the following video, an example is shown that computes the mean, median, standard deviation, and the interquartile range of a set of real-world data.


Measures of Data Practice

Now it is time for you to practice some measures of spread problems.

IMAGES CREATED BY GAVS