IROVD - Interpreting and Representing One Variable Data (Overview)

Introduction

Mrs. Wood is doing an experiment with her Algebra I classes. She is going to try teaching one of her classes outside to see if the fresh air helps to improve their test scores. She teaches one class inside and one class outside and then compares their scores from that unit's test. What important statistics should she compare? If one class has a higher average than the other, does that mean every student in that class did better? In this unit, we will learn what statistics like mean, median, range, IQR, and standard deviation imply about a data set. Then we will use those understandings to compare different data sets!

Essential Questions

  1. How do I summarize, represent, and interpret data on a single count or measurement variable?
  2. When making decisions or comparisons, what factors are important for me to consider in determining which statistics to compare, which graphical representation to use, and how to interpret the data?
  3. How can I use visual representations and measures of center and spread to compare two data sets?
  4. Why is technology valuable when making statistical models?
  5. How can I apply what I have learned about statistics to summarize and analyze real data?

Key Terms

The following key terms will help you understand the content in this module.

Box Plot - A method of visually displaying a distribution of data values by using the median, quartiles, and extremes of the data set. A box shows the middle 50% of the data.

Box-and-Whisker Plot - A diagram that shows the five-number summary of a distribution. (The five-number summary includes the minimum, lower quartile (25th percentile), median (50th percentile), upper quartile (75th percentile), and maximum.) In a modified box plot, the presence of outliers can also be illustrated.

Categorical Variables - Categorical variables take on values that are names or labels. Examples include the color of a ball (e.g., red, green, blue), gender (male or female), and year in school (freshman, sophomore, junior, senior). These data cannot be averaged or represented by a scatter plot because they have no numerical meaning.

Center - Measures of center refer to the summary measures used to describe the most "typical" value in a set of data. The two most common measures of center are the median and the mean.

Dot Plot - A method of visually displaying a distribution of data values where each data value is shown as a dot or mark above a number line.
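As a rough illustration of the dot plot definition, the sketch below stacks one mark per occurrence of each value above a text number line. The helper name `dot_plot_rows` and the sample values are made up for this example and are not data from the module; the layout assumes whole-number values of at most two digits.

```python
from collections import Counter

def dot_plot_rows(values):
    """Return the text rows of a dot plot, top row first, then the axis row."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    tallest = max(counts.values())
    rows = []
    # Build from the tallest stack of marks down to height 1.
    for height in range(tallest, 0, -1):
        row = "".join(" * " if counts[v] >= height else "   "
                      for v in range(lo, hi + 1))
        rows.append(row.rstrip())
    # Number line underneath the marks.
    rows.append("".join(f"{v:^3}" for v in range(lo, hi + 1)).rstrip())
    return rows

sample = [1, 2, 2, 3, 3, 3, 4]  # hypothetical values for illustration only
rows = dot_plot_rows(sample)
print("\n".join(rows))
```

Each column of marks is as tall as the number of times that value appears, so the peak of the distribution is visible at a glance.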

First Quartile (Q1) - For an ordered data set with median M, the first quartile is the median of the data values less than M. For the data set {2, 3, 6, 7, 10, 12, 14, 15, 22, 120}, the first quartile is 6.

Five-Number Summary - Minimum, lower quartile, median, upper quartile, maximum.
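One possible way to compute a five-number summary is sketched below. It assumes the "median of each half" rule used in the quartile definitions in this list, splitting the ordered data by position; the function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values below the middle position
    upper = ordered[half + n % 2:]  # values above the middle position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
summary = five_number_summary(data)
print(summary)
```

For this data set the summary is a minimum of 2, a lower quartile of 6, a median of 11, an upper quartile of 15, and a maximum of 120.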

Histogram - Graphical display that subdivides the data into class intervals and uses a rectangle to show the frequency of observations in each interval. For example, you might use intervals of 0-3, 4-7, 8-11, and 12-15.
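The counting step behind a histogram can be sketched as follows, using the example intervals from the definition and the data set from the quartile examples. The helper `histogram_counts` is a hypothetical name, and treating each interval as including both endpoints is an assumption that only suits whole-number data.

```python
def histogram_counts(values, intervals):
    """Count how many values fall in each inclusive (low, high) interval."""
    return {(lo, hi): sum(lo <= v <= hi for v in values) for lo, hi in intervals}

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
intervals = [(0, 3), (4, 7), (8, 11), (12, 15)]
counts = histogram_counts(data, intervals)

# Print a sideways text histogram: one '#' per observation.
for (lo, hi), count in counts.items():
    print(f"{lo:>2}-{hi:<2} | {'#' * count}")
```

The values 22 and 120 fall outside these four intervals, so a complete histogram of this data set would need more (or wider) intervals.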

Interquartile Range (IQR) - A measure of variation in a set of numerical data. The interquartile range is the distance between the first and third quartiles of the data set. Example: For the data set {1, 3, 6, 7, 10, 12, 14, 15, 22, 120}, the interquartile range is 15 - 6 = 9.
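The IQR example can be checked with a short sketch. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered data; `interquartile_range` is a hypothetical helper name.

```python
from statistics import median

def interquartile_range(values):
    """Return Q3 - Q1, with quartiles taken as medians of each half."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])        # median of the lower half
    q3 = median(ordered[(n + 1) // 2 :])  # median of the upper half
    return q3 - q1

data = [1, 3, 6, 7, 10, 12, 14, 15, 22, 120]
iqr = interquartile_range(data)
print(iqr)
```

Because the IQR depends only on the middle half of the data, the extreme value 120 has no effect on it.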

Mean Absolute Deviation (MAD) - A measure of variation in a set of numerical data, computed by adding the distances between each data value and the mean, then dividing by the number of data values. Example: For the data set {2, 3, 6, 7, 10, 12, 14, 15, 22, 120}, the mean is 21.1 and the mean absolute deviation is approximately 20 (19.96 before rounding).
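The MAD calculation described above can be sketched in a few lines. The function name `mean_absolute_deviation` is hypothetical; the steps follow the definition directly.

```python
def mean_absolute_deviation(values):
    """Average distance between each data value and the mean."""
    mean = sum(values) / len(values)
    distances = [abs(v - mean) for v in values]
    return sum(distances) / len(values)

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
mad = mean_absolute_deviation(data)
print(round(mad, 2))
```

For this data set the distances add up to 199.6, so the MAD is 199.6 / 10 = 19.96, which rounds to 20.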

Mean (X bar) - The mean (X bar, written x̄) is a common measure of center. To find the mean, add the data values in the data set and divide the sum by the number of data values.
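A minimal sketch of the mean, using the data set from the quartile examples in this list. The variable names are illustrative only. The second calculation shows how strongly a single extreme value can pull the mean.

```python
data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]

# Mean of all ten values: 211 / 10.
mean_all = sum(data) / len(data)

# Mean after leaving out the extreme value 120: 91 / 9.
without_120 = [v for v in data if v != 120]
mean_without_120 = sum(without_120) / len(without_120)

print(round(mean_all, 2), round(mean_without_120, 2))
```

The mean drops from 21.1 to about 10.11 when the value 120 is left out, which is why the mean alone can give a misleading picture of a "typical" value.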

Number of Peaks - Distributions can have few or many peaks. Distributions with one clear peak are called unimodal and distributions with two clear peaks are called bimodal. Symmetric unimodal distributions are sometimes called bell-shaped.

Outlier - Sometimes, distributions are characterized by extreme values that differ greatly from the other observations. These extreme values are called outliers. As a rule, an extreme value is considered to be an outlier if it is at least 1.5 interquartile ranges below the lower quartile (Q1), or at least 1.5 interquartile ranges above the upper quartile (Q3).
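The 1.5 interquartile range rule can be sketched as below. It assumes quartiles are computed as medians of the lower and upper halves of the ordered data, and `find_outliers` is a hypothetical helper name.

```python
from statistics import median

def find_outliers(values):
    """Return values at least 1.5 IQRs below Q1 or above Q3."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])
    q3 = median(ordered[(n + 1) // 2 :])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # "At least" 1.5 IQRs away, so values exactly on a fence count too.
    return [v for v in ordered if v <= low_fence or v >= high_fence]

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
outliers = find_outliers(data)
print(outliers)
```

Here Q1 = 6, Q3 = 15, and the IQR is 9, so the cutoffs are 6 - 13.5 = -7.5 and 15 + 13.5 = 28.5. Only 120 lies beyond them.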

Quantitative Variables - Numerical variables that represent a measurable quantity. For example, when we speak of the population of a city, we are talking about the number of people in the city - a measurable attribute of the city. Therefore, population would be a quantitative variable. Other examples could be scores on a set of tests, height and weight, temperature at the top of each hour, etc.

Median - The median (M) is a common measure of center. The median is the middle number in a set of data values that have been placed in order from highest to lowest or lowest to highest. To find the median of a set with an odd number of values, place the set in order and choose the middle number. To find the median of a set with an even number of values, place the set in order and average the two middle numbers.
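The two cases in the median definition (odd and even number of values) can be sketched like this. The function name `median_of` is hypothetical; Python's built-in `statistics.median` follows the same rule.

```python
def median_of(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the middle number
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle numbers

even_set = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]  # ten values
odd_set = [2, 3, 6, 7, 10, 12, 14, 15, 22]        # nine values
print(median_of(even_set), median_of(odd_set))
```

With ten values the two middle numbers are 10 and 12, so the median is 11. With nine values the middle number is 10.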

Range - The range is the difference between the greatest data value and the least data value.
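As a quick illustration, the range of the data set used in the quartile examples can be computed as follows. The helper name `data_range` is hypothetical.

```python
def data_range(values):
    """Difference between the greatest and the least data value."""
    return max(values) - min(values)

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
result = data_range(data)
print(result)
```

The range here is 120 - 2 = 118. Since it uses only the two most extreme values, a single unusual value can change it a great deal.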

Second Quartile (Q2) - The median value in the data set.

Shape - The shape of a distribution is described by symmetry, number of peaks, direction of skew, or uniformity.

Symmetry - A symmetric distribution can be divided at the center so that each half is a mirror image of the other.

Direction of Skew - Some distributions have many more observations on one side of the graph than the other. Distributions with a tail on the right toward the higher values are said to be skewed right and distributions with a tail on the left toward the lower values are said to be skewed left.
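One rough way to check for skew without a graph is to compare the mean with the median: a long tail tends to pull the mean toward it. This is only a rule of thumb with exceptions, not part of the definition above, and the helper name `skew_hint` is hypothetical.

```python
from statistics import mean, median

def skew_hint(values):
    """Rule of thumb only: the mean is usually pulled toward the longer tail."""
    m, med = mean(values), median(values)
    if m > med:
        return "possibly skewed right"
    if m < med:
        return "possibly skewed left"
    return "no skew suggested"

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
hint = skew_hint(data)
print(hint)
```

For this data set the mean (21.1) is well above the median (11) because of the value 120 in the right tail. A dot plot or histogram is still the better way to confirm the shape.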

Spread - The spread of a distribution refers to the variability of the data. If the data cluster around a single central value, the spread is smaller. The further the observations fall from the center, the greater the spread or variability of the set. Range, interquartile range, mean absolute deviation, and standard deviation all measure the spread of data.
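Standard deviation is named above without a formula. The sketch below assumes the usual population version: find the mean, average the squared distances from the mean, then take the square root. The function name `population_sd` is hypothetical; `pstdev` and `stdev` are the population and sample versions in Python's standard `statistics` module.

```python
import math
from statistics import pstdev, stdev

def population_sd(values):
    """Square root of the average squared distance from the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
sd = population_sd(data)

# The sample version divides by n - 1 instead of n, so it is slightly larger.
print(round(sd, 2), round(pstdev(data), 2), round(stdev(data), 2))
```

Calculators and software may report either version, so it is worth checking which one a tool uses before comparing results.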

Third Quartile (Q3) - For an ordered data set with median M, the third quartile is the median of the data values greater than M. Example: For the data set {2, 3, 6, 7, 10, 12, 14, 15, 22, 120}, the third quartile is 15.
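The third quartile example can be reproduced with the sketch below, which takes the median of the upper half of the ordered data. For comparison it also calls `statistics.quantiles` from Python's standard library, which interpolates between data values and so may report a different Q3 than the definition used here. The variable names are illustrative only.

```python
from statistics import median, quantiles

data = [2, 3, 6, 7, 10, 12, 14, 15, 22, 120]
ordered = sorted(data)
n = len(ordered)

# Definition used in this module: median of the values above the middle position.
upper_half = ordered[(n + 1) // 2 :]
q3_by_definition = median(upper_half)

# Library quartiles (default interpolating method); shown only for comparison.
q1_lib, q2_lib, q3_lib = quantiles(ordered, n=4)

print(q3_by_definition, q3_lib)
```

When technology gives a quartile that does not match a hand calculation, a different quartile convention is a common reason.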

Uniformity - When observations in a set of data are equally spread across the range of the distribution, the distribution is called a uniform distribution. A uniform distribution has no clear peaks.

IMAGES CREATED BY GAVS