AAD - Types of Variables and Graphical Representation of Quantitative Data Lesson


CAUTION: Data is a plural noun, so be careful to match the verb when using the term in a sentence (for example, "the data are," not "the data is").

A VARIABLE is any measured characteristic or attribute that differs for different subjects.
For example, if the height of 50 subjects were measured, then height would be a variable.

VARIABLE:  any characteristic that can be assigned a number or category.

There are two (2) kinds of variables:  numerical and categorical.

Numerical variables have a more formal name: QUANTITATIVE VARIABLES.  These measure a numerical characteristic such as weight, height, income, height of trees, number of students, or dollars received as tips.  Sometimes these values can be converted into a category, such as the $10,000 - $24,999 income category or 11th grade student.  It is very important to distinguish between the types of variables, because all future decisions are based on this first step.

CATEGORICAL VARIABLES also have a formal name, QUALITATIVE VARIABLES, and record a category designation such as birth month, shirt size, type of soft drink, color of eyes, type of job, income category, or grade level.  A special case is the "binary" variable, where ONLY 2 possible categories exist: yes/no, true/false, male/female, etc.
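The distinction above can be sketched in a few lines of Python. This is only a rough illustration: the function name variable_type and the sample values are hypothetical, and the rule it applies (numbers are quantitative, anything else is a category) is a simplifying assumption. Numeric codes that stand for categories, such as grade level, would still need human judgment.

```python
from numbers import Number

def variable_type(values):
    """Rough guess at the type of a variable from its recorded values.

    Assumption for this sketch: values stored as numbers are quantitative,
    and everything else (text labels, True/False) is categorical.
    """
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        return "quantitative"
    if len(set(values)) == 2:
        return "categorical (binary)"
    return "categorical"

print(variable_type([62.5, 70, 68]))          # heights in inches
print(variable_type(["yes", "no", "yes"]))    # has black hair?
print(variable_type(["S", "M", "L", "M"]))    # shirt size
```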

Consider your extended family members as "observational units," that is, persons or things to which a number or category can be assigned.  Hair color is a legitimate variable.  The number of relatives with blonde hair is NOT a variable, unless we have people changing hair color, because it is a single count for the whole family rather than a value recorded for each relative.  Height of the shortest relative is NOT a variable for the same reason.  Whether or not a person has black hair IS a categorical (qualitative) binary variable.  Other binary variables would be gender or political identity, considering our two-party system of government.  Age of the teacher is NOT a variable.  The number of states that a student has visited is quantitative, as are the heights of students.

NOTE:  IF the observational units had been all the classes at a particular school, then the number of students with blonde hair would become a variable.

Data may be UNIVARIATE, meaning only one (1) measurement on each object is recorded, such as the height of a child, or BIVARIATE, meaning two (2) measurements on each object are recorded, such as the height AND weight of a child.
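As a small illustration of univariate versus bivariate data, the sketch below stores one record per child and then pulls out either one measurement or two. The names, field labels, and numbers are made up for this example.

```python
# Hypothetical records: each child is one observational unit.
children = [
    {"name": "Ana", "height_in": 48, "weight_lb": 52},
    {"name": "Ben", "height_in": 51, "weight_lb": 60},
    {"name": "Cam", "height_in": 46, "weight_lb": 49},
]

# Univariate: one measurement per child.
heights = [c["height_in"] for c in children]

# Bivariate: two measurements per child, kept together as a pair.
height_weight = [(c["height_in"], c["weight_lb"]) for c in children]

print(heights)
print(height_weight)
```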

The data type will determine the type of display used.

The distribution of a variable tells what values the variable takes and how often it takes them.  Examining data in order to describe their main features is called "Exploratory Data Analysis."  We should always begin by examining each variable by itself with a GRAPH, VERBAL DESCRIPTION, and NUMERICAL SUMMARY.  For more than one variable, we discuss relationships between or among the variables.
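The idea of a distribution (what values occur and how often) can be shown with a short Python sketch. The data set and the helper name distribution are assumptions made for this illustration.

```python
from collections import Counter

# Hypothetical data: number of states visited by each of 12 students.
states_visited = [3, 5, 2, 3, 8, 5, 3, 1, 2, 5, 3, 12]

def distribution(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

for value, count in distribution(states_visited):
    print(f"{value:>2} states: {count} student(s)")
```

Each printed row pairs a value with the number of students who reported it, which is exactly the information a graph of the distribution displays.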

Quantitative data are displayed with different types of graphs.  DOT PLOTS and HISTOGRAMS are common types.

[Figures: dot plot; histogram]
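For readers who want to see how these two displays are built, here is a rough text-only sketch in Python. The scores, the function names, and the choice of bins (width 10, starting at 60) are all assumptions for this example. The dot plot is drawn sideways, one row per value, rather than above a number line.

```python
from collections import Counter

# Hypothetical data: quiz scores for 15 students.
scores = [62, 65, 65, 70, 71, 71, 71, 74, 78, 78, 80, 83, 85, 91, 98]

def dot_plot_rows(values):
    """One text row per distinct value, with one dot per observation."""
    counts = Counter(values)
    return [f"{v:>3} " + "." * counts[v] for v in sorted(counts)]

def histogram_counts(values, start, width, n_bins):
    """Count observations in equal-width bins.

    Bin i covers start + i*width up to, but not including, start + (i+1)*width.
    Values outside all bins are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

print("\n".join(dot_plot_rows(scores)))

bins = histogram_counts(scores, start=60, width=10, n_bins=4)
for i, count in enumerate(bins):
    low = 60 + 10 * i
    print(f"{low}-{low + 9}: " + "#" * count)
```

The dot plot keeps every distinct score on its own row, while the histogram groups scores into intervals, so the same data can look smoother or rougher depending on the bin width chosen.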

STEM PLOTS, also called stem-and-leaf plots, resemble sideways histograms.  They should be used only for small data sets, and the number of stems matters: too few stems hide the pattern and too many stems dilute it.  Stem plots can be typed on a keyboard, using the vertical bar '|' to divide stems from leaves.  Stem plots have an advantage over dot plots and histograms in that they preserve the original data values.

[Figure: stem plot]
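A keyboard-style stem plot can also be produced with a short Python sketch. The ages below are hypothetical, and the function assumes two-digit whole numbers so that the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

# Hypothetical data: ages of 12 relatives (two-digit whole numbers).
ages = [23, 27, 31, 34, 34, 38, 42, 45, 47, 51, 58, 65]

def stem_plot_rows(values):
    """Split each value into a tens-digit stem and a ones-digit leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Include every stem from smallest to largest, even if it has no leaves,
    # so that gaps in the data remain visible.
    stems = range(min(leaves), max(leaves) + 1)
    return [
        f"{s} | " + " ".join(str(leaf) for leaf in leaves.get(s, []))
        for s in stems
    ]

print("\n".join(stem_plot_rows(ages)))
```

Reading a row such as "3 | 1 4 4 8" back as 31, 34, 34, 38 shows how the original data values are preserved.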

When looking at the data, some characteristics are readily observable, such as symmetry or non-symmetry.  Symmetric distributions will have two sides that are approximate mirror images of each other.  Non-symmetric distributions may have long tails on either side of "center" and are said to be "skewed right" if the tail is long on the right or "skewed left" if the tail is long on the left.

[Figures: distribution skewed right; distribution skewed left]
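The tail-length idea can be turned into a rough numerical check. The sketch below is only a heuristic, not a standard test: it uses the median as "center," compares how far the smallest and largest values lie from it, and applies an arbitrary cutoff ratio of 1.5 chosen for this illustration. A single unusual value can change the label, so a graph should still be the first tool. The function name and data are hypothetical.

```python
from statistics import median

def tail_direction(values, ratio=1.5):
    """Label a data set by which tail extends farther from the median.

    The default ratio of 1.5 is an arbitrary cutoff for this sketch.
    """
    center = median(values)
    left_tail = center - min(values)
    right_tail = max(values) - center
    if right_tail > ratio * left_tail:
        return "skewed right"
    if left_tail > ratio * right_tail:
        return "skewed left"
    return "roughly symmetric"

tips = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9, 15]  # hypothetical tip amounts in dollars
print(tail_direction(tips))
```

Here the median tip is 4 dollars, the left tail reaches only 2 dollars below it, and the right tail reaches 11 dollars above it, so the long tail is on the right.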

Categorical data and their graphs are addressed in a separate unit.


IMAGES CREATED BY GAVS