ECD - Exploring Categorical Data Module Overview
Exploring Categorical Data Overview
Introduction
Since middle school you have studied basic probability and statistics, and you may recall making pie charts and bar graphs. These graphs display data from categorical variables, and they remain appropriate in advanced statistics. Although this unit is shorter than the previous ones, its contents are no less important: categorical variables and their analysis matter a great deal in the real world. We visit them here and will conclude the course by revisiting them in the last unit.

You already know that a pie chart shows how a "whole" is divided into categories: the area of each wedge corresponds to the percent of the whole in that category, and the entire circle represents 100%. By contrast, a bar graph uses bars to represent the count in each category of the variable. Both of these graphs have limitations. You will learn about another type of display that is very powerful in helping to analyze categorical data and form sophisticated conclusions. So let's begin!
Essential Questions
- What graph options exist for displaying categorical data?
- Why are percents important when displaying and analyzing categorical data?
- What is meant by "conditional"?
Key Terms
The following key terms will help you understand the content in this module.
Contingency table - displays counts and sometimes percentages for two or more categorical variables.
Marginal distributions - row and column totals converted into percentages form the marginal distributions.
Conditional distributions - limit consideration to a specific row or column of the table (or to a specific row and column) and show the percentages within it.
Simpson’s Paradox - averages taken across different groups can appear contradictory and misleading; a pattern seen within each group can reverse when the groups are combined.
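To make these key terms concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the counts are invented and do not come from this module, and the function names (`marginal_rows`, `marginal_columns`, `conditional_on_row`, `combined_rate`) are our own. The first part builds a small contingency table and computes its marginal and conditional distributions; the second part shows one made-up case of Simpson's Paradox.

```python
# All counts below are hypothetical, invented only to illustrate the key terms.

# Contingency table: counts for two categorical variables
# (grade level by preferred graph type).
table = {
    "Junior": {"Pie": 30, "Bar": 70},
    "Senior": {"Pie": 50, "Bar": 50},
}

def grand_total(table):
    """Total count over every cell of the table."""
    return sum(sum(row.values()) for row in table.values())

def marginal_rows(table):
    """Marginal distribution of the row variable: row totals as percents."""
    total = grand_total(table)
    return {name: 100 * sum(row.values()) / total for name, row in table.items()}

def marginal_columns(table):
    """Marginal distribution of the column variable: column totals as percents."""
    total = grand_total(table)
    column_counts = {}
    for row in table.values():
        for column, count in row.items():
            column_counts[column] = column_counts.get(column, 0) + count
    return {column: 100 * count / total for column, count in column_counts.items()}

def conditional_on_row(table, row_name):
    """Conditional distribution: percents within one specific row only."""
    row = table[row_name]
    row_total = sum(row.values())
    return {column: 100 * count / row_total for column, count in row.items()}

print(marginal_rows(table))                 # {'Junior': 50.0, 'Senior': 50.0}
print(marginal_columns(table))              # {'Pie': 40.0, 'Bar': 60.0}
print(conditional_on_row(table, "Junior"))  # {'Pie': 30.0, 'Bar': 70.0}

# Simpson's Paradox: hypothetical (successes, attempts) for methods A and B.
results = {
    "Group 1": {"A": (9, 10), "B": (80, 100)},
    "Group 2": {"A": (30, 100), "B": (2, 10)},
}

def success_rate(successes, attempts):
    """Percent of attempts that succeeded."""
    return 100 * successes / attempts

def combined_rate(results, method):
    """Success rate for one method after pooling all groups together."""
    successes = sum(group[method][0] for group in results.values())
    attempts = sum(group[method][1] for group in results.values())
    return success_rate(successes, attempts)

for name, group in results.items():
    print(name, success_rate(*group["A"]), success_rate(*group["B"]))
print("Combined", combined_rate(results, "A"), combined_rate(results, "B"))
```

In this made-up example, method A has the higher success rate within each group, yet method B has the higher rate once the groups are combined, because the group sizes are very unequal. Comparing the conditional percentages within each group, rather than only the combined totals, is what exposes the reversal.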
IMAGES CREATED BY GAVS