MM - Distributions and Histograms Lesson

Let's briefly revisit histograms. The distribution of a variable tells us what values the variable takes and how often it takes each one. This is different from a scatterplot, which plots two variables against each other: a distribution shows the frequency, or percentage, of the values of a single variable. We can use both dot plots and histograms to show a distribution. In this lesson, we will focus on histograms and learn how to interpret the data they display.
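To make "what values and how often" concrete, here is a minimal sketch that tallies a small data set into frequencies and percentages. The quiz scores are invented for illustration and do not come from any of the sources in this lesson.

```python
from collections import Counter

# Hypothetical quiz scores, invented purely for illustration.
scores = [7, 8, 8, 9, 7, 8, 10, 9, 8, 7]

counts = Counter(scores)   # maps each value to how often it occurs
total = len(scores)

# Print each value with its frequency and its share of all observations.
for value in sorted(counts):
    share = 100 * counts[value] / total
    print(f"{value}: {counts[value]} times ({share:.0f}%)")
```

A histogram is a picture of exactly this kind of tally, usually with nearby values grouped into bins.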

Here are some examples.  As you look at each one, try to summarize the data in one or two sentences using words like more, less, compared to, portion, percentages, quantities.

Histogram Salaries graphical representation

²Source: PennState Eberly College of Science

Histogram Health Spending sample

¹Source: KFF analysis of 2019 Medical Expenditure Panel Survey

Health spending by group Histogram

¹Source: KFF analysis of Medical Expenditure Panel Survey 2019 data

Once you understand the purpose of the data, you can begin to analyze it. This video presents the procedure used to describe a distribution.

Now let's evaluate each histogram from above.

Figure 3.5 Histogram (Salaries):

²As you can see, the histogram is right-skewed: most of the salaries are concentrated at the lower end, and a long tail stretches toward the higher values. The very large salary of $110,000 is largely responsible for the right skew. With right-skewed histograms, the mean will be greater than the median, because the mean is sensitive to the large salary of $110,000 and is pulled in the direction of that unusually large observation. In contrast, the median, which is the middle value of the data set, is resistant to extreme observations because their exact size is not used to determine its value. Table 3.3 summarizes the link between the two measures of center and histogram shape.
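The pull of a single large value on the mean can be checked with a short sketch. The salary list here is hypothetical and is not the data behind Figure 3.5; only the $110,000 value is taken from the explanation above.

```python
from statistics import mean, median

# Hypothetical salaries, NOT the data behind Figure 3.5.
# Only the $110,000 value is taken from the lesson text.
salaries = [28_000, 30_000, 32_000, 35_000, 36_000,
            38_000, 40_000, 45_000, 110_000]

# The same list with the unusually large salary removed.
without_top = [s for s in salaries if s != 110_000]

mean_all, median_all = mean(salaries), median(salaries)
mean_rest, median_rest = mean(without_top), median(without_top)

print(f"with $110,000:    mean = {mean_all:,.0f}, median = {median_all:,.0f}")
print(f"without $110,000: mean = {mean_rest:,.0f}, median = {median_rest:,.0f}")
```

In this made-up list, dropping the one large salary moves the mean by several thousand dollars while the median barely changes, which is the behavior the paragraph above describes.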

Histogram Shape

There are websites that will automatically generate histograms.  Complete a web search and test out a few different histogram creators/generators.  Use the data below as an example:

Here are the weights (in pounds) of 20 steers on an experimental feed diet:
183   140   136   142   172   153   172   156   70   167   192   166   113   171   159   129   112   135   174   155
How would you describe the overall shape of this data?
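If you would rather build the histogram yourself than use a website, the sketch below bins the steer weights and prints a text histogram along with the mean and median. The bin width of 20 pounds and the starting edge of 60 are assumptions chosen for illustration; try other values and see how the picture changes.

```python
from statistics import mean, median

# Weights (in pounds) of the 20 steers listed above.
weights = [183, 140, 136, 142, 172, 153, 172, 156, 70, 167,
           192, 166, 113, 171, 159, 129, 112, 135, 174, 155]

BIN_WIDTH = 20   # assumed bin width, chosen for illustration
START = 60       # assumed lower edge of the first bin

# Count how many weights fall into each bin.
n_bins = (max(weights) - START) // BIN_WIDTH + 1
counts = [0] * n_bins
for w in weights:
    counts[(w - START) // BIN_WIDTH] += 1

# Print one row per bin, with one star per steer.
for i, c in enumerate(counts):
    low = START + i * BIN_WIDTH
    print(f"{low:3d}-{low + BIN_WIDTH - 1:3d} | {'*' * c}")

print("mean:  ", mean(weights))
print("median:", median(weights))
```

Compare the printed mean and median, and look at where the stars pile up and which way the tail points, then check your description of the shape against Table 3.3.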

Average health spending by race/ethnicity, 2019:

¹Overall, White people in the U.S. have significantly higher average total health spending than other race and ethnicity groups. People who identified as either Asian or Hispanic had the lowest average health spending in every age category. Differences in health spending by race may be driven by a variety of factors, including health status, insurance coverage, age distribution, and access to care. People of color are younger, on average, than White people, Hispanic people are more likely to be uninsured, and Hispanic and Black people are more likely to report delaying or going without medical care due to costs. Private health plans tend to pay higher prices for services than public plans do. About three in four Asian and White people are enrolled in private health plans at some point in a given year, while about one in two Black and Hispanic people are covered by private plans at some point in a given year. Immigrants have lower health spending on average than those born in the United States. Asian and Hispanic people have the highest shares of foreign-born populations at about 66% and 33%, respectively.

Average total health spending, by insurance status and sex, 2019:

¹People who lack insurance all year have much lower total health expenditures on average in all age categories than people who have insurance for part of the year or the entire year. This could be because people who are healthier may be more likely to go without insurance, or because people who do not have insurance are more likely to go without needed medical care.

Among children, there is not a significant difference in health spending between sexes. Average health spending increases throughout adulthood for both men and women, but at somewhat different rates. Women have higher health spending than men in their 20s, 30s, and early 40s, largely due to pregnancy and delivery-related care. Spending differences between men and women are not statistically significant in older age groups.

³Self-Assessment SOCS (Shape, Outliers, Center, Spread):

Citations:

¹How do health expenditures vary across the population? (2022, July 6). Peterson-KFF Health System Tracker.

²3.3 - Numbers: Summarizing Measurement Data | STAT 100. (n.d.). PennState: Statistics Online Courses. Retrieved November 4, 2022.

³Russell, A. B. J. M. (2021, January 11). 2.6 Measures of Center – Significant Statistics. Pressbooks.

[CC BY-NC-SA 4.0] UNLESS OTHERWISE NOTED | IMAGES: LICENSED AND USED ACCORDING TO TERMS OF SUBSCRIPTION - INTENDED ONLY FOR USE WITHIN LESSON.