Sources of Variability
Adapted from Course materials (III.C Student Activity Sheet 9 & 10) for AMDM developed under the leadership of the Charles A. Dana Center, in collaboration with the Texas Association of Supervisors of Mathematics and with funding from Greater Texas Foundation.
A systematic discrepancy between a population parameter and a sample statistic is known as statistical bias, which can result from many different sources. Two broad categories of statistical bias are a biased sampling method and a biased statistic.
What would happen if you collected all the potato chip bags from a single vending machine? A biased sample might result: the chips could all have been packaged at the same time or shipped during the same work shift, so they would not reflect the variation among all bags produced.
Sometimes researchers spend vast resources (time, money, effort) to get a great sample and then still end up with a biased statistic. For example, something could be wrong with the data collection method. What could happen after selecting the sample (for example, potato chip bags) and before calculating the statistic that could result in a biased statistic?
If the sampling method has in fact introduced bias, one reason might be that the sample was not representative of the population.
Non-representative sampling occurs when the sample does not reflect the differences within a population. For example, the students in this class are mainly seniors in high school; therefore, they do not represent all high school students. At most schools, if only the football team or only the volleyball team were surveyed, the sample would be all male or all female.
Undercoverage occurs when entire segments of the population are missed. In the class example, freshmen, sophomores, and juniors would be missed. In the broad population, homeless people, people in prisons, people without phones, and so forth might be missed.
Several errors are possible in implementing the chosen sampling method. Perhaps the strata were not identified properly (stratified sampling), or the clusters were not well defined (cluster sampling), creating overlapping subsets of the population and the possibility that a response is counted more than once.
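The effect of undercoverage in the class example can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population, the grade-level averages, and the quantity measured (weekly study hours) are all made up for illustration and do not come from the lesson.

```python
import random
import statistics

def simulate_undercoverage(seed=0, n_per_grade=500, sample_size=100):
    """Compare a random sample of all students with a seniors-only sample."""
    rng = random.Random(seed)
    # Hypothetical population: average weekly study hours differ by grade level.
    grade_means = {"freshman": 4.0, "sophomore": 5.0, "junior": 6.0, "senior": 7.0}
    population = [
        (grade, rng.gauss(mean, 1.5))
        for grade, mean in grade_means.items()
        for _ in range(n_per_grade)
    ]
    all_hours = [hours for _, hours in population]
    senior_hours = [hours for grade, hours in population if grade == "senior"]

    return {
        "population_mean": statistics.mean(all_hours),
        # Every student has the same chance of being chosen.
        "random_sample_mean": statistics.mean(rng.sample(all_hours, sample_size)),
        # Freshmen, sophomores, and juniors are never chosen (undercoverage).
        "seniors_only_mean": statistics.mean(rng.sample(senior_hours, sample_size)),
    }

result = simulate_undercoverage()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this made-up population, the seniors-only sample overestimates the population mean no matter how many seniors are surveyed, while the random sample misses only by chance.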
Biased statistics can result from a variety of problems during the data collection process.
Variability
Variability can be caused by natural variability or induced variability.
Natural variability occurs with no human intervention or error. For example, temperatures vary from hour to hour, day to day, and location to location, strength varies from person to person, and grades on a given test vary from person to person.
Induced variability occurs when outside forces act to change the variability that naturally occurs. People affect the environment, which can cause temperatures to vary more, less, or differently than before. People work out or eat differently, which affects the variation in strength. Test grades are influenced by new instructional techniques or changing study patterns.
Experimental studies can suffer from non-representative samples and undercoverage. What are some reasons for deliberately using a non-representative sample in an experimental study? What problems could result?
Rather than spending a great deal of time and money to ensure a representative sample, researchers often use the techniques listed below to try to eliminate any statistical bias introduced by the sampling method.
Self-Assessment: Bias
Think about how each method can reduce sampling method bias and thus increase the accuracy of a study's results.
- Random assignment of treatments - This approach helps ensure that the treatment group and the control group are similar to each other. Therefore, if different outcomes are detected, researchers can be more confident that the result is due to the treatment, not to differences in the participants.
- Blind/double-blind studies - Participants sometimes inaccurately report on their improvement. Likewise, researchers sometimes subconsciously try harder to find improvement in treatment group participants than in control group participants. Conducting the study blind, or preferably double-blind, helps ensure more accurate reports.
- Use of control groups - This approach is related to the first two techniques, as all three work together to improve the accuracy of results. Randomly assigning some participants to receive a fake treatment helps control for the placebo effect, hence the name control group.
- Replication - If a study is run again with a new sample (for example, a larger population, a different ethnic group, or a different gender) and still yields similar results, the researchers can feel more confident that the treatment is working.
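Random assignment to a treatment group and a control group can be sketched in a few lines. The function name `randomly_assign` and the participant labels are hypothetical, chosen only for illustration; a real study would follow its own protocol.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant labels P01 through P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=42)
print("treatment:", sorted(groups["treatment"]))
print("control:  ", sorted(groups["control"]))
```

Because chance alone decides who lands in each group, neither the researcher nor the participants can steer particular kinds of people toward the treatment.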
Blocking is another method used to reduce variability. When groups of experimental units are similar, it is often a good idea to gather them together into blocks. By blocking we isolate the variability attributable to the differences between the blocks so that we can see the differences caused by the treatments more clearly.
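A rough simulation can show why blocking makes treatment differences easier to see. Everything here is an assumption made for illustration: three blocks with very different baseline values, ten units per block, and a true treatment effect of 2. The sketch repeats the experiment many times and compares how much the estimated effect varies with and without blocking.

```python
import random
import statistics

def estimate_effect(use_blocks, rng, true_effect=2.0):
    """Run one simulated experiment and return the estimated treatment effect."""
    # Hypothetical units: 3 blocks of 10 units, each block with its own baseline.
    block_baselines = [10.0, 20.0, 30.0]
    units = [
        (block, base + rng.gauss(0, 1.0))
        for block, base in enumerate(block_baselines)
        for _ in range(10)
    ]
    if use_blocks:
        # Randomize within each block: half of every block gets the treatment.
        treated = set()
        for block in range(len(block_baselines)):
            members = [i for i, (b, _) in enumerate(units) if b == block]
            treated.update(rng.sample(members, len(members) // 2))
    else:
        # Ignore the blocks: any half of all units gets the treatment.
        treated = set(rng.sample(range(len(units)), len(units) // 2))

    treated_outcomes = [v + true_effect for i, (_, v) in enumerate(units) if i in treated]
    control_outcomes = [v for i, (_, v) in enumerate(units) if i not in treated]
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)

def spread_of_estimates(use_blocks, trials=500, seed=1):
    """Standard deviation of the estimated effect across repeated experiments."""
    rng = random.Random(seed)
    estimates = [estimate_effect(use_blocks, rng) for _ in range(trials)]
    return statistics.stdev(estimates)

blocked_spread = spread_of_estimates(use_blocks=True)
unblocked_spread = spread_of_estimates(use_blocks=False)
print(f"spread with blocking:    {blocked_spread:.2f}")
print(f"spread without blocking: {unblocked_spread:.2f}")
```

With blocking, each block contributes equally to both groups, so the block-to-block differences cancel out of the comparison. Without it, an unlucky split can put more high-baseline units in one group, and the estimated effect swings much more from one run to the next.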
Sources of Variability Citation:
1. Al., E. O. I. B. &. (n.d.-b). Appendix B: Practice Tests (1-4) and Final Exams | Introduction to Statistics. Retrieved October 14, 2022, from Lumen Learning.
[CC BY-NC-SA 4.0] UNLESS OTHERWISE NOTED | IMAGES: LICENSED AND USED ACCORDING TO TERMS OF SUBSCRIPTION - INTENDED ONLY FOR USE WITHIN LESSON.