IRV - Inference for Related Variables Module Overview

Test of Significance Overview

Introduction

All of the significance testing thus far has concerned distributions of quantitative data. In this unit you will learn to perform significance tests on categorical data presented in a two-way table. The chi-square distribution has many applications, depending on the form of the data. A chi-square test can determine whether categorical variables are independent, whether the data fit a "claim" about expected proportions, or whether the data suggest a real difference between categories. We will continue to use the calculator to assist with these tests.
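As a preview of the mechanics, the independence version of the test can be sketched in plain Python. The two-way table below is hypothetical, and in the course the calculator or software would supply the p-value; here we only compute the statistic and degrees of freedom from their definitions:

```python
# A hypothetical 2x3 two-way table of counts (values invented for illustration)
observed = [
    [30, 20, 10],
    [20, 30, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count for each cell: (row total * column total) / grand total
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over every cell
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r-by-c table: (r - 1)(c - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_sq, 3), df)
```

A large statistic relative to a chi-square distribution with df degrees of freedom is evidence against independence.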
The final component we must consider is inference for the slope. You will recall that performing linear regression on paired data often produced the equation of a line as a reasonable prediction model. Now, however, we know it is necessary to construct a confidence interval around the slope of the model, since we are never 100% certain of its true value. There is also a hypothesis-testing procedure to determine whether two variables are related. Independence has been, and will remain, an important issue in statistics.
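The slope-inference idea can be sketched by hand in Python. The five data points are made up, and the critical value t* = 3.182 (for 95% confidence with df = n - 2 = 3) is taken from a t-table rather than computed:

```python
# Hypothetical paired data (n = 5 points, so df = n - 2 = 3)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope b1 = Sxy / Sxx and intercept b0
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the slope: s / sqrt(Sxx), where s is the
# residual standard error sqrt(SSE / (n - 2))
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s = (sse / (n - 2)) ** 0.5
se_b1 = s / s_xx ** 0.5

# t statistic for H0: true slope = 0
t_stat = b1 / se_b1

# 95% confidence interval for the slope (t* = 3.182 for df = 3, from a table)
t_star = 3.182
lower, upper = b1 - t_star * se_b1, b1 + t_star * se_b1

print(round(b1, 3), round(se_b1, 4), round(t_stat, 3))
```

Note that for this small invented data set the interval contains zero, so the sample would not provide convincing evidence of a linear relationship.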

Essential Questions

  • What procedures are available for testing results obtained from categorical data?
  • How can we test whether two categorical variables are related or not?
  • Is there a procedure for significance testing on measures of spread?
  • How can we create a confidence interval for the slope of a linear regression prediction model?
  • How do we test the accuracy of the slope, and what are the implications for the relationship between the variables?

Key Terms

The following key terms will help you understand the content in this module.

Chi-square statistic- found by summing the chi-square components over all cells

Expected count- each model gives a hypothesized proportion for each cell; the expected count is the total number of observations times this proportion

Observed count- the actual count observed in each cell

Components of chi-square- (O - E)^2 / E, computed for each cell of the contingency table

R by c table- the dimensions (rows by columns) of a contingency table, used to determine the degrees of freedom as the product (r - 1)(c - 1)

Goodness of fit test- tests whether the distribution of counts in one categorical variable matches the distribution predicted by a stated model
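For instance, a goodness-of-fit statistic can be computed directly from the definitions above. The counts and claimed proportions here are invented for illustration:

```python
# Hypothetical claim: a bag of candies is 50% red, 30% green, 20% blue
observed = [55, 23, 22]           # observed counts (invented)
claimed = [0.50, 0.30, 0.20]      # hypothesized proportions from the claim
total = sum(observed)

# Expected count for each category = total observations * claimed proportion
expected = [total * p for p in claimed]

# Chi-square components (O - E)^2 / E, summed to give the statistic
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(components)

# Degrees of freedom for goodness of fit: (number of categories) - 1
df = len(observed) - 1

print(round(chi_sq, 3), df)
```

The statistic would then be compared against a chi-square distribution with df degrees of freedom to judge whether the observed counts fit the claim.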

Independence test- examines the distribution of counts for one group of individuals classified according to two variables; the conclusion is that the variables are associated (not independent) or not associated (independent)

Homogeneity test- compares the distribution of counts for two or more groups on the same categorical variable

T-test for regression slope- tests the null hypothesis that the true value of the slope is zero against an alternative hypothesis; a slope of zero suggests a lack of linear relationship between the two variables

IMAGES CREATED BY GAVS