ECI - Beginning Thoughts on Estimation and Confidence Intervals Lesson
In the year 2000, the President of the University of California suggested dropping SAT reasoning scores as a factor in admission decisions and replacing them with content tests. The underlying claim is that SAT scores are not good indicators of student success. An SRS (simple random sample) of 500 California high school seniors took the SAT, and their mean MATH score was 461.
The Law of Large Numbers says that, for a sample this large, the sample mean is likely to be close to the population mean. So we conclude that the population mean (the parameter) is somewhere around 461.
"Somewhere around" becomes more precise if we ask the question "How would the sample mean vary if we took MANY samples of the same size from the population?"
We need to revisit some previous ideas and note that this particular problem involves MEANS.
The Central Limit Theorem tells us that, for large samples, the sample mean is approximately Normally distributed.
Mean of the sampling distribution = the mean of the population.
Standard deviation of the sampling distribution = the standard deviation of the population divided by the square root of n (the number of observations in the sample).
Remember that this standard deviation goes by the name SE (standard error).
Taking repeated samples of size 500 (from our SAT example) might produce results such as:
If the population standard deviation of SAT Math scores is known to be 100, we can calculate the standard deviation of the sample means as 100/sqrt(500) = 4.472, rounded to 4.5.
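The arithmetic above can be checked with a short Python sketch. The variable names are illustrative only; the values 100 and 500 come from the SAT example.

```python
import math

population_sd = 100   # known population standard deviation of SAT Math scores
n = 500               # sample size

# Standard error: the standard deviation of the sample mean
standard_error = population_sd / math.sqrt(n)

print(round(standard_error, 3))  # 4.472
print(round(standard_error, 1))  # 4.5
```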
The standard deviation of the approximately Normal sampling distribution of the mean SAT Math score is 4.5. In plain English this means that about 68% of all sample means are within 1 standard deviation (4.5 points) of the population mean. Stated another way, we could say "the sample mean will be within 4.5 points of the true population mean in about 68% of the samples." The entire width of this interval surrounding the mean is 9 points. Confidence intervals can be expressed in two forms: estimate ± margin of error, or interval notation stating the lower and upper bounds. For our sample, that is 461 ± 4.5, or (456.5, 465.5).
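The 68% statement can be explored with a small simulation. This is only a sketch under stated assumptions: it pretends the population is Normal with a true mean of 461 (in reality the true mean is unknown, and 461 is just our one sample's result), and it uses the population standard deviation of 100 from the example. The names POP_MEAN, NUM_SAMPLES, and so on are made up for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN = 461      # ASSUMED for the simulation only; the true mean is unknown
POP_SD = 100        # population standard deviation from the example
N = 500             # size of each sample
NUM_SAMPLES = 1000  # how many repeated samples to draw
SE = POP_SD / N ** 0.5  # standard error, about 4.47

# Draw many samples of size N and record each sample mean
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

# Count the sample means that land within 1 SE of the population mean
within_one_se = sum(abs(m - POP_MEAN) <= SE for m in sample_means)
share_within = within_one_se / NUM_SAMPLES

print(f"SD of sample means: {statistics.stdev(sample_means):.2f}")
print(f"Share within 1 SE:  {share_within:.2%}")
```

If the reasoning in the lesson holds, the first number should come out near 4.5 and the second near 68%, with some run-to-run variation.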
Confidence intervals should be expressed in BOTH forms. The 'estimate ± margin of error' form preserves the estimate obtained from the sample and then qualifies that value with the error expected for the sample size. It gives the answer AND explains where it came from at the same time. This form also allows us to explore what would change if, for example, a specific margin of error were desired, a specific sample size were available, or a specific confidence level were desired. We can easily manipulate that form using the formula that we will learn. Software packages and technology report the answer in interval form. We are expected to be able to draw the same information from the interval form as from the other, that is, to find the midpoint (the estimate) and the margin of error (the distance from the midpoint to either endpoint).
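Moving between the two forms is simple arithmetic, sketched here in Python. The function names to_interval and from_interval are hypothetical helpers written for this illustration, not part of any software package.

```python
def to_interval(estimate, margin_of_error):
    """Turn 'estimate ± margin of error' into (lower, upper) bounds."""
    return (estimate - margin_of_error, estimate + margin_of_error)


def from_interval(lower, upper):
    """Recover the estimate (midpoint) and margin of error (half the width)."""
    estimate = (lower + upper) / 2
    margin_of_error = (upper - lower) / 2
    return estimate, margin_of_error


# SAT example: sample mean 461, margin of error 4.5 (one standard error)
lower, upper = to_interval(461, 4.5)
print(lower, upper)                 # 456.5 465.5
print(from_interval(lower, upper))  # (461.0, 4.5)
```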
IMAGES CREATED BY GAVS