to make statistical inferences about the population mean, $E(Y)$.
1.5 Interval Estimation
We have already seen that the sample mean, $m_Y$, is a good point estimate of the population mean, $E(Y)$ (in the sense that it is unbiased; see Section 1.4). It is also helpful to know how reliable this estimate is, that is, how much sampling uncertainty is associated with it. A useful way to express this uncertainty is to calculate an interval estimate or confidence interval for the population mean, $E(Y)$. The interval should be centered at the point estimate (in this case, $m_Y$), and since we are probably equally uncertain whether the population mean is lower or higher than this estimate, the interval should extend the same distance on either side of the point estimate. We quantify this uncertainty with a number called the “margin of error.” Thus, the confidence interval is of the form “point estimate $\pm$ margin of error” or “(point estimate $-$ margin of error, point estimate $+$ margin of error).”
We can obtain the exact form of the confidence interval from the t-version of the central limit theorem, where $(m_Y - E(Y))/(s_Y/\sqrt{n})$ has an approximate t-distribution with $n-1$ degrees of freedom. In particular, suppose that we want to calculate a 95% confidence interval for the population mean, $E(Y)$, for the home prices example. In other words, we want an interval such that there is an area of 0.95 between its two endpoints (with an area of 0.025 to the left of the interval in the lower tail, and an area of 0.025 to the right of the interval in the upper tail). Let us consider just one side of the interval first. Since 2.045 is the 97.5th percentile of the t-distribution with 29 degrees of freedom (see the t-table in Section 1.4.2), then
$$
\begin{aligned}
\Pr(t_{29} < 2.045) &= 0.975 \\
\Pr\!\left(\frac{m_Y - E(Y)}{s_Y/\sqrt{n}} < 2.045\right) &= 0.975 \\
\Pr\!\left(m_Y - E(Y) < 2.045\,(s_Y/\sqrt{n})\right) &= 0.975 \\
\Pr\!\left(-E(Y) < -m_Y + 2.045\,(s_Y/\sqrt{n})\right) &= 0.975 \\
\Pr\!\left(E(Y) > m_Y - 2.045\,(s_Y/\sqrt{n})\right) &= 0.975
\end{aligned}
$$
The difference from earlier calculations is that this time $E(Y)$ is the focus of inference, so we have not assumed that we know its value. One consequence for the probability calculation is that in the fourth line we have “$-E(Y)$.” To change this to “$+E(Y)$” in the fifth line, we multiply each side of the inequality sign by “$-1$” (this also has the effect of changing the direction of the inequality sign).
This probability statement must be true for all potential values of $m_Y$ and $s_Y$. In particular, it must be true for the values of $m_Y$ and $s_Y$ observed in our sample of $n = 30$. Thus, to find the values of $E(Y)$ that satisfy the probability statement, we plug in our sample statistics to find
$$
E(Y) > m_Y - 2.045\,(s_Y/\sqrt{30}) = m_Y - 20.1115.
$$
This shows that a population mean greater than $m_Y - 20.1115$ would satisfy the expression $\Pr(t_{29} < 2.045) = 0.975$. In other words, we have found that the lower bound of our confidence interval is $m_Y - 20.1115$. The value 20.1115 in this calculation is the margin of error.
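To find the upper bound, we perform a similar calculation, this time starting from $-2.045$, which by symmetry is the 2.5th percentile of the t-distribution with 29 degrees of freedom:
$$
\begin{aligned}
\Pr(t_{29} > -2.045) &= 0.975 \\
\Pr\!\left(\frac{m_Y - E(Y)}{s_Y/\sqrt{n}} > -2.045\right) &= 0.975 \\
\Pr\!\left(m_Y - E(Y) > -2.045\,(s_Y/\sqrt{n})\right) &= 0.975 \\
\Pr\!\left(-E(Y) > -m_Y - 2.045\,(s_Y/\sqrt{n})\right) &= 0.975 \\
\Pr\!\left(E(Y) < m_Y + 2.045\,(s_Y/\sqrt{n})\right) &= 0.975
\end{aligned}
$$
To find the values of $E(Y)$ that satisfy this expression, we plug in our sample statistics to find
$$
E(Y) < m_Y + 2.045\,(s_Y/\sqrt{30}) = m_Y + 20.1115.
$$
This shows that a population mean less than $m_Y + 20.1115$ would satisfy the expression $\Pr(t_{29} > -2.045) = 0.975$. In other words, we have found that the upper bound of our confidence interval is $m_Y + 20.1115$. Again, the value 20.1115 in this calculation is the margin of error.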
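The one-sided probability statements used for the lower and upper bounds can be checked informally by simulation. The following is a minimal sketch, not part of the original example: it assumes a normally distributed population, and `TRUE_MEAN`, `TRUE_SD`, and `REPS` are hypothetical values chosen only for illustration. The sample size of 30 and the percentile 2.045 are the ones used in the text.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

N = 30            # sample size, giving n - 1 = 29 degrees of freedom
T_975 = 2.045     # 97.5th percentile of the t-distribution with 29 df (from the text)
TRUE_MEAN = 50.0  # hypothetical population mean, E(Y)
TRUE_SD = 10.0    # hypothetical population standard deviation
REPS = 5000       # number of simulated samples

above_lower = 0
below_upper = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m_y = statistics.mean(sample)   # sample mean
    s_y = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    margin = T_975 * s_y / math.sqrt(N)
    if TRUE_MEAN > m_y - margin:    # E(Y) lies above the lower bound
        above_lower += 1
    if TRUE_MEAN < m_y + margin:    # E(Y) lies below the upper bound
        below_upper += 1

prop_lower = above_lower / REPS
prop_upper = below_upper / REPS
print(f"Proportion with E(Y) > m_Y - margin: {prop_lower:.3f}")
print(f"Proportion with E(Y) < m_Y + margin: {prop_upper:.3f}")
```

Under these assumptions, both proportions should come out close to 0.975, with some simulation noise.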
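We can write these two calculations a little more concisely as
$$
\begin{aligned}
\Pr(-2.045 < t_{29} < 2.045) &= 0.95 \\
\Pr\!\left(-2.045 < \frac{m_Y - E(Y)}{s_Y/\sqrt{n}} < 2.045\right) &= 0.95 \\
\Pr\!\left(m_Y - 2.045\,(s_Y/\sqrt{n}) < E(Y) < m_Y + 2.045\,(s_Y/\sqrt{n})\right) &= 0.95
\end{aligned}
$$
As before, we plug in our sample statistics to find the values of $E(Y)$ that satisfy this expression:
$$
m_Y - 20.1115 < E(Y) < m_Y + 20.1115.
$$
This shows that a population mean between $m_Y - 20.1115$ and $m_Y + 20.1115$ would satisfy the expression $\Pr(-2.045 < t_{29} < 2.045) = 0.95$. In other words, we have found that a 95% confidence interval for $E(Y)$ for this example is $(m_Y - 20.1115,\ m_Y + 20.1115)$. It is traditional to write confidence intervals with the lower number on the left.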
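The "point estimate $\pm$ margin of error" calculation can be sketched in a few lines of Python. This is an illustrative helper rather than code from the text: the function name `t_confidence_interval` is hypothetical, and the summary statistics passed to it below are made-up values, not the home prices data. The t-percentile is supplied by the caller (for example, 2.045 from the t-table for 29 degrees of freedom).

```python
import math


def t_confidence_interval(sample_mean, sample_sd, n, t_percentile):
    """Return (lower, upper, margin) for point estimate +/- margin of error."""
    standard_error = sample_sd / math.sqrt(n)  # s_Y / sqrt(n)
    margin = t_percentile * standard_error     # margin of error
    return sample_mean - margin, sample_mean + margin, margin


# Hypothetical summary statistics, chosen only for illustration.
lower, upper, margin = t_confidence_interval(
    sample_mean=100.0, sample_sd=15.0, n=30, t_percentile=2.045
)
print(f"margin of error: {margin:.4f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

Returning the margin alongside the two endpoints keeps the link to the "point estimate $\pm$ margin of error" form explicit, and the lower endpoint is returned first, following the convention of writing the lower number on the left.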
More generally, using symbols, a 95% confidence interval