Applied Regression Modeling. Iain Pardoe

The estimated variance of the estimation error, $E(Y) - m_Y$, in expression (1.1) is $s_Y^2/n$.

      The estimated variance of the random error, $e^*$, in expression (1.1) is $s_Y^2$. It can then be shown that the estimated variance of the prediction error, $Y^* - m_Y$, in expression (1.1) is $s_Y^2/n + s_Y^2 = s_Y^2(1 + 1/n)$. Then, $s_Y\sqrt{1 + 1/n}$ is called the standard error of prediction.
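      The "it can then be shown" step can be sketched as follows. This assumes that expression (1.1), which is not reproduced on this page, splits the prediction error into an estimation error plus a random error, and that the random error of the new observation is independent of the sample used to compute the sample mean, so that the two variances add:

```latex
\begin{aligned}
Y^* - m_Y &= \bigl(E(Y) - m_Y\bigr) + e^*, \\
\widehat{\operatorname{Var}}(Y^* - m_Y)
  &= \widehat{\operatorname{Var}}\bigl(E(Y) - m_Y\bigr) + \widehat{\operatorname{Var}}(e^*) \\
  &= \frac{s_Y^2}{n} + s_Y^2
   = s_Y^2\left(1 + \frac{1}{n}\right).
\end{aligned}
```

      Taking the square root gives the standard error of prediction, $s_Y\sqrt{1 + 1/n}$.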

      Thus, in general, we can write a prediction interval for an individual $Y$‐value, as

$$m_Y \pm \text{t‐percentile}\,\left(s_Y\sqrt{1 + 1/n}\right),$$

      where $m_Y$ is the sample mean, $s_Y$ is the sample standard deviation, $n$ is the sample size, and the t‐percentile comes from a t‐distribution with $n - 1$ degrees of freedom.

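      For the home prices example, with a sample of size 30, a 95% prediction interval uses the 97.5th percentile of the t‐distribution with 29 degrees of freedom, which is 2.045:

$$m_Y \pm 2.045\,\left(s_Y\sqrt{1 + 1/30}\right).$$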
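      As a minimal sketch of the by-hand calculation, the function below turns a sample mean, a sample standard deviation, a sample size, and a t‐percentile into a prediction interval. The function name `prediction_interval` and the summary statistics passed to it are hypothetical and chosen only for illustration; they are not the home prices values.

```python
import math


def prediction_interval(sample_mean, sample_sd, n, t_percentile):
    """Return (lower, upper) for an individual value.

    Implements mean +/- t-percentile * sd * sqrt(1 + 1/n), where the
    t-percentile is taken from a t-distribution with n - 1 degrees of freedom.
    """
    se_prediction = sample_sd * math.sqrt(1 + 1 / n)  # standard error of prediction
    half_width = t_percentile * se_prediction
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical summary statistics (assumed for illustration only).
# 2.045 is the 97.5th percentile of the t-distribution with 29 degrees of freedom.
lower, upper = prediction_interval(sample_mean=100.0, sample_sd=10.0, n=30,
                                   t_percentile=2.045)
print(f"95% prediction interval: ({lower:.3f}, {upper:.3f})")
```

      The t‐percentile is passed in rather than computed so that the sketch needs only the standard library; a statistics package or a t‐table can supply it for other sample sizes or confidence levels.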

      What about the interpretation of a prediction interval? For the home prices example, loosely speaking, we can say that “we are 95% confident that the sale price for an individual home picked at random from all single‐family homes in this housing market will lie between the lower and upper limits of this interval.” More precisely, if we were to take a large number of random samples of size 30 from our population of sale prices and calculate a 95% prediction interval for each, then 95% of those prediction intervals would contain the (unknown) sale price for an individual home picked at random from the population.
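      The repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes a normally distributed population with a hypothetical mean and standard deviation (not the home prices population); it repeatedly draws a sample of size 30, computes a 95% prediction interval, draws one further individual value, and records how often that value falls inside the interval.

```python
import math
import random
import statistics

T_975_DF29 = 2.045  # 97.5th percentile of the t-distribution with 29 degrees of freedom


def coverage(reps=10000, n=30, pop_mean=100.0, pop_sd=15.0, seed=1):
    """Estimate how often a 95% prediction interval contains a new individual value.

    pop_mean and pop_sd describe a hypothetical normal population.
    The hard-coded t-percentile is only valid for n = 30.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        m = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
        half_width = T_975_DF29 * s * math.sqrt(1 + 1 / n)
        new_value = rng.gauss(pop_mean, pop_sd)  # individual picked at random
        if m - half_width <= new_value <= m + half_width:
            hits += 1
    return hits / reps


print(f"Proportion of intervals containing the new value: {coverage():.3f}")
```

      The proportion should come out close to 0.95, with some simulation noise that shrinks as `reps` grows.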

      Interpretation of a prediction interval for an individual $Y$‐value:

      Suppose we have calculated a 95% prediction interval for an individual $Y$‐value to be $(a, b)$. Then we can say that we are 95% confident that the individual $Y$‐value is between $a$ and $b$.

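      As discussed at the beginning of this section, the 95% prediction interval for an individual sale price is much wider than the 95% confidence interval for the population mean single‐family home sale price, which was calculated as

$$m_Y \pm 2.045\,\left(s_Y/\sqrt{30}\right).$$

      The two intervals share the same center and t‐percentile, so their widths differ only through the standard errors, $s_Y\sqrt{1 + 1/n}$ versus $s_Y/\sqrt{n}$. The ratio of these is $\sqrt{n + 1}$, so with a sample of size 30 the prediction interval is $\sqrt{31} \approx 5.6$ times as wide as the confidence interval.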

      Unlike confidence intervals for the population mean, prediction intervals for an individual $Y$‐value are not generally calculated automatically by statistical software. Thus, they have to be calculated by hand using the sample statistics, $m_Y$ and $s_Y$. However, there is a trick that can get around this (although it makes use of simple linear regression, which we cover in Chapter 2). First, create a variable that consists only of the value 1 for all observations. Then, fit a simple linear regression model using this variable as the predictor variable and $Y$ as the response variable, and restrict the model to fit without an intercept (see computer help #25 in the software information files available from the book website). The estimated regression equation for this model will be a constant value equal to the sample mean of the response variable. Prediction intervals for this model will be the same for each value of the predictor variable (see computer help #30), and will be the same as a prediction interval for an individual $Y$‐value.

      As further practice, calculate a 90% prediction interval for an individual sale price (see Problem 1.10), either by hand or using the trick just described. You should find that the interval is (…, …).
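      To see why the trick works, the sketch below writes out no-intercept least squares by hand for a predictor that is 1 for every observation and compares the resulting prediction interval with the direct formula. It is not the book's software procedure (computer help #25 and #30 are not reproduced here); the function names and the five data values are hypothetical.

```python
import math
import statistics


def trick_prediction_interval(y, t_percentile):
    """Prediction interval from a no-intercept regression of y on a column of ones."""
    n = len(y)
    x = [1.0] * n                                   # predictor equal to 1 everywhere
    sxx = sum(xi * xi for xi in x)                  # sum of squared predictor values (= n)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx  # least squares through the origin
    residuals = [yi - slope * xi for xi, yi in zip(x, y)]
    # One estimated parameter, so the residual degrees of freedom are n - 1.
    s = math.sqrt(sum(r * r for r in residuals) / (n - 1))
    x0 = 1.0                                        # predictor value for the new observation
    se_prediction = s * math.sqrt(1 + x0 * x0 / sxx)
    fitted = slope * x0
    return fitted - t_percentile * se_prediction, fitted + t_percentile * se_prediction


def direct_prediction_interval(y, t_percentile):
    """Prediction interval from the sample mean and sample standard deviation."""
    n = len(y)
    m = statistics.mean(y)
    half_width = t_percentile * statistics.stdev(y) * math.sqrt(1 + 1 / n)
    return m - half_width, m + half_width


# Hypothetical data; 2.776 is the 97.5th percentile of the t-distribution with 4 df.
y = [12.0, 15.0, 9.0, 14.0, 10.0]
print(trick_prediction_interval(y, 2.776))
print(direct_prediction_interval(y, 2.776))
```

      The fitted slope equals the sample mean, the residual standard error equals the sample standard deviation, and the prediction standard error reduces to $s_Y\sqrt{1 + 1/n}$, so the two intervals agree up to floating-point rounding.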

      We spent some time in this chapter coming to grips with summarizing data (graphically and numerically) and understanding sampling distributions, but the four major concepts that will carry us through the rest of the book are as follows:

      1 Statistical thinking is the process of analyzing quantitative information about a random sample of observations and drawing conclusions (statistical inferences) about the population from which the sample was drawn. An example is using a univariate sample mean, $m_Y$, as an estimate of the corresponding population mean and calculating the sample standard deviation, $s_Y$, to evaluate the precision of this estimate.

      2 Confidence
