Handbook of Regression Analysis With Applications in R. Samprit Chatterjee


this rough prediction interval.

      1.3.3 HYPOTHESIS TESTS AND CONFIDENCE INTERVALS FOR β

      There are two types of hypothesis tests of immediate interest related to the regression coefficients.

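      1 Do any of the predictors provide predictive power for the target variable? This is a test of the overall significance of the regression,

        $$H_0\colon \beta_1 = \cdots = \beta_p = 0$$

        versus

        $$H_a\colon \text{some } \beta_j \neq 0, \quad j = 1, \ldots, p.$$

        The test of these hypotheses is the $F$-test,

        $$F = \frac{\text{Regression MS}}{\text{Residual MS}} \equiv \frac{\text{Regression SS}/p}{\text{Residual SS}/(n - p - 1)}.$$

        This is referenced against a null $F$-distribution on $(p,\, n - p - 1)$ degrees of freedom.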

      2 Given the other variables in the model, does a particular predictor provide additional predictive power? This corresponds to a test of the significance of an individual coefficient,

        $$H_0\colon \beta_j = 0, \quad j = 1, \ldots, p$$

        versus

        $$H_a\colon \beta_j \neq 0.$$

        This is tested using a $t$-test,

        $$t_j = \frac{\hat\beta_j}{\widehat{\text{s.e.}}(\hat\beta_j)},$$

        which is compared to a $t$-distribution on $n - p - 1$ degrees of freedom. Other values of $\beta_j$ can be specified in the null hypothesis (say $\beta_{j0}$), with the $t$-statistic becoming

        $$t_j = \frac{\hat\beta_j - \beta_{j0}}{\widehat{\text{s.e.}}(\hat\beta_j)}. \tag{1.9}$$

        The values of $\widehat{\text{s.e.}}(\hat\beta_j)$ are obtained as the square roots of the diagonal elements of $\hat{V}(\hat{\boldsymbol\beta}) = (X'X)^{-1}\hat\sigma^2$, where $\hat\sigma^2$ is the residual mean square (1.8).

      Note that for simple regression ($p = 1$), the hypotheses corresponding to the overall significance of the model and the significance of the predictor are identical,

        $$H_0\colon \beta_1 = 0$$

        versus

        $$H_a\colon \beta_1 \neq 0.$$

      Given the equivalence of the sets of hypotheses, it is not surprising that the associated tests are also equivalent; in fact, $F = t_1^2$, and the associated tail probabilities of the two tests are identical.

      A $t$-test for the intercept also can be constructed as in (1.9), although this does not refer to a hypothesis about a predictor, but rather about whether the expected target is equal to a specified value $\beta_{00}$ if all of the predictors equal zero. As was noted in Section 1.3.1, this is often not physically meaningful (and therefore of little interest), because the condition that all predictors equal zero cannot occur, or does not come close to occurring in the observed data.
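      The equivalence of the two tests in simple regression can be checked numerically. The following is a minimal sketch, not code from the book: it uses made-up data and a hypothetical helper `simple_regression_tests`, computes the overall $F$ statistic as (Regression SS / $p$) / (Residual SS / $(n - p - 1)$) and the slope $t$ statistic as the estimate divided by its standard error, and then compares $F$ with $t^2$.

```python
import math

# Illustrative, made-up data (not from the text): one predictor, n = 10.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]


def simple_regression_tests(x, y):
    """Return the overall F statistic and the slope t statistic (p = 1)."""
    n, p = len(x), 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates of the slope and intercept.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    # Sums of squares and the residual mean square (the estimate of sigma^2).
    regression_ss = sum((fi - y_bar) ** 2 for fi in fitted)
    residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    residual_ms = residual_ss / (n - p - 1)

    # Overall F-test statistic.
    f_stat = (regression_ss / p) / residual_ms

    # With one predictor, the slope's diagonal element of (X'X)^{-1} is 1 / sxx.
    se_b1 = math.sqrt(residual_ms / sxx)
    t_stat = b1 / se_b1

    return {"F": f_stat, "t": t_stat, "df": (p, n - p - 1)}


result = simple_regression_tests(x, y)
print(result)
# F and t^2 agree up to floating-point rounding.
print(math.isclose(result["F"], result["t"] ** 2, rel_tol=1e-9))
```

      With more than one predictor the two tests answer different questions, so this identity should not be expected to hold for any individual $t_j$.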

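      As is always the case, a confidence interval provides an alternative way of summarizing the degree of precision in the estimate of a regression parameter. A $100 \times (1 - \alpha)\%$ confidence interval for $\beta_j$ has the form

        $$\hat\beta_j \pm t_{\alpha/2}^{n-p-1}\, \widehat{\text{s.e.}}(\hat\beta_j),$$

      where $t_{\alpha/2}^{n-p-1}$ is the appropriate critical value at two-sided level $\alpha$ for a $t$-distribution on $n - p - 1$ degrees of freedom.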
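      The interval and the $t$-test carry the same information: a two-sided test at level $\alpha$ rejects a null value exactly when that value falls outside the $100 \times (1 - \alpha)\%$ interval. A small sketch of this duality follows. The estimate and standard error are hypothetical numbers chosen for illustration, the function names are not from any library, and the critical value 2.306 is the tabulated two-sided 5% point of a $t$-distribution on 8 degrees of freedom.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t-statistic for the null hypothesis beta_j = null_value."""
    return (estimate - null_value) / std_error


def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical values for illustration only.
beta_hat, se_beta = 1.98, 0.05
T_CRIT_95_DF8 = 2.306  # tabulated t critical value, alpha = 0.05, 8 df

low, high = confidence_interval(beta_hat, se_beta, T_CRIT_95_DF8)
for null_value in (0.0, 2.0):
    t = t_statistic(beta_hat, se_beta, null_value)
    rejected = abs(t) > T_CRIT_95_DF8
    outside = not (low <= null_value <= high)
    # The two booleans agree for every null value.
    print(null_value, round(t, 3), rejected, outside)
```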

      1.3.4 FITTED VALUES AND PREDICTIONS

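      The rough prediction interval $\hat{y} \pm 2\hat\sigma$ discussed in Section 1.3.2 is an approximate $95\%$ interval because it ignores the variability caused by the need to estimate $\boldsymbol\beta$ and uses only an approximate normal-based critical value. A more accurate assessment of predictive power is provided by a prediction interval given a particular value of $\mathbf{x}$. This interval provides guidance as to how precise $\hat{y}_0$ is as a prediction of $y$ for some particular specified value $\mathbf{x}_0$, where $\hat{y}_0$ is determined by substituting the values $\mathbf{x}_0$ into the estimated regression equation. Its width depends on both $\hat\sigma$ and the position of $\mathbf{x}_0$ relative to the centroid of the predictors (the point located at the means of all predictors), since values farther from the centroid are harder to predict as precisely. Specifically, for a simple regression, the estimated standard error of a predicted value based on a value $x_0$ of the predicting variable is

        $$\widehat{\text{s.e.}}(\hat{y}_0^P) = \hat\sigma \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}.$$

      More generally, the variance of a predicted value is

        $$\hat{V}(\hat{y}_0^P) = \left[1 + \mathbf{x}_0'(X'X)^{-1}\mathbf{x}_0\right]\hat\sigma^2.$$

      Here $\mathbf{x}_0$ is taken to include a $1$ in the first entry (corresponding to the intercept in the regression model). The prediction interval is then

        $$\hat{y}_0 \pm t_{\alpha/2}^{n-p-1}\, \widehat{\text{s.e.}}(\hat{y}_0^P),$$

      where $\widehat{\text{s.e.}}(\hat{y}_0^P) = \sqrt{\hat{V}(\hat{y}_0^P)}$.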
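      To see how the width of a prediction interval grows as the new predictor value moves away from the centroid, consider the sketch below. It is an illustration under stated assumptions rather than the book's code: the data are made up, `prediction_interval` is a hypothetical helper for the one-predictor case, and the critical value is passed in by the caller (2.306 is the tabulated 95% value for 8 degrees of freedom). The standard error used is $\hat\sigma$ times the square root of $1 + 1/n + (x_0 - \bar{x})^2 / \sum_i (x_i - \bar{x})^2$.

```python
import math


def prediction_interval(x, y, x0, t_crit):
    """Prediction interval for a new response at x0 in a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    # Residual standard error on n - 2 degrees of freedom (p = 1).
    residual_ss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(residual_ss / (n - 2))

    # Standard error of a predicted value at x0.
    se_pred = sigma_hat * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    y0_hat = b0 + b1 * x0
    return y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred


# Illustrative, made-up data; the predictor mean is 5.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

for x0 in (5.5, 10.0, 20.0):
    low, high = prediction_interval(x, y, x0, t_crit=2.306)
    print(x0, round(low, 2), round(high, 2), "width:", round(high - low, 2))
```

      The interval is narrowest at $x_0 = \bar{x}$ and widens steadily from there; the value 20.0 lies outside the observed range, where the wider interval still says nothing about whether the linear relationship continues to hold.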

      This prediction interval should not be confused with a confidence interval for a fitted value. The prediction interval is used to provide an interval estimate for a prediction of $y$ for one member of the population with a particular value of $\mathbf{x}_0$; the confidence interval is used to provide an interval estimate for the true expected value of $y$ for all members of the population with a particular value of $\mathbf{x}_0$. The corresponding standard error, termed the standard error for a fitted value, is the square root of

        $$\hat{V}(\hat{y}_0^F) = \mathbf{x}_0'(X'X)^{-1}\mathbf{x}_0\, \hat\sigma^2,$$

      with corresponding confidence interval

        $$\hat{y}_0 \pm t_{\alpha/2}^{n-p-1}\, \widehat{\text{s.e.}}(\hat{y}_0^F).$$
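      The relationship between the two intervals can be made explicit with a short derivation. It is a sketch for the simple regression case only, assuming a design matrix $X$ whose columns are a column of ones and the predictor, with $\mathbf{x}_0 = (1, x_0)'$. The shorthand $S_{xx} = \sum_i (x_i - \bar{x})^2$ is introduced here for convenience, and $\hat{V}(\hat{y}_0^P)$ and $\hat{V}(\hat{y}_0^F)$ denote the estimated variances of a predicted value and a fitted value, respectively.

```latex
\begin{align*}
X'X &= \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix},
\qquad
(X'X)^{-1} = \frac{1}{n S_{xx}}
  \begin{pmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{pmatrix}, \\[4pt]
\mathbf{x}_0'(X'X)^{-1}\mathbf{x}_0
  &= \frac{\sum x_i^2 - 2 x_0 \sum x_i + n x_0^2}{n S_{xx}}
   = \frac{S_{xx} + n (x_0 - \bar{x})^2}{n S_{xx}}
   = \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}, \\[4pt]
\hat{V}(\hat{y}_0^F) &= \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right] \hat\sigma^2,
\qquad
\hat{V}(\hat{y}_0^P) = \hat\sigma^2 + \hat{V}(\hat{y}_0^F).
\end{align*}
```

      The second line uses $\sum x_i^2 = S_{xx} + n\bar{x}^2$ and $\sum x_i = n\bar{x}$. The last identity shows that a prediction interval is always wider than the confidence interval for the fitted value at the same $\mathbf{x}_0$: the extra $\hat\sigma^2$ reflects the variability of an individual observation around its expected value, which does not shrink as the sample size grows.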
