Handbook of Regression Analysis With Applications in R. Samprit Chatterjee


if a (say) 95% prediction interval does not include roughly 95% of the new observations, that indicates poorer-than-expected predictive performance on new data.
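The coverage check described above can be sketched in a few lines. This is a minimal illustration, not code from the book: the function name `empirical_coverage` and the interval limits and new responses are hypothetical, and the intervals are assumed to have been computed beforehand from the fitted model.

```python
def empirical_coverage(lower, upper, y_new):
    """Return the fraction of new observations inside their prediction intervals."""
    if not (len(lower) == len(upper) == len(y_new)) or len(y_new) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Count the new observations that fall within their own interval limits.
    inside = sum(1 for lo, hi, y in zip(lower, upper, y_new) if lo <= y <= hi)
    return inside / len(y_new)


# Hypothetical interval limits and new responses, for illustration only.
lower = [1.0, 2.0, 3.0, 4.0]
upper = [3.0, 4.0, 5.0, 6.0]
y_new = [2.0, 4.5, 3.5, 5.0]

coverage = empirical_coverage(lower, upper, y_new)
print(f"Empirical coverage: {coverage:.2f}")
```

An empirical coverage well below the nominal level would be the warning sign mentioned in the text, although with only a handful of new observations the observed proportion is itself quite variable.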


where $\hat{\sigma}$ is based on the chosen “best” model, and $\max(p)$ is the number of predictors in the most complex model examined, in the sense of most predictors (Ye, 1998). Clearly, if very complex models are included among the set of candidate models, the adjusted estimate $\tilde{\sigma}$ can be much larger than the standard error of the estimate from the chosen model, with correspondingly wider prediction intervals. This reinforces the benefit of limiting the set of candidate models (and the complexity of the models in that set) from the start. In this case $\tilde{\sigma}$ is close to $\hat{\sigma}$, so the effect is not that pronounced.
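A small numerical sketch may help show how strongly the size of the candidate set can matter. The displayed formula is not reproduced in this excerpt, so the code below rests on an assumption: that the adjustment amounts to dividing the residual sum of squares of the chosen model by n minus the largest number of predictors examined minus 1, rather than by n minus the number of predictors in the chosen model minus 1. The function names and all numbers are hypothetical.

```python
import math


def standard_error_of_estimate(residual_ss, n, p):
    """Usual standard error of the estimate for a model with p predictors."""
    return math.sqrt(residual_ss / (n - p - 1))


def adjusted_standard_error(residual_ss, n, max_p):
    """Assumed adjustment: use the predictor count of the most complex candidate."""
    if n - max_p - 1 <= 0:
        raise ValueError("n must exceed max_p + 1")
    return math.sqrt(residual_ss / (n - max_p - 1))


# Hypothetical values: 100 observations, chosen model with 9 predictors,
# most complex candidate model with 19 predictors.
residual_ss = 90.0
n = 100
sigma_hat = standard_error_of_estimate(residual_ss, n, p=9)
sigma_tilde = adjusted_standard_error(residual_ss, n, max_p=19)

print(f"unadjusted: {sigma_hat:.4f}, adjusted: {sigma_tilde:.4f}")
```

Under this assumed form, the inflation depends only on the ratio of the two degrees-of-freedom terms, so it stays modest when the most complex candidate is small relative to the sample size and grows quickly as the candidate models approach the sample size in complexity.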

      It is not unusual for the observations in a sample to fall into two distinct subgroups; for example, people are either male or female. It might be that group membership has no relationship with the target variable (given other predictors); such a pooled model ignores the grouping and pools the two groups together.

On the other hand, it is clearly possible that group membership is predictive for the target variable (for example, expected salaries differing for men and women given other control variables could indicate gender discrimination). Such effects can be explored easily using an indicator variable, which takes on the value 0 for one group and 1 for the other (such variables are sometimes called dummy variables or 0/1 variables). The model takes the form

$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_{p-1} x_{p-1,i} + \beta_p I_i + \varepsilon_i$$
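To make the role of the indicator concrete, here is a minimal sketch of fitting a regression with one numerical predictor and one 0/1 indicator by least squares. It is an illustration under stated assumptions, not the book's own code: the helper functions `solve_linear_system` and `least_squares` are hypothetical, the data are constructed artificially so that the true coefficients are known, and the normal equations are solved directly only to keep the example free of external libraries.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: one numerical predictor x and a 0/1 group indicator.
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
# Responses built exactly as y = 1 + 2*x + 3*group, so the fit is exact.
y = [1.0 + 2.0 * xi + 3.0 * gi for xi, gi in zip(x, group)]

# Design matrix columns: intercept, predictor, indicator.
design = [[1.0, xi, float(gi)] for xi, gi in zip(x, group)]
b0, b1, b2 = least_squares(design, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}, group shift={b2:.3f}")
```

The coefficient on the indicator is the estimated difference in expected response between the two groups at any fixed value of the other predictor. Dropping the indicator column from the design matrix would give the pooled model, which forces that difference to be zero.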
