Handbook of Regression Analysis With Applications in R. Samprit Chatterjee

interpretation of fitting a constant shift model, where the regression relationships for group members and nonmembers are identical, other than being shifted up or down; that is,

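$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_{p-1} x_{p-1,i} + \varepsilon_i$$

      for nonmembers and

$$y_i = \beta_0 + \beta_p + \beta_1 x_{1i} + \cdots + \beta_{p-1} x_{p-1,i} + \varepsilon_i$$

      for members. The $t$-test for whether $\beta_p = 0$ is thus a test of whether a constant shift model (two parallel regression lines, planes, or hyperplanes) is a significant improvement over a pooled model (one common regression line, plane, or hyperplane).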
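The comparison of a pooled fit with a constant shift fit can be sketched in a few lines. The following is a minimal illustration, not the book's own code: the data are made up, the helper `ols` and the variable names are hypothetical, and the $t$-statistic for the indicator coefficient is obtained from the fact that, for a single added coefficient, the squared $t$-statistic equals the partial $F$-statistic.

```python
import math

def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, residual SS)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, rss

# Made-up illustrative data: members sit about 3 units above nonmembers.
x = [1, 2, 3, 4, 5, 6, 7, 8]
member = [0, 1, 0, 1, 0, 1, 0, 1]          # group indicator
noise = [0.1, -0.2, 0.0, 0.15, -0.1, 0.05, 0.2, -0.2]
y = [2 + 0.5 * xi + 3 * gi + e for xi, gi, e in zip(x, member, noise)]
n = len(y)

# Pooled model: one common line. Constant shift model: add the indicator.
_, rss_pooled = ols([[1, xi] for xi in x], y)
beta_shift, rss_shift = ols([[1, xi, gi] for xi, gi in zip(x, member)], y)

# Partial F for the single indicator coefficient; its signed root is the t-statistic.
f_shift = (rss_pooled - rss_shift) / (rss_shift / (n - 3))
t_shift = math.copysign(math.sqrt(f_shift), beta_shift[2])

print("estimated shift:", round(beta_shift[2], 3))
print("t-statistic on", n - 3, "df:", round(t_shift, 2))
```

A large absolute $t$-statistic here indicates that two parallel lines fit noticeably better than one common line.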

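      Would two different regression relationships be better still? Say there is only one numerical predictor $x_1$; the full model that allows for two different regression lines is

$$y_i = \beta_{00} + \beta_{10} x_{1i} + \varepsilon_i$$

      for nonmembers ($I = 0$), and

$$y_i = \beta_{01} + \beta_{11} x_{1i} + \varepsilon_i$$

      for members ($I = 1$). The pooled model and the constant shift model can be made to be special cases of the full model, by creating a new variable that is the product of $x_1$ and $I$. A regression model that includes this variable,

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 I_i + \beta_3 x_{1i} I_i + \varepsilon_i,$$

      corresponds to the two different regression lines

$$y_i = \beta_0 + \beta_1 x_{1i} + \varepsilon_i$$

      for nonmembers (since $I = 0$), implying $\beta_{00} = \beta_0$ and $\beta_{10} = \beta_1$ above, and

$$y_i = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_{1i} + \varepsilon_i$$

      for members (since $I = 1$), implying $\beta_{01} = \beta_0 + \beta_2$ and $\beta_{11} = \beta_1 + \beta_3$ above.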

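      Setting the coefficients of both the indicator and the product variable to zero gives the pooled model as a special case of the full model, so the partial $F$-statistic

$$F = \frac{(\text{Residual SS}_{\text{pooled}} - \text{Residual SS}_{\text{full}})/2}{\text{Residual SS}_{\text{full}}/(n - 4)},$$

      on $(2, n - 4)$ degrees of freedom, provides a test comparing the pooled model to the full model. This test is often called the Chow test (Chow, 1960) in the economics literature.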
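As a rough sketch of the pooled-versus-full comparison with a single numerical predictor, the statistic can be computed directly from the two residual sums of squares. The data below are invented for illustration, and `ols` is a hypothetical helper written for this example rather than a library function.

```python
def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, residual SS)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, rss

# Made-up data: nonmembers follow roughly 1 + 0.5x, members roughly 3 + 1.5x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
member = [0, 1, 0, 1, 0, 1, 0, 1]
noise = [0.1, -0.2, 0.0, 0.15, -0.1, 0.05, 0.2, -0.2]
y = [(3 + 1.5 * xi if gi else 1 + 0.5 * xi) + e
     for xi, gi, e in zip(x, member, noise)]
n = len(y)

# Pooled model: intercept and x only.
_, rss_pooled = ols([[1, xi] for xi in x], y)
# Full model: intercept, x, indicator, and the product of x and the indicator.
beta_full, rss_full = ols([[1, xi, gi, xi * gi] for xi, gi in zip(x, member)], y)

# Two restrictions (indicator and product coefficients both zero), n - 4 error df.
f_chow = ((rss_pooled - rss_full) / 2) / (rss_full / (n - 4))

print("full-model coefficients:", [round(b, 3) for b in beta_full])
print("F on (2,", n - 4, ") df:", round(f_chow, 2))
```

With $n = 8$ the reference distribution is $F$ on $(2, 4)$ degrees of freedom, whose 5% critical value is approximately 6.94; a statistic above that would favor two separate lines over one common line.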

      A reasonable question to ask at this point is “Why bother to fit the full model? Isn't it just the same as fitting two separate regressions on the two groups?” The answer is no. The full model fit above assumes that the variance of the errors is the same in both groups (the constant variance assumption), while fitting two separate regressions allows the variances to differ between groups. The fitted slope coefficients from the full model will, however, be identical to those from two separate fits. What is gained by analyzing the data this way is the ability to compare versions of pooled, constant shift, and full models based on group membership, including models with different slopes for some variables and equal slopes for others, something that is not possible if separate regressions are fit to the two groups.
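The distinction can be checked numerically. The sketch below, on made-up data with a deliberately noisier member group, fits each group separately in closed form, converts those fits to full-model coefficients (the indicator coefficient as the difference in intercepts, the product coefficient as the difference in slopes), and confirms that they satisfy the full model's least squares normal equations. It then contrasts the single pooled error variance estimate with the two separate ones. The function `simple_fit` and all variable names are hypothetical.

```python
def simple_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope, residual SS)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss

# Made-up data: nonmembers (x0, y0) are tight, members (x1, y1) are noisy.
x0, y0 = [1, 3, 5, 7], [1.6, 2.5, 3.4, 4.7]
x1, y1 = [2, 4, 6, 8], [5.0, 10.0, 11.0, 16.0]

a0, b0, rss0 = simple_fit(x0, y0)   # separate fit, nonmembers
a1, b1, rss1 = simple_fit(x1, y1)   # separate fit, members

# Full-model coefficients implied by the separate fits:
# intercept, slope, indicator (intercept difference), product (slope difference).
beta = [a0, b0, a1 - a0, b1 - b0]

# Full-model design rows [1, x, I, x*I] and residuals under beta.
rows = [[1, xi, 0, 0] for xi in x0] + [[1, xi, 1, xi] for xi in x1]
ys = y0 + y1
resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, ys)]

# Normal equations X'(y - X beta): all (numerically) zero means beta is the
# least squares solution of the full model.
normal_eqs = [sum(r[j] * e for r, e in zip(rows, resid)) for j in range(4)]

n = len(ys)
s2_full = (rss0 + rss1) / (n - 4)    # one pooled error variance estimate
s2_non = rss0 / (len(x0) - 2)        # separate estimate, nonmembers
s2_mem = rss1 / (len(x1) - 2)        # separate estimate, members

print("normal equations:", [round(v, 10) for v in normal_eqs])
print("variance estimates (full, nonmembers, members):",
      round(s2_full, 3), round(s2_non, 3), round(s2_mem, 3))
```

The coefficients agree, but the full model reports one variance estimate that averages over both groups, so standard errors and tests from the full model can differ from those of the separate fits when the group variances are unequal.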

      Another way of saying that the relationship between a predictor and the target is different for members of the two different groups is that there is an interaction effect between the predictor and group membership on the target. Social scientists would say that the grouping has a moderating effect on the relationship between the predictor and the target. The fact that in the case of a grouping variable, the interaction can be fit by multiplying the two variables together has led to a practice that is common in some fields: to try to represent any interaction between variables (that is, any situation where the relationship between a predictor and the target is different for different values of another predictor) by multiplying them together. Unfortunately, this is not a very reasonable way to think about interactions for numerical predictors, since there are many ways that the effect of one variable on the target can differ depending on the value of another that have nothing to do with product functions. See Section 15.6 for further discussion.

      

      2.4.1 EXAMPLE — ELECTRONIC VOTING AND THE 2004 PRESIDENTIAL ELECTION

      The 2000 US presidential election matching Republican George W. Bush against Democrat Al Gore attracted worldwide attention because of its close and controversial results, particularly in the state of Florida. The 2004 election, pitting the incumbent Bush against John Kerry, is less discussed, but was also controversial, in part because of the introduction of electronic voting machines in some polling places across the country (such machines were introduced in part because of the irregularities in paper balloting that occurred in Florida in the 2000 election). Some of the manufacturers of electronic voting machines were strong supporters of President Bush, and this, along with the fact that the machines did not produce a paper trail, led to speculation about whether the machines could be manipulated to favor one candidate over the other.
