Introduction to Linear Regression Analysis. Douglas C. Montgomery (page 34)

We seek estimators $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}^2$ that maximize L, or equivalently, ln L. Thus,

      (2.57) $$\ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2$$

      and the maximum-likelihood estimators $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}^2$ must satisfy

      (2.58a) $$\frac{\partial \ln L}{\partial \beta_0}\bigg|_{\hat{\beta}_0,\hat{\beta}_1,\hat{\sigma}^2} = \frac{1}{\hat{\sigma}^2}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$$

      (2.58b) $$\frac{\partial \ln L}{\partial \beta_1}\bigg|_{\hat{\beta}_0,\hat{\beta}_1,\hat{\sigma}^2} = \frac{1}{\hat{\sigma}^2}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)x_i = 0$$

      and

      (2.58c) $$\frac{\partial \ln L}{\partial \sigma^2}\bigg|_{\hat{\beta}_0,\hat{\beta}_1,\hat{\sigma}^2} = -\frac{n}{2\hat{\sigma}^2} + \frac{1}{2\hat{\sigma}^4}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2 = 0$$

      The solution to Eqs. (2.58a)–(2.58c) gives the maximum-likelihood estimators

      (2.59a) $$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$$

      (2.59b) $$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i\left(x_i - \bar{x}\right)}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

      (2.59c) $$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2}{n}$$

      Notice that the maximum-likelihood estimators of the intercept and slope, $\hat{\beta}_0$ and $\hat{\beta}_1$, are identical to the least-squares estimators of these parameters. Also, $\hat{\sigma}^2$ is a biased estimator of σ2. The biased estimator is related to the unbiased estimator $MS_{\mathrm{Res}}$ [Eq. (2.19)] by $\hat{\sigma}^2 = [(n-2)/n]\,MS_{\mathrm{Res}}$. The bias is small if n is moderately large. Generally the unbiased estimator $MS_{\mathrm{Res}}$ is used.
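      The identity between the maximum-likelihood and least-squares estimates, and the exact (n − 2)/n relationship between the two variance estimators, can be checked numerically. The following sketch (my own example, with arbitrary true parameter values) computes Eqs. (2.59a)–(2.59c) on simulated data:

```python
import math
import random

# Sketch (not from the book): compute the ML estimates of (2.59a)-(2.59c)
# on simulated data and verify that the biased ML variance estimate equals
# (n - 2)/n times the unbiased MS_Res of Eq. (2.19).
random.seed(1)
beta0_true, beta1_true, sigma = 2.0, 0.5, 1.0  # arbitrary true values
n = 50
x = [i / 5 for i in range(n)]
y = [beta0_true + beta1_true * xi + random.gauss(0, sigma) for xi in x]

xbar = sum(x) / n
ybar = sum(y) / n

# (2.59b) slope and (2.59a) intercept -- identical to least squares
b1 = sum(yi * (xi - xbar) for xi, yi in zip(x, y)) / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma2_ml = ss_res / n       # (2.59c): biased ML estimator
ms_res = ss_res / (n - 2)    # Eq. (2.19): unbiased estimator

# the two variance estimators differ exactly by the factor (n - 2)/n
assert math.isclose(sigma2_ml, (n - 2) / n * ms_res)
print(b0, b1, sigma2_ml, ms_res)
```

      With n = 50 the factor (n − 2)/n = 0.96, illustrating why the bias is negligible for moderately large samples.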

      In general, maximum-likelihood estimators have better statistical properties than least-squares estimators. The maximum-likelihood estimators are unbiased (including $\hat{\sigma}^2$, which is asymptotically unbiased, or unbiased as n becomes large) and have minimum variance when compared to all other unbiased estimators. They are also consistent estimators (consistency is a large-sample property indicating that the estimators differ from the true parameter value by a very small amount as n becomes large), and they are a set of sufficient statistics (this implies that the estimators contain all of the “information” in the original sample of size n). On the other hand, maximum-likelihood estimation requires more stringent statistical assumptions than the least-squares estimators. The least-squares estimators require only second-moment assumptions (assumptions about the expected value, the variances, and the covariances among the random errors). The maximum-likelihood estimators require a full distributional assumption, in this case that the random errors follow a normal distribution with the same second moments as required for the least-squares estimates. For more information on maximum-likelihood estimation in regression models, see Graybill [1961, 1976], Myers [1990], Searle [1971], and Seber [1977].
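      That the closed-form estimates actually maximize the log-likelihood of Eq. (2.57) can be spot-checked directly: ln L evaluated at the estimates should be at least as large as at any perturbed parameter values. A minimal sketch (my own construction, with arbitrary simulated data):

```python
import math
import random

# Illustrative check (not from the book): the log-likelihood of Eq. (2.57)
# evaluated at the closed-form ML estimates is at least as large as at
# perturbed parameter values.
def log_lik(x, y, b0, b1, s2):
    n = len(x)
    ss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return -n / 2 * math.log(2 * math.pi) - n / 2 * math.log(s2) - ss / (2 * s2)

random.seed(2)
x = [i / 4 for i in range(40)]
y = [1.0 + 0.8 * xi + random.gauss(0, 0.7) for xi in x]
n, xbar, ybar = len(x), sum(x) / len(x), sum(y) / len(x)

# closed-form ML estimates, Eqs. (2.59a)-(2.59c)
b1 = sum(yi * (xi - xbar) for xi, yi in zip(x, y)) / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
s2 = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / n

best = log_lik(x, y, b0, b1, s2)
# no perturbation of the estimates should increase ln L
for db0, db1, ds2 in [(0.1, 0, 0), (0, 0.05, 0), (0, 0, 0.2), (-0.1, 0.05, -0.1)]:
    assert log_lik(x, y, b0 + db0, b1 + db1, s2 + ds2) <= best
print(best)
```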

      The linear regression model that we have presented in this chapter assumes that the values of the regressor variable x are known constants. This assumption makes the confidence coefficients and type I (or type II) errors refer to repeated sampling on y at the same x levels. There are many situations in which assuming that the x’s are fixed constants is inappropriate. For example, consider the soft drink delivery time data from Chapter 1 (Figure 1.1). Since the outlets visited by the delivery person are selected at random, it is unrealistic to believe that we can control the delivery volume x. It is more reasonable to assume that both y and x are random variables.

      Fortunately, under certain circumstances, all of our earlier results on parameter estimation, testing, and prediction are valid. We now discuss these situations.

      Suppose that x and y are jointly distributed random variables but the form of this joint distribution is unknown. It can be shown that all of our previous regression results hold if the following conditions are satisfied:

      1 The conditional distribution of y given x is normal with conditional mean β0 + β1x and conditional variance σ2.

      2 The x’s are independent random variables whose probability distribution does not involve β0, β1, and σ2.

      While all of the regression procedures are unchanged when these conditions hold, the confidence coefficients and statistical errors have a different interpretation. When the regressor is a random variable, these quantities apply to repeated sampling of (xi, yi) values and not to repeated sampling of yi at fixed levels of xi.
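      The repeated-sampling interpretation above can be illustrated by simulation. In the sketch below (my own setup, with arbitrary distributional choices), each replicate draws fresh (xi, yi) pairs, with x itself random, and the slope estimates still center on the true β1:

```python
import random
import statistics

# Sketch (assumed setup, not from the book): with a random regressor,
# repeated sampling of (x_i, y_i) pairs -- new x's in every replicate --
# still yields slope estimates centred on the true beta_1, consistent
# with conditions 1 and 2 above.
random.seed(3)
beta0, beta1 = 3.0, 1.5  # arbitrary true parameters
slopes = []
for _ in range(500):
    x = [random.gauss(10, 2) for _ in range(30)]  # regressor drawn at random
    y = [beta0 + beta1 * xi + random.gauss(0, 1) for xi in x]
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    b1 = sum(yi * (xi - xbar) for xi, yi in zip(x, y)) / sum((xi - xbar) ** 2 for xi in x)
    slopes.append(b1)

print(statistics.mean(slopes))  # close to beta1 = 1.5
```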

      2.13.2 x and y Jointly Normally Distributed: Correlation Model

      Now suppose that y and x are jointly distributed according to the bivariate normal distribution. That is,

      (2.60) $$f(y, x) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2\left(1-\rho^2\right)}\left[\left(\frac{y-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{y-\mu_1}{\sigma_1}\right)\left(\frac{x-\mu_2}{\sigma_2}\right) + \left(\frac{x-\mu_2}{\sigma_2}\right)^2\right]\right\}$$

      where μ1 and $\sigma_1^2$ are the mean and variance of y, μ2 and $\sigma_2^2$ are the mean and variance of x, and

$$\rho = \frac{\sigma_{12}}{\sigma_1\sigma_2}$$

      is the correlation coefficient between y and x. The term σ12 is the covariance of y and x.
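      The definition ρ = σ12/(σ1σ2) can be checked by simulation. The sketch below (my own example, with arbitrary means, variances, and ρ = 0.6) generates bivariate normal draws and recovers ρ from the sample covariance and standard deviations:

```python
import math
import random

# Sketch (not from the book): draw (y, x) from a bivariate normal with a
# chosen rho and check that the sample covariance recovers
# rho = sigma_12 / (sigma_1 * sigma_2).
random.seed(4)
mu1, mu2 = 5.0, 2.0  # means of y and x (arbitrary)
s1, s2 = 2.0, 1.5    # standard deviations of y and x (arbitrary)
rho = 0.6

xs, ys = [], []
for _ in range(20000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mu2 + s2 * z1
    y = mu1 + s1 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)  # correlated draw
    xs.append(x)
    ys.append(y)

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
cov12 = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(xs, ys)) / (n - 1)
sd_y = math.sqrt(sum((yi - ybar) ** 2 for yi in ys) / (n - 1))
sd_x = math.sqrt(sum((xi - xbar) ** 2 for xi in xs) / (n - 1))
rho_hat = cov12 / (sd_y * sd_x)
print(rho_hat)  # close to 0.6
```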

      The conditional distribution of y for a given value of x is

      (2.61) $$f(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma_{1.2}} \exp\left[-\frac{1}{2}\left(\frac{y - \beta_0 - \beta_1 x}{\sigma_{1.2}}\right)^2\right]$$

      where

      (2.62a) $$\beta_0 = \mu_1 - \mu_2\,\rho\,\frac{\sigma_1}{\sigma_2}$$
