
ML estimators are also asymptotically normal, meaning that as sample size grows, the sampling distribution of the estimator approaches a normal distribution. Finally, ML estimators possess the invariance property (see Casella and Berger, 2002, for details).

      A measure of model fit commonly used in comparing models, one based on the log-likelihood, is Akaike's information criterion, or AIC (Sakamoto, Ishiguro, and Kitagawa, 1986). This is one statistic of the kind generally referred to as penalized likelihood statistics (another is the Bayesian information criterion, or BIC). AIC is defined as:

$$\text{AIC} = -2L_m + 2m$$

      where $L_m$ is the maximized log-likelihood and $m$ is the number of parameters in the given model. Lower values of AIC indicate a better-fitting model than do larger values. Recall that, in general, the more parameters fit to a model, the better the fit of that model will be. For example, a model that has a unique parameter for each data point would fit the data perfectly. This is the so-called saturated model. AIC jointly considers both the goodness of fit and the number of parameters required to obtain that fit, essentially "penalizing" for increasing the number of parameters unless they contribute to model fit. Adding one or more parameters to a model may cause $-2L_m$ to decrease (which is a good thing substantively), but if the parameters are not worthwhile, this decrease will be offset by an increase in $2m$.

      The Bayesian information criterion, or BIC (Schwarz, 1978), is defined as $-2L_m + m\log(N)$, where $m$, as before, is the number of parameters in the model and $N$ is the total number of observations used to fit the model. Lower values of BIC are also desirable when comparing models. BIC typically penalizes model complexity more heavily than AIC. For a comparison of AIC and BIC, see Burnham and Anderson (2011).
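      As a brief, hypothetical sketch of how these criteria behave in practice, the following R code (simulated data; the variable names and settings are assumptions for illustration only) fits two competing regression models and compares them with R's built-in AIC() and BIC() functions, also computing AIC by hand from the log-likelihood:

# Hypothetical illustration: compare two regression models fit to
# simulated data using AIC and BIC (lower values are preferred).
set.seed(123)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)                      # unrelated to y by construction
y  <- 2 + 0.5*x1 + rnorm(n)

fit1 <- lm(y ~ x1)                  # simpler model
fit2 <- lm(y ~ x1 + x2)             # adds a parameter of little worth

AIC(fit1); AIC(fit2)                # built-in: -2*logLik + 2m
BIC(fit1); BIC(fit2)                # built-in: -2*logLik + m*log(N)

# AIC computed by hand for fit1; m counts all estimated parameters,
# including the error variance:
ll <- logLik(fit1)
-2*as.numeric(ll) + 2*attr(ll, "df")

      Because x2 contributes little, the extra parameter will typically inflate both criteria for the second model, BIC more so given its $\log(N)$ penalty.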

      The covariance between two random variables $x_i$ and $y_i$ is given by:

$$\operatorname{cov}(x_i, y_i) = E[(x_i - \mu_x)(y_i - \mu_y)]$$

      where $E[(x_i - \mu_x)(y_i - \mu_y)]$ is equal to $E(x_i y_i) - \mu_x \mu_y$, since

$$\begin{aligned}
E[(x_i - \mu_x)(y_i - \mu_y)] &= E(x_i y_i - \mu_y x_i - \mu_x y_i + \mu_x \mu_y) \\
&= E(x_i y_i) - \mu_y E(x_i) - \mu_x E(y_i) + \mu_x \mu_y \\
&= E(x_i y_i) - \mu_y \mu_x - \mu_x \mu_y + \mu_x \mu_y \\
&= E(x_i y_i) - \mu_x \mu_y
\end{aligned}$$
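      As a quick numerical check of this identity, the following R sketch uses simulated data, approximating the population expectations with sample means over a large sample:

# Sketch: verify E[(x - mu_x)(y - mu_y)] = E(xy) - mu_x*mu_y numerically,
# approximating population expectations with sample means.
set.seed(42)
n <- 1e5
x <- rnorm(n, mean = 3, sd = 2)
y <- 0.6*x + rnorm(n)               # build in a positive relationship

lhs <- mean((x - mean(x)) * (y - mean(y)))
rhs <- mean(x*y) - mean(x)*mean(y)
c(lhs = lhs, rhs = rhs)             # the two quantities agree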

      The sample covariance is a measure of relationship between two variables and is defined as:

$$\operatorname{cov}(x_i, y_i) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

      The numerator of the covariance, $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$, is the sum of products of respective deviations of observations from their respective means. If there is no linear relationship between two variables in a sample, the covariance will equal 0. If there is a negative linear relationship, the covariance will be a negative number, and if there is a positive linear relationship, the covariance will be positive. Notice that measuring the covariance between two variables requires variability on each variable. If there is no variability in $x_i$, then $(x_i - \bar{x})$ will equal 0 for all observations. Likewise, if there is no variability in $y_i$, then $(y_i - \bar{y})$ will equal 0 for all observations on $y_i$. This is to emphasize the essential fact that when measuring the extent of relationship between two variables, one requires variability on each variable to motivate a measure of relationship in the first place.
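      A short R sketch (hypothetical values) makes the point concrete: when one variable has no variability, every deviation is 0, and the covariance is 0 no matter what the other variable does:

# Sketch: covariance requires variability on each variable.
x <- c(2, 2, 2, 2, 2)               # no variability in x
y <- c(1, 4, 2, 8, 5)               # y varies freely
cov(x, y)                           # 0: every (x_i - xbar) is 0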

      It is easy to understand more of what the covariance actually measures if we consider the trivial case of computing the covariance of a variable with itself. In such a case, for variable $x_i$, we would have

$$\operatorname{cov}(x_i, x_i) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})}{n - 1}$$

      But what is this covariance? If we rewrite the numerator as $\sum_{i=1}^{n} (x_i - \bar{x})^2$ instead of $\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})$, it becomes clear that the covariance of a variable with itself is nothing more than the usual variance for that variable. Hence, to better understand the covariance, it is helpful to start with the variance, and then realize that instead of computing the cross-product of a variable with itself, the covariance computes the cross-product of a variable with a second variable.
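      A quick check in R (with arbitrary example values) confirms that the covariance of a variable with itself is simply its variance:

# Sketch: cov(x, x) equals var(x).
x <- c(3, 7, 1, 9, 4)               # arbitrary example values
cov(x, x)                           # 10.2
var(x)                              # 10.2, the usual sample variance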

      We compute the covariance between parent height and child height in Galton's data:

> attach(Galton)
> cov(parent, child)
[1] 2.064614

      This raw covariance, however, is difficult to interpret on its own. The reason is that the size of the covariance will also be impacted by the degree to which there is variability in $x_i$ and the degree to which there is variability in $y_i$. If either or both variables contain sizeable deviations of the sort $(x_i - \bar{x})$ or $(y_i - \bar{y})$, then the corresponding cross-products $(x_i - \bar{x})(y_i - \bar{y})$ will also be quite sizeable, along with their sum, $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$. However, we do not want our measure of relationship to be small or large as a consequence of variability on $x_i$ or variability on $y_i$. We want our measure of relationship to be small or large as an exclusive result of covariability, that is, the extent to which there is actually a relationship between $x_i$ and $y_i$.
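      To see this scale dependence concretely, consider rescaling one of the Galton variables (a hypothetical change of units, used here only for illustration); the covariance is inflated by the same factor even though the relationship itself is unchanged:

# Sketch: covariance depends on the scale of each variable.
# Multiplying parent heights by 10 (a mere change of units) inflates
# the covariance tenfold, though the relationship is untouched.
cov(parent, child)                  # 2.064614, as above
cov(parent*10, child)               # 20.64614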
