Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis


average, in the long run, the statistic T is considered to be an unbiased estimator of θ if

$$E(T) = \theta$$

      That is, an estimator is considered unbiased if its expected value is equal to that of the parameter it is seeking to estimate. The bias of an estimator is measured by how much E(T) deviates from θ. When an estimator is biased, then E(T) ≠ θ, or, we can say E(T) − θ ≠ 0. Since the magnitude of the bias is a positive number, we can express this last statement as |E(T) − θ| > 0.

      Good estimators are, in general, unbiased. The most popular example of an unbiased estimator is that of the arithmetic sample mean since it can be shown that:

$$E(\bar{y}) = \mu$$
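As a quick numerical illustration of this property (a sketch, not from the text; the normal population, seed, and sample size are arbitrary assumptions), we can average the sample mean over many simulated samples and watch the long-run average settle near μ:

```python
import numpy as np

# Hypothetical population parameters, chosen only for illustration.
rng = np.random.default_rng(42)
mu, sigma = 10.0, 3.0

# Draw many independent samples of size n = 25 and record each sample mean.
sample_means = [rng.normal(mu, sigma, size=25).mean() for _ in range(20_000)]

# The long-run average of the statistic approximates its expected value,
# which for the sample mean is the population mean mu.
long_run_average = float(np.mean(sample_means))
print(long_run_average)  # settles very near 10.0
```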

      An example of an estimator that is biased is the uncorrected sample variance, as we will soon discuss, since it can be shown that

$$E(S^2) \neq \sigma^2$$

      However, S² is not asymptotically biased. As sample size increases without bound, E(S²) converges to σ². Once the sample variance is corrected via the following, it leads to an unbiased estimator, even for smaller samples:

$$s^2 = \frac{n}{n-1}S^2$$

      where now,

$$s^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n-1} \qquad E(s^2) = \sigma^2$$
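The bias of the uncorrected variance and the effect of the n − 1 correction can be seen directly by simulation (a sketch assuming a normal population; σ² = 4 and the other settings are arbitrary illustrative choices). NumPy's `ddof` argument switches between the two denominators:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 4.0  # true population variance (an assumed illustrative value)
n = 10        # deliberately small, where the bias of S^2 is visible

# Average each estimator over many replications to approximate its expectation.
biased, unbiased = [], []
for _ in range(50_000):
    y = rng.normal(0.0, np.sqrt(sigma2), size=n)
    biased.append(np.var(y, ddof=0))    # S^2: divides by n
    unbiased.append(np.var(y, ddof=1))  # s^2: divides by n - 1

e_S2 = float(np.mean(biased))    # near ((n - 1)/n) * sigma2 = 3.6
e_s2 = float(np.mean(unbiased))  # near sigma2 = 4.0
print(e_S2, e_s2)
```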

      An estimator is regarded as more efficient the lower its mean squared error. Estimators with lower variance are more efficient than estimators with higher variance. Fisher called this the criterion of efficiency, writing “when the distributions of the statistics tend to normality, that statistic is to be chosen which has the least probable error” (Fisher, 1922a, p. 316). Efficient estimators are generally preferred over less efficient ones.
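Fisher's efficiency criterion can be illustrated with a small simulation (a sketch under assumed normal data; the seed, sample size, and replication count are arbitrary): the sample mean and the sample median both estimate the center of a normal distribution, but the mean has the smaller sampling variance and is therefore the more efficient of the two.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 20_000

# Sampling distributions of two competing estimators of the center.
means = np.empty(reps)
medians = np.empty(reps)
for i in range(reps):
    y = rng.normal(0.0, 1.0, size=n)
    means[i] = y.mean()
    medians[i] = np.median(y)

var_mean = float(means.var())      # near 1/n = 0.02
var_median = float(medians.var())  # near pi/(2n), about 0.031, for normal data
print(var_mean, var_median)
```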

      An estimator is regarded as sufficient for a given parameter if the statistic “captures” everything we need to know about the parameter and our knowledge of the parameter could not be improved if we considered additional information (such as a secondary statistic) over and above the sufficient estimator. As Fisher (1922a, p. 316) described it, “the statistic chosen should summarize the whole of the relevant information supplied by the sample.” More specifically, Fisher went on to say:

      If θ be the parameter to be estimated, θ₁ a statistic which contains the whole of the information as to the value of θ, which the sample supplies, and θ₂ any other statistic, then the surface of distribution of pairs of values of θ₁ and θ₂, for a given value of θ, is such that for a given value of θ₁, the distribution of θ₂ does not involve θ. In other words, when θ₁ is known, knowledge of the value of θ₂ throws no further light upon the value of θ.

      (Fisher, 1922a, pp. 316–317)

      Returning to our discussion of moments, the variance is the second moment of a distribution taken about the mean. For the discrete case, variance is defined as:

$$\sigma^2 = E(y_i - \mu)^2 = \sum_{i=1}^{n}(y_i - \mu)^2 p(y_i)$$

      while for the continuous case,

$$\sigma^2 = E(y - \mu)^2 = \int_{-\infty}^{\infty}(y - \mu)^2 f(y)\,dy$$

      Since E(yᵢ) = μ, it stands that we may also write σ² as E[yᵢ − E(yᵢ)]². We can also express σ² as E(yᵢ²) − μ², since, when we distribute expectations, we obtain:

$$\sigma^2 = E(y_i - \mu)^2 = E(y_i^2 - 2\mu y_i + \mu^2) = E(y_i^2) - 2\mu E(y_i) + \mu^2 = E(y_i^2) - 2\mu^2 + \mu^2 = E(y_i^2) - \mu^2$$
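Both routes to σ², the defining sum Σ(y − μ)²p(y) and the shortcut E(y²) − μ², can be checked on a concrete discrete distribution. A fair six-sided die is a convenient illustrative choice (not an example from the text):

```python
# Variance of a fair six-sided die, computed two equivalent ways.
values = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # each face is equally likely

mu = sum(v * p for v in values)                      # E(y) = 3.5
var_direct = sum((v - mu) ** 2 * p for v in values)  # sum of (y - mu)^2 p(y)
e_y2 = sum(v ** 2 * p for v in values)               # E(y^2) = 91/6
var_shortcut = e_y2 - mu ** 2                        # E(y^2) - mu^2

print(var_direct, var_shortcut)  # both equal 35/12, about 2.9167
```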

      As earlier noted, taking the expectation of S², we find that E(S²) ≠ σ². The actual expectation of S² is equal to:

$$E(S^2) = \frac{n-1}{n}\sigma^2$$

      which implies the degree to which S² is biased is equal to:

$$E(S^2) - \sigma^2 = \frac{n-1}{n}\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}$$

      We have said that S² is biased, but you may have noticed that as n increases, (n − 1)/n approaches 1, and so E(S²) will equal σ² as n increases without bound. This was our basis for earlier writing that E(S²) converges to σ² as n → ∞. That is, we say that the estimator S², though biased for small samples, is asymptotically unbiased because its expectation is equal to σ² as n → ∞.
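To make the asymptotic claim concrete, a short sketch (σ² = 4 is an arbitrary illustrative value) tabulates the shrinking bias factor (n − 1)/n and the resulting E(S²) as n grows:

```python
# E(S^2) = ((n - 1)/n) * sigma2, so the bias vanishes as n grows.
sigma2 = 4.0  # assumed true variance, for illustration only

rows = []
for n in [5, 50, 500, 5000]:
    factor = (n - 1) / n  # approaches 1 as n increases without bound
    rows.append((n, factor, factor * sigma2))

for n, factor, e_S2 in rows:
    print(f"n={n:5d}  (n-1)/n={factor:.4f}  E(S^2)={e_S2:.4f}")
```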

      When we lose a degree of freedom in the denominator and rename S² to s², we get

$$s^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n-1}$$

      Recall that when we take the expectation of s², we find that E(s²) = σ² (see Wackerly, Mendenhall, and Scheaffer (2002, pp. 372–373) for a proof).

      The population standard deviation is given by the positive square root of σ², that is, σ = √σ². Analogously, the sample standard deviation is given by s = √s².

      Recall
