Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis


Figure 2.11 Student's t versus normal densities for 3 (left), 10 (middle), and 50 (right) degrees of freedom. As degrees of freedom increase, the limiting form of the t distribution is the z distribution.

$$z = \frac{\bar{y} - \mu_0}{\sigma / \sqrt{n}}$$

      where the numerator $\bar{y} - \mu_0$ represents the distance between the sample mean and the population mean $\mu_0$ under the null hypothesis, and the denominator $\sigma / \sqrt{n}$ is the standard error of the mean.

      In most research contexts, from simple to complex, we usually do not have direct knowledge of $\sigma^2$. When we do not have knowledge of it, we use the next best thing, an estimate of it. We can obtain an unbiased estimate of $\sigma^2$ by computing $s^2$ on our sample. When we do so, however, and use $s^2$ in place of $\sigma^2$, we can no longer pretend to “know” the standard error of the mean. Rather, we must concede that all we are able to do is estimate it. Our estimate of the standard error of the mean is thus given by:

$$\sqrt{\frac{s^2}{n}} = \frac{s}{\sqrt{n}}$$

      When we use $s^2$ (where $s^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / (n - 1)$) in place of $\sigma^2$, our resulting statistic is no longer a z statistic. That is, the ensuing statistic is no longer distributed as a standard normal variable (i.e., z). If it is not distributed as z, then what is it distributed as? Thanks to William Sealy Gosset, who worked for Guinness Breweries and published in 1908 under the pseudonym “Student” (Zabell, 2008), the ratio

$$t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$$

      was found to be distributed as a t statistic on n − 1 degrees of freedom. Again, the t distribution is most useful when sample sizes are rather small. For larger samples, as mentioned, the t distribution converges to the z distribution. If you are using rather large samples, say approximately 100 or more, whether you evaluate your null hypothesis using a z or t distribution will not matter much, because the critical values for z and t for such degrees of freedom (99 for the one-sample case) will be so alike that, practically at least, the two test statistics can be considered more or less equal. For even larger samples, the agreement is closer still.
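To make the claim about similar critical values concrete, here is a minimal Python sketch that uses only the standard library. The helper names (`t_pdf`, `t_cdf`, `t_critical`) are ours and do not come from the text. The t quantile is found by numerically integrating the density with Simpson's rule and then bisecting, an approach chosen to keep the example self-contained rather than efficient.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(df, p=0.975):
    """Quantile of the t distribution for p > 0.5, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Two-tailed 5% critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.975)

for df in (4, 30, 99):
    print(f"df = {df:>2}: t critical = {t_critical(df):.3f}, z critical = {z_crit:.3f}")
```

With 99 degrees of freedom the two-tailed 5% critical value for t is about 1.984, against about 1.960 for z, whereas with 4 degrees of freedom it is about 2.776.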

      The concept of convergence between z and t can be easily illustrated by inspecting the variance of the t distribution. Unlike the z distribution where the variance is set at 1.0 as a constant, the variance of the t distribution is defined as:

$$\sigma_t^2 = \frac{v}{v - 2}$$

      where v denotes the degrees of freedom (the variance is defined only for v > 2). For small degrees of freedom, such as v = 5, the variance of the t distribution is equal to:

$$\sigma_t^2 = \frac{5}{5 - 2} = \frac{5}{3} \approx 1.67$$

      Note what happens as v increases: the ratio $v / (v - 2)$ gets closer and closer to 1.0, which is the precise variance of the z distribution. For example, v = 20 yields:

$$\sigma_t^2 = \frac{20}{20 - 2} = \frac{20}{18} \approx 1.11$$

      That is, as v increases without bound, the variance of the t distribution approaches that of the z distribution, which is equal to 1.0.
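The convergence can be tabulated in a few lines of Python. This is only an illustrative sketch: the function name `t_variance` and the particular degrees of freedom are our own choices.

```python
def t_variance(v):
    """Variance of Student's t with v degrees of freedom (requires v > 2)."""
    if v <= 2:
        raise ValueError("the variance v / (v - 2) is defined only for v > 2")
    return v / (v - 2)


# The variance shrinks toward 1.0, the variance of z, as v grows.
for v in (5, 20, 100, 1000):
    print(f"v = {v:>4}: variance = {t_variance(v):.4f}")
```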

      We demonstrate the use of the one-sample t-test using SPSS. Consider the following small, hypothetical data set of IQ scores for five individuals:

      IQ 105 98 110 105 95

      Suppose that the hypothesized mean IQ in the population is equal to 100. The question we want to ask is: Is it reasonable to assume that our sampled data could have arisen from a population with mean IQ equal to 100? We assume we have no knowledge of the population standard deviation, and hence must estimate it from our sample data. To perform the one-sample t-test in SPSS, we run the following syntax:

      T-TEST /TESTVAL=100 /MISSING=ANALYSIS /VARIABLES=IQ /CRITERIA=CI(.95).

      The line /TESTVAL = 100 inputs the test value for our hypothesis test, which for our null hypothesis is equal to 100. We have also requested a 95% confidence interval for the mean difference.

One-Sample Statistics

      N    Mean        SD         SE Mean
IQ    5    102.6000    6.02495    2.69444

      We confirm from the above that the size of our sample is equal to 5, and the mean IQ for our sample is equal to 102.60 with standard deviation 6.02. The standard error of the mean of 2.69 reported by SPSS is not the true standard error of the mean. It is the estimated standard error of the mean, because we did not have knowledge of the population variance (otherwise we would have been performing a z-test instead of a t-test).
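These descriptive statistics can be cross-checked outside SPSS. The following is a minimal Python sketch using only the standard library. The variable names are ours, and `statistics.stdev` uses the n − 1 divisor, which matches the unbiased variance estimate described earlier.

```python
import math
import statistics

iq = [105, 98, 110, 105, 95]       # the five hypothetical IQ scores

n = len(iq)
mean = statistics.mean(iq)         # sample mean
sd = statistics.stdev(iq)          # sample standard deviation (divisor n - 1)
se = sd / math.sqrt(n)             # estimated standard error of the mean

print(f"N = {n}, Mean = {mean:.4f}, SD = {sd:.5f}, SE Mean = {se:.5f}")
```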

One-Sample Test (Test Value = 100)

                                                           95% Confidence Interval of the Difference
      t        df    Sig. (2-tailed)    Mean Difference    Lower      Upper
IQ    0.965    4     0.389              2.60000            −4.8810    10.0810
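As a cross-check on the SPSS table, the sketch below recomputes the t statistic, the two-tailed p-value, and the 95% confidence interval in plain Python. It rests on several assumptions of ours. The helper names `t_pdf` and `two_tailed_p` are hypothetical. The p-value is approximated by integrating the t density with Simpson's rule. The critical value 2.776445 for 4 degrees of freedom is taken from standard t tables rather than computed.

```python
import math
import statistics

iq = [105, 98, 110, 105, 95]
mu0 = 100                                  # hypothesized population mean

n = len(iq)
df = n - 1
diff = statistics.mean(iq) - mu0           # mean difference
se = statistics.stdev(iq) / math.sqrt(n)   # estimated standard error
t = diff / se


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return coef / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_value, df, steps=2000):
    """Two-tailed p-value, integrating the density over [0, |t|] (Simpson)."""
    x = abs(t_value)
    h = x / steps
    area = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3                          # P(0 <= T <= |t|)
    return 2 * (0.5 - area)


p = two_tailed_p(t, df)

T_CRIT = 2.776445                          # assumed tabled t(.975, df = 4)
lower = diff - T_CRIT * se
upper = diff + T_CRIT * se

print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
print(f"mean difference = {diff:.5f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The results should agree with the SPSS table to the reported precision: t ≈ 0.965 on 4 degrees of freedom, p ≈ 0.389, and an interval of roughly −4.88 to 10.08 for the mean difference.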

      We note the following from the SPSS output above:

       Our obtained t-statistic is equal to 0.965 and is evaluated on four degrees of freedom (i.e., n − 1 = 5 − 1 = 4). We lose a single degree of freedom because, in estimating the population variance $\sigma^2$ with $s^2$, we had to compute a sample mean, and this value is regarded as “fixed” as we carry on with our t-test.

       The
