which, assuming we had set our criterion for rejection at α = 0.05, leads us to the decision not to reject the null hypothesis. The two-tailed (as opposed to one-tailed, or directional) nature of the statistical test in this example means that we allow for a rejection of the null hypothesis in either direction from the value stated under the null. Since our null hypothesis is μ0 = 100, we were prepared to reject the null hypothesis for observed values of the sample mean that deviate "significantly" in either direction, above or below 100. Since our significance level was set at 0.05, we have 0.05/2 = 0.025 of the area in each tail of the t distribution to specify as our rejection region for the test. The question we are asking of our sample mean is: What is the probability of observing a sample mean that falls much greater or much less than 100? Because the observed sample mean can fall in only one tail or the other on any single trial (i.e., we are conducting a single "trial" when we run this experiment once), these two events are mutually exclusive, and so by the addition rule for mutually exclusive events we may add their probabilities. When we do, we get 0.025 + 0.025 = 0.05, which, of course, is our significance level for the test.
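These tail cutoffs are easy to inspect in R with the qt function. A minimal sketch, with df = 9 assumed purely for illustration (the excerpt does not state the degrees of freedom for this test):

> qt(0.025, df = 9)   # lower critical value of t
[1] -2.262157
> qt(0.975, df = 9)   # upper critical value of t
[1] 2.262157

A sample mean whose t statistic falls beyond either cutoff lands in the 0.025 rejection region of the corresponding tail.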

The actual mean difference observed is equal to 2.60, computed by taking the mean of our sample, 102.6, and subtracting the mean hypothesized under the null hypothesis, 100 (i.e., 102.6 − 100 = 2.60).

The 95% confidence interval of the difference is interpreted to mean that, with 95% confidence, the interval with lower bound −4.8810 and upper bound 10.0810 will capture the true parameter, which in this case is the population mean difference. We can see that 0 lies within the limits of the confidence interval, which again confirms why we were unable to reject the null hypothesis at the 0.05 level of significance. Had zero fallen outside the confidence interval limits, this would have been grounds to reject the null at a significance level of 0.05 (and, consequently, we would also have obtained a p-value of less than 0.05 for our significance test). Recall that the true mean (i.e., the parameter) is not the random component. Rather, the sample is the random component, on which the interval is then computed. It is important to emphasize this distinction when interpreting the confidence interval.
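For reference, this interval for the mean difference has the usual one-sample form, where $s$ is the sample standard deviation and $t_{\alpha/2,\,n-1}$ is the two-tailed critical value with $\alpha = 0.05$ (the raw data needed to evaluate it numerically are not reproduced in this excerpt):

$$(\bar{y} - \mu_0) \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$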

We can easily generate the same t-test in R. We first generate the vector of data and then carry out the one-sample t-test, which we notice mirrors the findings obtained in SPSS:
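Since the data vector itself is not reproduced in this excerpt, the following is a minimal sketch using a hypothetical vector of ten scores with mean 102.6; only the original data would reproduce the SPSS output exactly:

> iq <- c(105, 98, 110, 105, 95, 100, 113, 107, 95, 98)  # hypothetical data
> mean(iq)
[1] 102.6
> t.test(iq, mu = 100)   # one-sample t-test against the null mu0 = 100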

      2.20.2 t‐Tests for Two Samples

      Just as the t‐test for one sample is a generalization of the z‐test for one sample, for which we use s2 in place of σ2, the t‐test for two independent samples is a generalization of the z‐test for two independent samples. Recall the z‐test for two independent samples:

$$z = \frac{(\bar{y}_1 - \bar{y}_2) - [E(\bar{y}_1) - E(\bar{y}_2)]}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where $E(\bar{y}_1)$ and $E(\bar{y}_2)$ denote the expectations of the sample means $\bar{y}_1$ and $\bar{y}_2$, respectively (which are equal to $\mu_1$ and $\mu_2$).

When we do not know the population variances $\sigma_1^2$ and $\sigma_2^2$, we shall, as before, obtain estimates of them in the form of $s_1^2$ and $s_2^2$. When we do so, because we are using these estimates instead of the actual variances, our new ratio is no longer distributed as z. Just as in the one-sample case, it is now distributed as t:

$$t = \frac{(\bar{y}_1 - \bar{y}_2) - [E(\bar{y}_1) - E(\bar{y}_2)]}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

on degrees of freedom $v = n_1 - 1 + n_2 - 1 = n_1 + n_2 - 2$, which can also be written as

$$t = \frac{(\bar{y}_1 - \bar{y}_2) - [E(\bar{y}_1) - E(\bar{y}_2)]}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \text{where } s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Notice that the pooled estimate of the variance $s_p^2$ is nothing more than a weighted average of the two sample variances, each variance being weighted by its respective degrees of freedom ($n_1 - 1$ and $n_2 - 1$). This idea of weighting variances so as to arrive at a pooled value is not unique to t-tests. Such a concept forms the very fabric of how MS error is computed in the analysis of variance, as we shall see in Chapter 3 when we discuss the ANOVA procedure in some depth.
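The weighting can be made explicit in R. A minimal sketch using two hypothetical samples y1 and y2 (not data from the text):

> y1 <- c(12, 15, 11, 14, 13)
> y2 <- c(20, 22, 19, 25, 21, 23)
> n1 <- length(y1); n2 <- length(y2)
> # each sample variance weighted by its degrees of freedom (n - 1)
> ((n1 - 1)*var(y1) + (n2 - 1)*var(y2)) / (n1 + n2 - 2)
[1] 3.703704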

      2.20.3 Two‐Sample t‐Tests in R

Consider the following hypothetical data on pass-fail grades ("0" is fail, "1" is pass) for a seminar course with 10 attendees, five in each group; the scores for each group appear in the vectors grade.0 and grade.1 below.

To conduct the two-sample t-test, we generate the relevant vectors in R and then carry out the test:

> grade.0 <- c(30, 25, 59, 42, 31)
> grade.1 <- c(140, 90, 95, 170, 120)
> t.test(grade.0, grade.1)

        Welch Two Sample t-test

data:  grade.0 and grade.1
t = -5.3515, df = 5.309, p-value = 0.002549
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -126.00773  -45.19227
sample estimates:
mean of x mean of y
     37.4     123.0

Using a Welch adjustment for unequal variances (Welch, 1947), generated automatically by R, we conclude that there is a statistically significant difference between means (p = 0.003). With 95% confidence, we can say the true mean difference lies between a lower limit of approximately −126.0 and an upper limit of approximately −45.2. As a quick test of the assumption of equal variances (and to confirm, in a sense, whether the Welch adjustment was necessary), we can use var.test, which will produce a ratio of variances and evaluate the null hypothesis that this ratio is equal to 1 (i.e., if the variances are equal, the numerator of the ratio will be the same as the denominator):

> var.test(grade.0, grade.1)

        F test to compare two variances

data:  grade.0 and grade.1
F = 0.1683, num df = 4, denom df = 4, p-value = 0.1126
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.01752408 1.61654325
sample estimates:
ratio of variances
         0.1683105
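Since the F-test fails to reject the null hypothesis of equal variances (p = 0.1126), the Welch adjustment was arguably not required, and the classical pooled-variance (Student) version of the test could also be requested. A brief sketch using t.test's var.equal argument:

> # pooled-variance (Student) t-test; the default, var.equal = FALSE,
> # gives the Welch test reported above
> t.test(grade.0, grade.1, var.equal = TRUE)

Because the two groups are of equal size here, the t statistic itself is unchanged under pooling; only the degrees of freedom (8 rather than 5.309) and hence the p-value differ.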
