Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis
About Table 2.8:
In each block (1 through 5), participants within blocks are assumed to be more homogeneous on one or more variables than participants between blocks.
Participants are randomly assigned to condition (i.e., treatment 1 versus treatment 2) within each block.
Whether the blocks are naturally occurring or our sampling scheme is designed purposely to create them, we can exploit the homogeneity of participants within each block by including the blocking factor as a source of variation in our statistical analysis, so as to potentially reduce the error term of our statistical test.
The matched‐pairs design is a simpler version of the full‐blown randomized block design in which one can have more than just two levels of the independent variable (e.g., treatment 1 versus treatment 2 versus treatment 3). However, the principle behind the matched-pairs design and that of randomized block designs is the same, that of exploiting the covariance between conditions and removing it from the error term of the test statistic (t in matched‐pairs, F in randomized block designs).
In more advanced analyses such as repeated measures, longitudinal, and mixed effects modeling, we will say that subjects are nested within block. A nesting structure simply implies that subjects within a block share similarity compared to subjects between blocks. Good statistical analyses will attempt to account for this similarity, remove it from respective error terms for tests, and hence make the statistical test for effects more sensitive (i.e., more powerful).
As an example of a matched‐pairs situation, suppose we are interested in evaluating the effects of melatonin dose on average hours of sleep. However, we know that due to age, some people will naturally sleep longer than others irrespective of how much melatonin they receive. We do not want this natural sleep tendency due to age to confound the effect we are actually interested in studying (i.e., that of melatonin dose), and so we will match participants on their age level, or perhaps even crudely on age group (e.g., young, middle‐aged, old), and carry out our study within each age group. Then, when we perform statistical analyses, we will be able to extract this variation due to age out of the error term of the analysis, and hence boost statistical power for estimating the effect we are actually interested in (melatonin dosage).
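When we sample observations in pairs, as was true for the independent samples t‐test, the expectation of the difference between sample means is given by:

$$E(\bar{y}_1 - \bar{y}_2) = \mu_1 - \mu_2$$

However, because observations are sampled (or “matched”) in pairs, we naturally expect there to be a covariance different from zero between pairs. We can exploit this covariance and remove it from the error term of our statistical test. As given in Hays (1994, p. 339), the variance of the difference becomes

$$\sigma^2_{\bar{y}_1 - \bar{y}_2} = \sigma^2_{\bar{y}_1} + \sigma^2_{\bar{y}_2} - 2\,\mathrm{cov}(\bar{y}_1, \bar{y}_2)$$

with standard error equal to

$$\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\sigma^2_{\bar{y}_1} + \sigma^2_{\bar{y}_2} - 2\,\mathrm{cov}(\bar{y}_1, \bar{y}_2)}$$

Notice that we have subtracted $2\,\mathrm{cov}(\bar{y}_1, \bar{y}_2)$ from the sum of the two variances.

In the classic between‐subjects design where participants are not matched, the expectation is that covariance between treatments is equal to 0, and hence, we would have:

$$\sigma^2_{\bar{y}_1 - \bar{y}_2} = \sigma^2_{\bar{y}_1} + \sigma^2_{\bar{y}_2}$$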
The matched-pairs design is a very important concept in statistics and design of experiments, because this simple design is the starting point to understanding more complicated designs and modeling such as mixed effects and hierarchical models.
We analyze the hypothetical data in Table 2.8 using a paired samples t‐test in R by requesting paired = TRUE:

    > treat <- c(10, 15, 20, 22, 25)
    > control <- c(8, 12, 14, 15, 24)
    > t.test(treat, control, paired = TRUE)

        Paired t-test

    data:  treat and control
    t = 3.2827, df = 4, p-value = 0.03042
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.5860324 7.0139676
    sample estimates:
    mean of the differences
                        3.8
The obtained p‐value of 0.03 is statistically significant at a 0.05 level of significance. We reject the null hypothesis and conclude the population means for the treatment conditions to be different.
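To connect this result to the idea of removing the covariance from the error term, the following is a minimal sketch that rebuilds the paired t statistic by hand from the sample variances and covariance of the two conditions. The variable names (`se_paired`, `se_unmatched`, and so on) are illustrative choices, not the book's, and the "unmatched" standard error is shown only for comparison.

```r
# Hypothetical data from Table 2.8
treat   <- c(10, 15, 20, 22, 25)
control <- c(8, 12, 14, 15, 24)
n <- length(treat)

# Estimated variance of the difference between means,
# with twice the covariance between conditions subtracted
var_paired <- var(treat) / n + var(control) / n - 2 * cov(treat, control) / n
se_paired  <- sqrt(var_paired)

# The same quantity if the covariance were treated as zero
# (as in a design without matching)
se_unmatched <- sqrt(var(treat) / n + var(control) / n)

# Paired t statistic and two-sided p-value on n - 1 degrees of freedom
t_paired <- (mean(treat) - mean(control)) / se_paired
p_paired <- 2 * pt(-abs(t_paired), df = n - 1)

print(c(se_paired = se_paired, se_unmatched = se_unmatched,
        t = t_paired, p = p_paired))
```

Because the two conditions covary strongly in these data, the standard error with the covariance removed is much smaller than the one that ignores it, which is what makes the paired test more sensitive here.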
As a nonparametric alternative, the Wilcoxon signed‐rank test, the paired‐samples counterpart of the Wilcoxon rank‐sum test featured earlier, accommodates paired observations. For our data, we have:

    > wilcox.test(treat, control, paired = TRUE)

        Wilcoxon signed rank test

    data:  treat and control
    V = 15, p-value = 0.0625
    alternative hypothesis: true location shift is not equal to 0
Table 2.9 Randomized Block Design
| | Treatment 1 | Treatment 2 | Treatment 3 |
|---|---|---|---|
| Block 1 | 10 | 9 | 8 |
| Block 2 | 15 | 13 | 12 |
| Block 3 | 20 | 18 | 14 |
| Block 4 | 22 | 17 | 15 |
| Block 5 | 25 | 25 | 24 |
We notice that the obtained p‐value is somewhat greater for the nonparametric test than for the parametric one. In terms of significance tests, this emphasizes the fact that there is usually a cost to not being able to make parametric assumptions.
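To see where the signed‐rank p‐value comes from, here is a small sketch that enumerates the exact null distribution of V for the five pairs. It assumes no ties and no zero differences (true for these data), and the helper names (`signs`, `V_null`) are illustrative only.

```r
# Hypothetical data from Table 2.8
treat   <- c(10, 15, 20, 22, 25)
control <- c(8, 12, 14, 15, 24)

d <- treat - control          # paired differences
r <- rank(abs(d))             # ranks of the absolute differences
V <- sum(r[d > 0])            # sum of ranks attached to positive differences

# Under the null, each difference is equally likely to be positive or negative,
# so enumerate all 2^n sign patterns (1 = positive, 0 = negative)
n <- length(d)
signs  <- expand.grid(rep(list(c(0, 1)), n))
V_null <- as.matrix(signs) %*% r

# Two-sided p-value: double the upper-tail probability
# (appropriate here because the observed V lies in the upper tail)
p_two_sided <- min(1, 2 * mean(V_null >= V))

print(c(V = V, p = p_two_sided))
```

With only five pairs there are 32 equally likely sign patterns, and the observed V is the most extreme one possible, so 2/32 = 0.0625 is the smallest two‐sided p‐value this test can return at this sample size.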
2.24 BLOCKING WITH SEVERAL CONDITIONS
We have said that in a blocking design we expect the covariance between treatment conditions to be different from 0. Now, consider a design in which we once again block, but this time with more than two treatment levels.
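As a rough illustration of such a design, the data in Table 2.9 could be analyzed with an additive two‐way ANOVA in which block enters as a factor. This is only one common approach, sketched here under the assumption of no block‐by‐treatment interaction; it is not necessarily the analysis the text goes on to present, and the object names (`fit_block`, `fit_oneway`) are hypothetical.

```r
# Data from Table 2.9, entered block by block across treatments 1 to 3
y <- c(10,  9,  8,
       15, 13, 12,
       20, 18, 14,
       22, 17, 15,
       25, 25, 24)
block     <- factor(rep(1:5, each = 3))
treatment <- factor(rep(1:3, times = 5))

# Additive model: variation due to blocks is removed from the error term
fit_block <- aov(y ~ treatment + block)

# Same data with the blocking ignored (one-way ANOVA), for comparison
fit_oneway <- aov(y ~ treatment)

tab_block  <- summary(fit_block)[[1]]
tab_oneway <- summary(fit_oneway)[[1]]

# F statistic for treatment under each model (treatment is the first row)
F_block  <- tab_block[1, "F value"]
F_oneway <- tab_oneway[1, "F value"]

print(tab_block)
print(c(F_block = F_block, F_oneway = F_oneway))
```

In these data most of the total variation lies between blocks, so the F statistic for treatment is about 9.1 once blocks are accounted for, compared with about 0.5 when the blocking is ignored.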