Methods in Psychological Research. Annabel Ness Evans
In the subsections that follow, Knez (2001) reports the results and statistical tests for the effect of light condition on the various DVs. He reports one of the effects as a “weak tendency to a significant main effect” (p. 204) with a p value of .12. We would simply say that it was not statistically significant, ns. Indeed, many of Knez’s statistical tests produced p values greater than .05. We bring this to your attention as a reminder that even peer-reviewed journal articles need to be read with a critical eye. Don’t just accept everything you read. You need to pay attention to the p values and question when they are not less than .05. You also need to examine the numbers carefully to discern the effect size.
What is noticeably missing from the results section of Knez (2001), our example article, is a calculation of effect size. Effect size gives us some indication of the strength of the effect (see Chapter 4 for more detail). Remember, statistical significance tells us that an effect was likely not due to chance and is probably a reliable effect. What statistical significance does not indicate is how large the effect is. If we inspect the numbers in Knez’s article, we see that the effects were not very large. For example, on the short-term recall task, the best performance was from the participants in the warm-lighting conditions. They had a mean score of 6.9 compared with the other groups, with a mean score of about 6.25. A difference of only 0.65 of a word on a recall task seems like a pretty small effect, but then again, one would hardly expect that lighting conditions would have a dramatic effect on performance.
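To make the idea of effect size concrete, here is a minimal sketch of one common measure, Cohen's d, which expresses a mean difference in standard deviation units. The two means come from the passage above. The pooled standard deviation is an assumption made purely for illustration, because no standard deviation is quoted here, and we do not claim that d is the measure this book or Knez would use.

```python
def cohens_d(mean_1, mean_2, pooled_sd):
    """Standardized mean difference: (mean_1 - mean_2) / pooled_sd."""
    if pooled_sd <= 0:
        raise ValueError("pooled_sd must be positive")
    return (mean_1 - mean_2) / pooled_sd


# Means as quoted in the text above (short-term recall task).
warm_mean = 6.9
other_mean = 6.25

# ASSUMPTION: hypothetical pooled SD, chosen only to illustrate the arithmetic.
# It is NOT a value reported by Knez (2001).
assumed_pooled_sd = 2.0

d = cohens_d(warm_mean, other_mean, assumed_pooled_sd)
print(f"raw difference = {warm_mean - other_mean:.2f} words")
print(f"Cohen's d = {d:.3f} (under the assumed SD)")
```

The same raw difference of 0.65 words would be a larger standardized effect if the scores varied less, and a smaller one if they varied more, which is why the raw difference alone does not settle the question.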
Once you have finished reading the introduction, method, and results sections, you should have a pretty good idea about what was done, to whom, and what was found. In the discussion section, you will read the researcher’s interpretation of the research, comments about unexpected findings, and speculations about the importance of the work or its application.
The Discussion
The dissertation adviser of one of the authors of this book told her that he never read the discussion section of research reports. He was not interested in the interpretation of the authors. He interpreted the findings and their importance himself. We consider this good advice for seasoned researchers but not for students. The discussion section of a research article is where the author describes how the results fit into the literature. This is a discussion of the theories that are supported by the research and the theories that are not. It is also where you will find suggestions from the author as to where the research should go in the future—what questions are left unanswered and what new questions the research raises. Indeed, the discussion section may direct you in your selection of a research project. You may wish to contact the author to see if research is already being conducted on the questions posed in the discussion. Remember that it is important to be a critical consumer of research. Do not simply accept what is said in the discussion. Ask yourself if the results really do support the author’s conclusions. Are there other possible interpretations?
In the discussion section of our example article, Knez (2001) relates the findings to his previous work and the research of others. He discusses the lack of effect of light on mood and questions the mood measure that was used. We think that another possibility, which he does not explore, is that lighting may not have an influence on mood. He also describes the effect of light on cognitive performance as being something new to the literature. We could speculate that this small effect might not be a reliable finding. Certainly, the p values above .05 reported in the results section suggest that the effect may have been a fluke and that the study should be replicated before the finding is accepted. Again, as we said before, you need to be critical when reading the literature.
Basic Statistical Procedures
Tests of Significance
t Test.
The simplest experiment involves two groups, an experimental group and a control group. The researchers treat the groups differently (the IV) and measure their performance (the DV). The question, then, is “Did the treatment work?” Are the groups significantly different after receiving the treatment? If the research involves comparing means from two groups, the t test may be the appropriate test of significance. Be aware that the t test can also be used in nonexperimental studies. For example, a researcher who compares the mean performance of women with that of men might use a t test, but this is not an experiment.
Typically, a researcher will report the group means, whether the difference was statistically significant, and the t-test results. In essence, the t test is an evaluation of the difference between two means relative to the variability in the data. Simply reporting the group means is not enough, because a large difference between two means might not be statistically significant when examined relative to the large variability of the scores of each group. Alternatively, a small difference between two means may be statistically significant if there is very little variation in scores within each group. The t test is a good test when you want to compare two groups, but what if you have more than two groups?
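The role of variability can be seen in a short sketch. The function below computes the pooled-variance t statistic for two independent groups, and the scores are invented for illustration only. Both comparisons have the same 2-point difference between means, but the within-group variability differs.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


# ASSUMPTION: all scores below are hypothetical illustration data.
# Low variability within each group; means are 12 and 10.
t_low_var = independent_t([11, 12, 12, 12, 13], [9, 10, 10, 10, 11])

# High variability within each group; means are still 12 and 10.
t_high_var = independent_t([2, 7, 12, 17, 22], [0, 5, 10, 15, 20])

print(f"t with low variability:  {t_low_var:.2f}")
print(f"t with high variability: {t_high_var:.2f}")
```

With 8 degrees of freedom, the two-tailed critical value of t at the .05 level is 2.306. The low-variability comparison exceeds it and the high-variability comparison does not, even though the difference between the means is identical.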
F Test.
The F test of significance is used to compare the means of more than two groups. There are numerous experimental (and quasi-experimental) designs that are analyzed with the F test; collectively, these analyses are known as analysis of variance, or ANOVA. Indeed, when we were graduate students, we took entire courses in ANOVA. In general, the F test, like the t test, compares between-group variability with within-group variability.
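As a rough sketch of that comparison, the function below computes the F ratio for a one-way design with three groups: the between-group mean square divided by the within-group mean square. The scores are invented for illustration, and the function name is ours.

```python
import statistics


def one_way_f(*groups):
    """F ratio for a one-way ANOVA: MS between groups / MS within groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)

    # Between-group variability: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-group variability: how far each score is from its own group mean.
    ss_within = sum(
        (score - statistics.mean(group)) ** 2 for group in groups for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# ASSUMPTION: hypothetical scores for three groups, for illustration only.
f_ratio = one_way_f([4, 5, 6], [6, 7, 8], [8, 9, 10])
print(f"F(2, 6) = {f_ratio:.2f}")
```

A large F ratio means that the group means differ a great deal relative to the spread of scores within the groups. Whether a given F is statistically significant depends on the two degrees of freedom, which is why both are reported with the statistic.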
As with the t test, the researcher will report the group means and whether the differences were statistically significant. From a significant F test, the researcher knows that at least two means were significantly different. To specify which groups were different from which others, the researcher must follow the F test with post hoc (after the fact) comparisons. For example, if there were three groups and the F test was statistically significant, a post hoc test might find that all three group means were statistically significantly different or perhaps that only one mean differed from the other two. There are a large number of post hoc tests (e.g., Scheffé, Tukey’s honestly significant difference, and Bonferroni) that have slightly different applications. What is common to all these tests is that each produces a p value that is used to indicate which means differ from which.
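To illustrate how post hoc p values are used, here is a minimal sketch of the Bonferroni approach, the simplest of the tests named above: the significance level is divided by the number of pairwise comparisons, and each comparison's p value is judged against that stricter level. The group labels and p values are hypothetical and are not taken from any study.

```python
from itertools import combinations

groups = ["A", "B", "C"]
pairs = list(combinations(groups, 2))  # every pairwise comparison

alpha = 0.05
adjusted_alpha = alpha / len(pairs)  # Bonferroni: split alpha across comparisons

# ASSUMPTION: hypothetical p values, invented only to show the decision rule.
pairwise_p = {
    ("A", "B"): 0.030,
    ("A", "C"): 0.004,
    ("B", "C"): 0.200,
}

significant_pairs = [pair for pair in pairs if pairwise_p[pair] < adjusted_alpha]

print(f"{len(pairs)} comparisons, adjusted alpha = {adjusted_alpha:.4f}")
for pair in pairs:
    verdict = "significant" if pair in significant_pairs else "ns"
    print(f"{pair[0]} vs. {pair[1]}: p = {pairwise_p[pair]:.3f} ({verdict})")
```

Note that the A versus B comparison, with p = .030, would have passed an uncorrected .05 criterion but does not survive the correction. Guarding against such chance findings across multiple comparisons is the purpose of a post hoc test.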
As indicated above, many designs are analyzed with an F test, and they have names that indicate the number of IVs. You will find a one-way ANOVA used when there is one IV, a two-way ANOVA when there are two IVs, and a three-way ANOVA (you guessed it) when there are three. A null hypothesis is tested for each IV by calculating an F statistic. The advantage of the two- and three-way ANOVAs is that an interaction effect can also be tested. An interaction occurs when the effect of one IV on the DV depends on the level of another IV. For example, if we wanted to investigate the effect of environmental noise (silent vs. noisy) on reading comprehension and the effect of different-colored paper (white, yellow, and pink) on reading comprehension, we could use a two-way ANOVA to evaluate the effect of each IV and also whether the color of paper might interact with the noise to influence reading comprehension. It may be that noise produces a reduction in reading comprehension for white paper but not for yellow or pink paper. The interaction effect is important because it indicates that a variable is acting as a moderating variable. In this example, the effect of environmental noise on reading comprehension is moderated by the color of the paper.
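The noise and paper color example can be laid out as a table of cell means. The sketch below uses hypothetical comprehension scores that follow the pattern just described and computes the effect of noise separately for each paper color. It is descriptive only; deciding whether an interaction is statistically significant still requires the F test for the interaction term.

```python
# ASSUMPTION: hypothetical mean comprehension scores for each combination of
# paper color and noise condition, invented to match the pattern in the text.
cell_means = {
    "white": {"silent": 80, "noisy": 70},
    "yellow": {"silent": 80, "noisy": 80},
    "pink": {"silent": 80, "noisy": 80},
}

# Effect of noise at each level of paper color (silent mean minus noisy mean).
noise_effect = {
    color: means["silent"] - means["noisy"] for color, means in cell_means.items()
}

# If the effect of noise is not the same for every color, the pattern of
# means suggests an interaction between noise and paper color.
interaction_suggested = len(set(noise_effect.values())) > 1

for color, effect in noise_effect.items():
    print(f"{color}: noise lowers comprehension by {effect} points")
print(f"pattern suggests an interaction: {interaction_suggested}")
```

If the three differences were equal, the lines in a plot of these means would be parallel, and there would be a main effect of noise at most, with no interaction.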
There is another type of ANOVA that is used to control for a possible confounding variable. This procedure also uses the F statistic and is called analysis of covariance, or ANCOVA. Using our paper color example, suppose