Important note to students
If you’re reading this material and starting to get anxious, relax! Our intention here is to discuss these inferential statistics at a conceptual level. As we indicated earlier, when you begin reading the literature, it is unlikely that you will see research using t tests or simple ANOVAs. What you will see are complex statistics that may be completely new to you. Our goal is to give you enough information to understand what is being described in the literature.
More Complex Statistical Procedures
Multiple Regression.
If predicting someone’s performance using one predictor variable is a good idea, using more than one predictor variable is a better idea. Entire textbooks are devoted to multiple regression analysis techniques, but the basic idea is to use more than one predictor variable, X1, X2, X3, and so on, to predict one criterion variable, Y. As with simple regression, multiple regression involves fitting a line through your data, but first all the predictor variables are combined, and then that linear combination of Xs is correlated with Y. Multiple regression is easy to visualize with two predictors: the fitted surface is a plane in three-dimensional space. Imagining more than two predictors is difficult and, fortunately, not necessary. Multiple regression produces an R value (the multiple correlation) that reflects how well the linear combination of Xs predicts Y. Some predictor variables are likely to be better predictors of Y than others, and the analysis produces weights that can be used in a regression equation to predict Y. Simply multiply the value of each predictor variable by its respective weight, sum the products, and add the constant, and you have your predicted value.
Y(predicted) = B1(X1) + B2(X2) + B3(X3) + … + Constant
In addition to the weights used to predict criterion values, multiple regression analysis also provides standardized weights called beta (β) weights. These values tell us something about each individual predictor in the regression analysis. They can be interpreted much like an r value, with the sign indicating the relationship between the predictor variable and the criterion variable and the magnitude indicating the relative importance of the variable in predicting the criterion. Thus, in a multiple regression analysis, we can examine the relative contribution of each predictor variable in the overall analysis.
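To make this concrete, here is a minimal sketch in Python (our addition, not from the text), using the statsmodels library on invented data. The variable names, data, and weights are all hypothetical; the point is simply to show the raw weights (B values), the R value, and the standardized beta weights side by side.

# A minimal sketch on hypothetical data: multiple regression with two
# predictors, fit with statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)                        # predictor 1 (hypothetical)
x2 = rng.normal(size=n)                        # predictor 2 (hypothetical)
y = 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # criterion

X = sm.add_constant(np.column_stack([x1, x2])) # adds the Constant term
model = sm.OLS(y, X).fit()
print(model.params)                            # Constant, B1, B2
print(model.rsquared)                          # how well the Xs predict Y

# Standardized (beta) weights: refit after z-scoring every variable.
def z(v):
    return (v - v.mean()) / v.std(ddof=1)

betas = sm.OLS(z(y), np.column_stack([z(x1), z(x2)])).fit().params
print(betas)                                   # sign and magnitude read like r

No constant is needed in the second fit because z-scored variables have a mean of zero; the betas can then be compared directly to judge each predictor’s relative contribution.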
As you just learned, multiple regression is used to determine the influence of several predictor variables on a single criterion variable. Let’s look briefly at two useful concepts in multiple regression: (1) partial and (2) semipartial (also called part) correlation.
Partial Correlation.
Sometimes we would like to measure the relationship between two variables when a third has an influence on them both. We can partial out the effects of that third variable by computing a partial correlation. Suppose there is a correlation between age and income. It seems reasonable that older people might make more money than younger people. Is there another variable that you think might be related to age and income? How about years of education? Older people are more likely to be better educated, having had more years to go to school, and it seems likely that better-educated people earn more. So what is the true relationship between age and income if the variable years of education is taken out of the equation? One solution would be to group people by years of education and then conduct a number of separate correlations between age and income for each education group. Partial correlation, however, provides a better solution by telling us what the true relationship is between age and income when years of education has been partialled out.
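The residual logic behind partial correlation can be shown in a short Python sketch (ours, with made-up data for age, income, and years of education): regress both variables of interest on the third variable, then correlate what is left over.

# A minimal sketch, hypothetical data: partial correlation via residuals.
import numpy as np

def residuals(y, x):
    """Residuals of y after regressing it on x (removes x's influence)."""
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

rng = np.random.default_rng(1)
n = 200
education = rng.normal(14, 2, n)                    # years of schooling
age = 25 + 1.5 * education + rng.normal(0, 5, n)    # related to education
income = 3 * education + 0.5 * age + rng.normal(0, 10, n)

# Partial r(age, income | education): remove education from BOTH variables,
# then correlate the residuals.
r_partial = np.corrcoef(residuals(age, education),
                        residuals(income, education))[0, 1]
print(r_partial)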
Semipartial Correlation.
As we just discussed, in partial correlation we remove the influence of a third variable from both of the other variables and then calculate the correlation. But what if we want to remove that influence from only one of the other variables? This is called a semipartial correlation. For example, at our school, we accept senior students into our applied psychology program based on their grades in the first and second years. We have found a strong positive correlation between previous grades and performance in our program. Suppose we could also administer an entrance exam to use as another predictor, but the exam is expensive. We can use semipartial correlation to determine how much the entrance test would increase our predictive power over and above previous grades alone.
How do we do this? We correlate entrance test scores with performance in the program after first removing the influence of previous grades from program performance. This correlation tells us what relationship remains between entrance test scores and program performance when previous grades have been partialled out of program performance but not out of entrance test scores. Based on this correlation, we could decide whether an expensive entrance test improves our predictive ability enough to justify using it.
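The same residual trick from the partial correlation sketch applies, with one change: only the criterion is residualized. Here is a minimal sketch continuing the admissions example (all data and variable names invented by us).

# A minimal sketch, hypothetical data: semipartial correlation removes
# previous grades from program performance only, not from test scores.
import numpy as np

def residuals(y, x):
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

rng = np.random.default_rng(2)
n = 150
prev_grades = rng.normal(75, 8, n)
entrance_test = 0.5 * prev_grades + rng.normal(0, 6, n)
performance = 0.6 * prev_grades + 0.3 * entrance_test + rng.normal(0, 5, n)

# Correlate RAW entrance test scores with what remains of performance
# after previous grades have been partialled out of it.
r_semipartial = np.corrcoef(entrance_test,
                            residuals(performance, prev_grades))[0, 1]
print(r_semipartial)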
Logistic Regression.
Suppose you were interested in predicting whether a young offender will reoffend. You measure a number of possible predictor variables, such as degree of social support, integration in the community, job history, and so on, and then follow your participants for 5 years and record whether they reoffend. The predictor variables may be continuous, but the criterion variable is discrete: they reoffend or they don’t. When we have a discrete criterion variable, we use logistic regression. Just as we used a combination of the predictor variables to predict the criterion variable in multiple regression, we do the same thing in logistic regression. The difference is that instead of predicting a value for the criterion variable, we predict the likelihood of the occurrence of the criterion variable. We express this as an odds ratio, that is, the odds of reoffending divided by the odds of not reoffending. If the probability of reoffending is .75 (i.e., there is a 75% chance of reoffending), then the probability of not reoffending is .25 (1 − .75). The odds of reoffending are .75/.25, or 3 (i.e., 3:1), and the odds of not reoffending are .25/.75, or .33. The odds ratio of reoffending versus not reoffending is therefore 3/.33, or 9. In other words, the odds of reoffending are nine times the odds of not reoffending.
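The sketch below (ours, on invented data) first reproduces the odds arithmetic above and then fits an actual logistic regression with scikit-learn; exponentiating a coefficient gives the odds ratio associated with a one-unit change in that predictor.

# A minimal sketch, hypothetical data: logistic regression for a
# discrete criterion (reoffend or not).
import numpy as np
from sklearn.linear_model import LogisticRegression

# The worked arithmetic from the text:
p = 0.75
odds_reoffend = p / (1 - p)          # .75/.25 = 3
odds_not = (1 - p) / p               # .25/.75 = .33
print(odds_reoffend / odds_not)      # odds ratio = 9

# Fitting a logistic regression on invented predictors:
rng = np.random.default_rng(3)
n = 300
social_support = rng.normal(size=n)
job_history = rng.normal(size=n)
reoffend = ((-0.8 * social_support - 0.5 * job_history
             + rng.normal(size=n)) > 0).astype(int)   # 1 = reoffended

X = np.column_stack([social_support, job_history])
model = LogisticRegression().fit(X, reoffend)
print(model.predict_proba(X[:1]))    # [P(no reoffend), P(reoffend)]
print(np.exp(model.coef_))           # odds ratios per one-unit change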
Factor Analysis.
Factor analysis is a correlational technique we use to find simpler patterns of relationships among many variables.
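As a quick illustration (ours, not the authors’): six invented variables are generated from two underlying factors, and scikit-learn’s FactorAnalysis recovers the simpler pattern. The loadings show which observed variables hang together.

# A minimal sketch, hypothetical data: six observed variables reduced
# to two latent factors.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(4)
n = 500
f = rng.normal(size=(n, 2))                 # two latent factors
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                     [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
X = f @ loadings.T + 0.3 * rng.normal(size=(n, 6))  # observed variables

fa = FactorAnalysis(n_components=2).fit(X)
print(fa.components_)   # estimated loadings: which variables go together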