From a statistical perspective, EFA and CFA differ in terms of the constraints that are placed upon the factor structure prior to estimation of the model parameters. With EFA there are few, if any, constraints placed on the model parameters. Observed indicators are typically allowed to have nonzero relationships with all of the factors, and the number of factors is not constrained to be a particular number. Thus, the entire EFA enterprise is concerned with answering the question of how many factors underlie an observed set of indicators, and what structure the relationship between factors and indicators takes. In contrast, CFA models are highly constrained. In most instances, each indicator variable is allowed to be associated with only a single factor, with relationships to all other factors set to 0. Furthermore, the specific factor upon which an indicator is allowed to load is predetermined by the researcher. This is why having a strong theory and prior empirical evidence is crucial to the successful fitting of CFA models. Without such strong prior information, the researcher may have difficulty in properly defining the latent structure, potentially fitting an improper model to the data. The primary danger in fitting an incorrect model is that it may appear to fit the data reasonably well, based on statistical indices, while still misrepresenting the true latent structure. Without earlier exploration of the likely latent structure, the researcher would have no way of knowing this. CFA does have the advantage of being a fully determined model, which is not the case with EFA, as we have already discussed. Thus, it is possible to come to more definitive conclusions regarding which of several CFA models provides the best fit to a set of data, because they can be compared directly using familiar tools such as statistical hypothesis testing. Conversely, determining the optimal EFA model for a set of data is often not a straightforward or clear process, as we will see later in the book.
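To make this contrast concrete, the sketch below fits an unconstrained EFA to simulated data using scikit-learn's FactorAnalysis (one of several available EFA implementations; the sample size, indicators, and loading pattern are invented for illustration, not taken from the book). The point to notice is that a loading is freely estimated for every indicator-factor pair, whereas a CFA specification would fix most of these to 0 in advance.

```python
# A minimal sketch of the unconstrained nature of EFA, using scikit-learn's
# FactorAnalysis on simulated data (all values below are hypothetical).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(42)

# Simulate 6 indicators driven by 2 correlated latent factors.
n = 500
latent = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=n)
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],  # indicators 1-3: factor 1 only
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],  # indicators 4-6: factor 2 only
])
X = latent @ true_loadings.T + rng.normal(scale=0.5, size=(n, 6))

# EFA places no zero constraints: a loading is estimated for every
# indicator-factor pair, and only the number of factors is fixed by us.
# A CFA of these data would instead fix half of these loadings to 0.
efa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
print(efa.components_.T.round(2))  # full 6 x 2 matrix of estimated loadings
```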
In summary, EFA and CFA sit at opposite ends of a modeling continuum, separated by the amount of prior information and theory available to the researcher. The more such information is available and the stronger the theory, the more appropriate CFA will be. Conversely, the less prior evidence there is, and the weaker the theories about the latent structure, the more appropriate EFA will be. Finally, researchers should take care not to use both EFA and CFA on the same set of data. In cases where a small set of CFA models do not fit a set of sample data well, a researcher might use EFA in order to investigate potential alternative models. This is certainly an acceptable approach; however, the same set of data used to investigate these EFA-based alternatives should not then be used with an additional CFA model to validate what exploration has suggested might be optimal models. In such cases, the researcher would need to obtain a new sample upon which the CFA would be fit in order to investigate the plausibility of the EFA findings. If the same data were used for both analyses, the CFA model would likely exhibit spuriously good fit, given that the factor structure being tested was itself derived from that same sample through the EFA.
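As a rough illustration of the split-sample strategy just described, the sketch below reserves half of a (simulated) sample for exploration and holds the other half out for a later confirmatory analysis. The data and candidate factor counts are hypothetical, and the CFA step itself would typically be carried out in dedicated SEM software, which is not shown here.

```python
# Hypothetical sketch of the split-sample workflow: explore on one half of
# the data and reserve the other half for a separate confirmatory analysis.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 8))  # placeholder for real indicator data

# Half of the sample is used for exploration; the other half is held out.
X_explore, X_confirm = train_test_split(X, test_size=0.5, random_state=7)

# Exploratory step: compare candidate factor counts on X_explore only.
for k in (1, 2, 3):
    efa = FactorAnalysis(n_components=k).fit(X_explore)
    print(k, efa.score(X_explore))  # average log-likelihood per observation

# The structure suggested here would then be specified as a constrained CFA
# and fit to X_confirm (typically in SEM software such as lavaan or semopy);
# fitting it back to X_explore would produce spuriously good fit.
```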
EFA and Other Multivariate Data Reduction Techniques
Factor analysis belongs to a larger family of statistical procedures known collectively as data reduction techniques. In general, all data reduction techniques are designed to take a larger set of observed variables and combine them in some way so as to yield a smaller set of variables. The differences among these methods lie in the criteria used to combine the initial set of variables. We discuss the criterion used in EFA at some length in Chapter 3, namely the effort to find a factor structure that yields accurate estimates of the covariance matrix of the observed variables using a smaller set of latent variables. Another statistical analysis with the goal of reducing the number of observed variables to a smaller number of unobserved variates is discriminant analysis (DA). DA is used in situations where a researcher has two or more groups in the sample (e.g., treatment and control groups) and would like to gain insights into how the groups differ on a set of measured variables. However, rather than examining each variable separately, it is more statistically efficient to consider them collectively. In order to reduce the number of variables to consider in this case, DA can be used. As with EFA, DA combines the observed variables with one another into a smaller set of latent variables, here called discriminant functions. In this case, the algorithm finds the combination(s) of observed variables that maximize the group mean differences on the resulting functions. The number of possible discriminant functions is the minimum of p and J − 1, where p is the number of observed variables and J is the number of groups. The functions resulting from DA are orthogonal to one another, meaning that they reflect different aspects of the shared group variance associated with the observed variables. The discriminant functions in DA can be expressed as follows:
$$D_{fi} = w_{f1}x_{1i} + w_{f2}x_{2i} + \cdots + w_{fp}x_{pi} \quad \text{(Equation 1.1)}$$

where

$D_{fi}$ = value of discriminant function $f$ for individual $i$

$w_{fp}$ = discriminant weight relating function $f$ and variable $p$

$x_{pi}$ = value of variable $p$ for individual $i$.
For each of these discriminant functions ($D_f$), there is a set of weights that are akin to regression coefficients, as well as correlations between the observed variables and the functions. Interpretation of the DA results usually involves an examination of these correlations. An observed variable having a large correlation with a discriminant function is said to be associated with that function, in much the same way that indicator variables with large loadings are said to be associated with a particular factor. Quite frequently, DA is used as a follow-up procedure to a statistically significant multivariate analysis of variance (MANOVA). Variables strongly associated with a discriminant function whose means differ significantly among the groups can be concluded to contribute to the group mean difference associated with that function. In this way, the functions can be characterized just as factors are, by considering the variables that are most strongly associated with them.
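The following sketch illustrates these ideas with scikit-learn's LinearDiscriminantAnalysis on simulated data (the group structure and variable count below are invented). It shows that the number of functions is capped at min(p, J − 1), and it computes the variable-function correlations used in interpretation.

```python
# A sketch of DA with scikit-learn's LinearDiscriminantAnalysis on
# simulated data (group structure and variable count are hypothetical).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Three groups (J = 3) measured on five variables (p = 5), so at most
# min(p, J - 1) = 2 discriminant functions can be extracted.
n_per, p, J = 100, 5, 3
group_means = rng.normal(scale=1.5, size=(J, p))
X = np.vstack([rng.normal(loc=group_means[j], size=(n_per, p)) for j in range(J)])
y = np.repeat(np.arange(J), n_per)

lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)

# lda.scalings_ holds the weights (the w's of Equation 1.1); transform()
# applies them to the centered data to produce function scores D_fi.
scores = lda.transform(X)
print(scores.shape)  # (300, 2): two discriminant functions, as expected

# Interpretation step described in the text: correlate each observed
# variable with each discriminant function (structure correlations).
structure = np.array([[np.corrcoef(X[:, v], scores[:, f])[0, 1]
                       for f in range(scores.shape[1])]
                      for v in range(p)])
print(structure.round(2))  # p x 2 matrix of variable-function correlations
```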
Canonical correlation (CC) works in much the same fashion as DA, except that rather than having a set of continuous observed variables and a categorical grouping variable, CC is used when there are two sets of continuous variables whose relationship we want to understand. As an example, consider a researcher who has collected intelligence test data that yields five subtest scores. In addition, she has also measured executive functioning for each subject in the sample, using an instrument that yields four subtests. The research question to be addressed in this study is, how strongly related are the measures of intelligence and executive functioning? Certainly, individual correlation coefficients could be used to examine how pairs of these variables are related to one another. However, the research question in this case is really about the extent and nature of relationships between the two sets of variables. CC is designed to answer just this question, by combining each set into what are known as canonical variates. As with DA, these canonical variates are orthogonal to one another, so that they extract all of the shared variance between the two sets. However, whereas DA creates the discriminant functions by finding the linear combinations of the observed indicators that maximize group mean differences on the functions, CC finds the linear combinations of each variable set that maximize the correlation between the resulting canonical variates. Just as with DA, each observed variable is assigned a weight that is used in creating the canonical variates. The canonical variate is expressed as in Equation 1.2.
$$C_{vi} = w_{c1}x_{1i} + w_{c2}x_{2i} + \cdots + w_{cp}x_{pi} \quad \text{(Equation 1.2)}$$

where

$C_{vi}$ = value of canonical variate $v$ for individual $i$.
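A minimal sketch of this analysis appears below, using scikit-learn's CCA (an iterative, PLS-style implementation of canonical correlation) with simulated stand-ins for the five intelligence and four executive functioning subtests from the example above; the data-generating values are invented. The correlation printed for each pair of canonical variates is the quantity that CC maximizes.

```python
# A minimal sketch of CC with scikit-learn's CCA, using simulated stand-ins
# for the five intelligence and four executive functioning subtests.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(1)
n = 400
shared = rng.normal(size=(n, 1))  # common signal linking the two sets

intel = 0.7 * shared + rng.normal(scale=0.7, size=(n, 5))  # 5 intelligence subtests
execf = 0.6 * shared + rng.normal(scale=0.8, size=(n, 4))  # 4 executive subtests

# At most min(5, 4) = 4 pairs of canonical variates could be extracted;
# we request the first two pairs here.
cca = CCA(n_components=2).fit(intel, execf)
U, V = cca.transform(intel, execf)  # canonical variate scores (Equation 1.2)

# The correlation between each pair of variates is the quantity CC maximizes.
for k in range(2):
    print(k + 1, round(float(np.corrcoef(U[:, k], V[:, k])[0, 1]), 3))
```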