Exploratory Factor Analysis. W. Holmes Finch



Exploratory Factor Analysis - W. Holmes Finch Quantitative Applications in the Social Sciences


      Figure 1.1 Example Latent Model Structure

      We can see that each observed variable, represented by the squares, is linked to the latent variable, denoted as F1, with unidirectional arrows. These arrows come from the latent to the observed variables, indicating that the former has a causal impact on the latter. Note also that each observed variable has an additional unique source of variation, known as error and represented by the circles at the far right of the diagram. Error represents everything that might influence scores on the observed variable, other than the latent variable that is our focus. Thus, if the latent variable is mathematics aptitude, and the observed variables are responses to five items on a math test, then the errors are all of the other things that might influence those math test responses, such as an insect buzzing past, distracting noises occurring during the test administration, and so on. Finally, latent variables (i.e., the factor and error terms) in this model are represented by circles, whereas observed variables are represented by squares. This is a standard way in which such models are diagrammed, and we will use it throughout the book.
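      The structure in Figure 1.1 can be made concrete with a small simulation. The sketch below assumes that each observed score is a loading times the latent factor score plus a unique error term. The function names, the loading values, and the sample size are hypothetical and chosen only for illustration; none of them come from the text. Under these assumptions, any two items correlate only because they share F1, so the model-implied correlation is the product of their loadings.

```python
import random


def simulate_one_factor(n_people, loadings, seed=0):
    """Generate item scores from one latent factor plus unique errors."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_people):
        f1 = rng.gauss(0.0, 1.0)  # latent factor score, never observed in practice
        row = []
        for lam in loadings:
            # Unique error with variance 1 - lam^2, so each item has unit variance.
            error = rng.gauss(0.0, (1.0 - lam ** 2) ** 0.5)
            row.append(lam * f1 + error)
        data.append(row)
    return data


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical loadings for five math items (assumed values, not from the book).
loadings = [0.8, 0.7, 0.6, 0.7, 0.8]
scores = simulate_one_factor(5000, loadings, seed=1)
items = list(zip(*scores))  # one tuple of scores per item

r12 = pearson(items[0], items[1])
print("observed r(item1, item2):", round(r12, 2))
print("model-implied r:", round(loadings[0] * loadings[1], 2))
```

      In real data only the item scores are available. The factor score and the errors are generated here solely to show where the correlations among the items come from.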

      In summary, we conceptualize many constructs of interest in the social sciences as latent, or unobserved. These latent variables, such as intelligence or aptitude, are very important both for understanding individual human beings and for understanding the broader world around us. However, these constructs are frequently not directly measurable, meaning that we must use some proxy, or set of proxies, to gain insight into them. These proxy measures, such as items on psychological scales, are linked to the latent variable through a causal model, whereby the latent variable directly causes manifest outcomes on the observed variables. All other forces that might influence scores on these observed variables are lumped together in a latent variable that we call error, which is unique to each individual indicator variable. Next, we will describe the importance of theory in both constructing and attempting to measure these latent variables.

      The Importance of Theory in Doing Factor Analysis

      As we discussed in the previous section, latent variables are not directly observable, and we only learn about them indirectly through their impact on observed indicator variables. This is a very important concept for us to keep in mind as we move forward in this book, and with factor analysis more generally. How can we know that performance or scores on the observed variables are in fact caused by the latent variable of interest? The short answer is that we cannot know for sure. Indeed, we cannot know that the latent variable does in fact exist. Is depression a concrete, real disease? Is extraversion an actual personality trait? Is there such a thing as reading aptitude? The answer to these questions is that we do not know for sure. How, then, can we state that an individual suffers from depression, that Juan is a good reader, or that Yi is an extravert? We can make such statements because we have developed a theoretical model that explains how our observed scores should be linked to these latent variables. For example, psychologists have taken prior empirical research as well as existing theories about mood to construct a theoretical explanation for a set of behaviors that connote the presence (or absence) of depression. These symptoms might include sleep disturbance (trouble sleeping or sleeping too much), a lack of interest in formerly pleasurable activities, and contemplation of suicide. Taken alone, each of these is simply a behavior that could arise from a variety of sources unique to it. Perhaps an individual has trouble sleeping because he is excited about a coming job change. However, if there is a theoretical basis for linking all of these behaviors together through some common cause (depression), then we can use observed responses on a questionnaire asking about them to make inferences about the latent variable.

      Similarly, political scientists have developed conceptual models of political outlook to characterize how people view the world. Some people have views that are characterized as being conservative, others have liberal views, and still others fall somewhere in between the two. This notion of political viewpoint is based on a theoretical model and is believed to drive attitudes that individuals express regarding particular societal and economic issues, which in turn are manifested in responses to items on surveys. However, as with depression, it is not possible to say with absolute certainty that political viewpoint is a true entity. Rather, we can only develop a model and then assess the extent to which observations taken from nature (i.e., responses to survey questions) match what our theory predicts.

      Given this need to provide a rationale for any relationships that we see among observed variables, and that we believe result from some unobserved variable, having strong theory is crucial. In short, if we are to make claims about an unobserved variable (or variables) causing observed behaviors, then we need to have some conceptual basis for doing so. Otherwise, the claims about such latent relationships carry no weight. Given that factor analysis is the formalized statistical modeling of these latent variable structures, theory should play an essential role in its use. This means that prior to conducting factor analysis, we should have a theoretical basis for what we expect to find in terms of the number of latent variables (factors), and for how observed indicator variables will be associated with these factors.

      This does not mean that we cannot use factor analysis in an exploratory way. Indeed, the entire focus of this text is on exploratory factor analysis. However, it does mean that we should have some sense of what the latent variable structure is likely to be. This translates into having a general sense of the number of factors that we are likely to find (e.g., somewhere between two and four), and of how the observed variables would be expected to group together (e.g., items 1, 3, 5, and 8 should be measuring a common construct and thus should group together on a common factor). Without such a preexisting theory about the likely factor structure, we will not be able to ascertain when we have an acceptable factor solution and when we do not. Remember, we are using observed data to determine whether predictions from our factor model are accurate. This means that we need to have a factor model that is sufficiently well developed to make predictions about what the results should look like. For example, what does theory say about the relationship between depression and sleep disturbance? It says that individuals suffering from depression will experience what for them are unusual sleep patterns. Thus, we would expect depressed individuals to indicate that they are indeed suffering from unusual sleep patterns. In short, having a well-constructed theory about the latent structure that we are expecting to find is crucial if we are to conduct the factor analysis properly and make good sense of the results that it provides to us.
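      The idea of checking data against a theoretical prediction can be illustrated with simulated responses. The sketch below is not a factor analysis. It only assumes a hypothetical theory in which items 1, 3, 5, and 8 of an eight-item scale share one factor and the remaining items share a second, uncorrelated factor, and then asks whether items the theory groups together correlate more strongly than items it separates. The loading of 0.7, the sample size, and all variable names are assumptions made for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)
group_a = {0, 2, 4, 7}  # items 1, 3, 5, 8 (zero-based), per the hypothetical theory
n_items, n_people, loading = 8, 3000, 0.7  # assumed values

# Simulate responses: each item depends on exactly one of two uncorrelated factors.
columns = [[] for _ in range(n_items)]
for _ in range(n_people):
    factor_a = rng.gauss(0.0, 1.0)
    factor_b = rng.gauss(0.0, 1.0)
    for j in range(n_items):
        factor = factor_a if j in group_a else factor_b
        error = rng.gauss(0.0, (1.0 - loading ** 2) ** 0.5)
        columns[j].append(loading * factor + error)

# Compare correlations for item pairs the theory groups together vs. apart.
within, between = [], []
for i in range(n_items):
    for j in range(i + 1, n_items):
        r = pearson(columns[i], columns[j])
        same_group = (i in group_a) == (j in group_a)
        (within if same_group else between).append(r)

mean_within = sum(within) / len(within)
mean_between = sum(between) / len(between)
print("mean r, same predicted factor:", round(mean_within, 2))
print("mean r, different predicted factors:", round(mean_between, 2))
```

      With real responses, a pattern that departs clearly from the predicted grouping would be a reason to question either the theory or the items.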

      Comparison of Exploratory and Confirmatory Factor Analysis

      Factor analysis models, as a whole, exist on a continuum. At one extreme is the purely exploratory model, which incorporates no a priori information, such as the possible number of factors or how indicators are associated with factors. At the other extreme lies a purely confirmatory factor model in which the number of factors, as well as the way in which the observed indicators group onto these factors, is provided by the researcher. These modeling frameworks differ both conceptually and statistically. From a conceptual standpoint, exploratory models are used when the researcher has little or no prior information regarding the expected latent structure underlying a set of observed indicators. For example, if very little prior empirical work has been done with a set of indicators, or there is not much in the way of a theoretical framework for a factor model, then by necessity the researcher would need to engage in an exploratory investigation of the underlying factor structure. In other words, without prior information on which to base the factor analysis, the researcher cannot make any presuppositions regarding what the structure might look like, even with regard to the number of factors underlying the observed indicators. In other situations, there may be a strong theoretical basis upon which a hypothesized latent structure rests, such as when a scale has been developed using well-established theories. However, if very little prior empirical work exists exploring this structure, the researcher may not be able to use a more confirmatory approach and thus would rely on exploratory factor analysis (EFA) to examine several possible factor solutions, which might be limited in terms of the number of latent variables by the theoretical framework upon which the model is based. Conceptually, a confirmatory factor analysis (CFA) approach would be used when there is both a strong theoretical expectation
