An Introduction to Text Mining. Gabe Ignatow


mining.

      2 Recognize the interdependence of philosophical assumptions and decisions about methodology.

      3 Summarize what is meant by the “two cultures” in academia.

      4 Position your own research project in terms of debates over positivism and postpositivism.

      Introduction

      You may be tempted to skim over or even skip this chapter entirely, and it is certainly possible to make use of the more technical later chapters of this textbook without giving much thought to epistemology, ontology, metatheory, or inferential logic. But if you are in the early stages of a text mining research project, you would do well to read this chapter carefully. As we discussed in the Preface to this textbook, just as the foundations of a house must be properly designed and built if the house is to last, the philosophical foundations of your research project should be as solidly constructed as possible. Text mining research often involves making strong inferences about groups of people based on the texts they produce. Researchers working with these tools frequently claim to know something about the language people use that those same people do not themselves know; justifying such claims is not a simple matter. Several academic fields are relevant to questions about when researchers are justified in using digital texts to make inferences about social groups. These fields include the philosophy of science (Curd, Cover, & Pincock, 2013), the philosophy of technology (Kaplan, 2009), and science and technology studies (STS; Kleinman & Moore, 2014).

      A historical example may be in order. Like text mining technologies today, a century ago the lie detector (polygraph machine) was a revolutionary technology with social implications that could not have been predicted. As with text mining technologies, it was claimed that lie detector technology would allow scientists to extend their powers of perception and even know what people were thinking. Just as a lie detector can potentially reveal things about individuals that they themselves do not know or would prefer not to reveal, text mining tools can potentially reveal what members of a group or community are thinking and feeling. But is it true that lie detectors can reveal whether people are attempting to deceive? What do data produced by lie detectors mean? How should these data be used? Lie detector technology itself does not provide answers to these questions. Instead, it took decades for individual scientists and scientific, legal, and criminal justice institutions to sort out what lie detectors can and cannot accomplish and how the data they produce could be used ethically (see Alder, 2007; Bunn, 2012). And even today there is often disagreement about the results of polygraph tests. In the same way, scientific institutions and public and private sector organizations are still in the early stages of sorting out what kinds of conclusions can be drawn from text mining research. This sorting-out process involves technical discussions but also philosophical discussions about knowledge, facts, and language.

      The philosophy of social science is one of the main fields in which researchers debate how socially sensitive research technologies such as polygraphs and text mining tools can and should be used. The philosophy of social science is an academic research area that lies at the intersection of philosophy and contemporary social science. Philosophers of social science develop and critique concepts that are foundational to the practice of social science research (Howell, 2013). They critically analyze epistemological assumptions in social research, which are assumptions about the nature of knowledge. They also analyze ontological assumptions, which are assumptions about the nature of reality, and metatheoretical assumptions, which are assumptions about the capacities and limitations of scientific theories. Social scientists often make claims about the validity and generalizability of their findings, the adequacy of their research designs, and why one theory is superior to another. Such claims are grounded in epistemological, ontological, and metatheoretical positions that are generally implicit (Woodwell, 2014). The philosophy of social science allows us to bring these positions to light and helps us understand why different approaches to social science research can, or cannot, make use of each other’s findings. In this section we briefly review what we have found to be the most critical philosophical issues that arise in text mining research, and we discuss some of the practical implications of different philosophical positions.

      Ontological and Epistemological Positions

      When are we justified in reaching a conclusion about some person or group of people based on the texts they produce? Does text mining research produce findings that are merely interesting, or can it produce findings that are true and accurate reflections of reality?

      Every approach to social science research addresses these kinds of questions based on one or another philosophical position. But the philosophical foundations of text mining research are uniquely unsettled because text mining methods are, for the most part, “mixed methods” (Creswell, 2014; Teddlie & Tashakkori, 2008) that are positioned at the intersection of the “two cultures” of the sciences and the humanities (Snow, 1959/2013). The phrase “two cultures” comes from a 1959 lecture and subsequent book by the British novelist and scientist C. P. Snow. Snow was referring to the loss in Western society of a common culture as a result of the division between the sciences and humanities, a division that he saw as an impediment to solving social problems.

      Although the idea of two cultures may seem simplistic, within the social sciences there continues to be a divide between more scientific and more humanistic forms of knowledge. These are sometimes referred to as nomothetic and idiographic knowledge, respectively (see Chapter 5), although social scientists themselves more often refer to scientific positivism and postpositivism. Positivism is a paradigm of inquiry that prioritizes quantification, hypothesis testing, and statistical analysis; postpositivism is a more interpretive paradigm that values close reading and multiple interpretations of texts. In practice, text mining and text analysis research is usually performed as a pragmatic combination of these two paradigms. Because positivism and postpositivism are premised on different epistemological and ontological positions, they often produce research findings that are “incommensurable,” meaning that they cannot build upon one another. Positivism and postpositivism are based on epistemological and ontological orientations that can be sorted into the following five philosophical positions (Howell, 2013).

      Correspondence Theory

      The first philosophical position that provides a foundation for social research, correspondence theory, is a traditional model of knowledge and truth associated with scientific positivism. This position holds that there is a correspondence between truth and reality and that notions of truth and reality correspond with things that actually exist in the world, be they earthworms, comets, chemical reactions, or people’s thoughts and ideas. It is understood that there are relationships between things that exist in the world and the concepts we use to describe and understand them, and the truth of concepts is gauged by how they relate to an objective reality that exists independent of how people think and talk about it. Thus, truth and knowledge are universal and absolute, and the goal of theories, be they from the social, natural, or physical sciences, is to accurately reflect objective reality through the precise use of thoughts, words, and symbols.

      An implication of correspondence theory for text analysis research projects is that the goal of such projects should be to learn objective facts about online groups and communities based on the documents they produce. If text mining and analysis methods are properly applied, they ought to be able to uncover facts about social groups that are objective and therefore incontrovertible.

      Coherence Theory

      In the second major philosophical position that has influenced social science, coherence theory, truth, knowledge, and theory must fit within a coherent system of propositions. Such a system of propositions
