Data Theory. Simon Lindgren
The third chapter, Unintended Consequences, continues the argument that pre-digital social theory can be repurposed to make sense of ambivalent sociality in a datafied society. In the chapter, we approach US President Donald Trump’s infamous ‘covfefe’ tweet from the perspective of the sociology of unanticipated consequences, in order to disentangle the twisted web of tweets, talk, and discourse that surrounds it. This case study is presented before we delve deeper into the territory of computational methods in the chapters that follow, to illustrate how social theory can aid the disentanglement of ambivalent online social practice. In this particular case, we draw on sociologist Robert K. Merton’s perspective on the sometimes unpredictable, and possibly ambivalent, relationships between what people do, or intend, and the outcomes of those actions.
Chapter 4, Actor-Networks, provides an example of how computational approaches can be combined with interpretive theoretical analysis. It does so in an area, science, technology and society studies (Callon et al., 1983; Marres, 2017, pp. 106–8), where such connections have already been made and where there is great potential. The case analysis in the chapter is based on a dataset of 1.1 million tweets, collected using search terms relating to climate change discourse. The chapter uses these data to explore how computational approaches to text analysis can be brought together with actor-network theory (Callon, 2001; Latour, 2005; Law, 1999), by combining elements of the theory with suitable techniques for processing the tweets. First, an analysis based in actor-network theory needs to identify the social actors (human and others) in the social context under analysis. In this research example, this is done with the help of the computational linguistics technique of Named Entity Recognition (Grishman and Sundheim, 1996), which algorithmically identifies and tags any names of people, places, organisations, corporations, nationalities, events, and so on, that appear in the tweets. Second, actor-network theory is interested in how actors connect in relational systems. It wants to map chains of association between humans, things, and ideas that play a part in how social reality is constructed, and how ‘truths’ are manifested. In this chapter’s case example, information about such associations was gained by analysing the network contexts of the mapped actors with the help of topic modelling through so-called Latent Dirichlet Allocation (Blei, Ng, and Jordan, 2003).
The information gathered through that machine learning model, in combination with visualisation techniques from the field of social network analysis (Bastian, Heymann, and Jacomy, 2009; Shannon, 2003; Wasserman and Faust, 1994), enables the drawing of tangible maps of actor-networks. The chapter concludes by returning to the general theme of this book, raising and discussing the issue of how and why theories and methods can, and must, be adapted and tweaked in ways that entail simplification while also promoting the emergence of new analytical opportunities. Theories, as well as methods, should be seen as open-source: free for all to share, alter, and transform.
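To make the two steps described for Chapter 4 more concrete, the following is a minimal sketch in Python, not the analysis used in the chapter. It makes several loud assumptions: a small hand-written actor list (`KNOWN_ACTORS`, hypothetical) stands in for a real Named Entity Recognition tagger, simple co-occurrence counting within a tweet stands in for the topic-modelling step, and the four example tweets are invented for illustration only.

```python
import re
from collections import Counter
from itertools import combinations

# ASSUMPTION: a hand-written actor list standing in for the output of a
# real Named Entity Recognition tagger. The names are illustrative only.
KNOWN_ACTORS = {"ipcc", "un", "greenpeace", "stockholm"}


def find_actors(tweet, actors=KNOWN_ACTORS):
    """Return the set of known actors mentioned in a tweet (case-insensitive)."""
    tokens = re.findall(r"\w+", tweet.lower())
    return {token for token in tokens if token in actors}


def cooccurrence_edges(tweets):
    """Count how often each pair of actors appears in the same tweet.

    ASSUMPTION: co-occurrence is a much simpler stand-in for the
    topic-modelling step (Latent Dirichlet Allocation) named in the text.
    """
    edges = Counter()
    for tweet in tweets:
        found = sorted(find_actors(tweet))
        for pair in combinations(found, 2):
            edges[pair] += 1
    return edges


# Invented toy tweets, not taken from the dataset discussed in the chapter.
tweets = [
    "The IPCC report was discussed at the UN",
    "Greenpeace responded to the IPCC report",
    "The UN and Greenpeace met in Stockholm",
    "UN delegates cite IPCC",
]

edges = cooccurrence_edges(tweets)
for (actor_a, actor_b), weight in sorted(edges.items()):
    print(actor_a, actor_b, weight)
```

The resulting weighted pairs form an edge list, which is the kind of input that network visualisation tools typically accept when drawing a map of actors and their associations.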
Chapter 5, Collective Representations, introduces early twentieth-century approaches to the sociology of knowledge to the age of the internet, and especially to the research context of current data science. The key argument in the classic, Durkheimian, approach is that language, conceptual thinking, and logic are shaped by the social contexts out of which they arise. The notion that the stereotypes, categorisations, and manners of speaking which exert great power over our reasoning and actions are social products has formed the basis for a series of other constructionist perspectives on society and culture over the years. The chapter discusses some modern developments in the sociology of knowledge, alongside social constructionism, and poststructural perspectives such as those of Laclau and Mouffe (1985), and Deleuze and Guattari (1987), where abstract theoretical notions such as discourse, rhizome, and assemblage are exploratively brought together with data science methods. The focus is particularly on text mining through machine learning, and specifically on word embedding models. The chapter aims to show how one can approach massively networked social settings online through big data techniques, much as a social anthropologist would, and draw on sociological theory in decoding their worldviews. The chapter includes an empirical case study of the forum website Reddit, based on a comprehensive dataset including more than 1.2 billion posts.
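The intuition behind word embedding models, that words used in similar contexts end up with similar vectors, can be sketched with a much simpler technique. The Python example below is an assumption-laden illustration, not the model used in the chapter: it builds count-based context vectors rather than a trained embedding, the window size of 2 is an arbitrary choice, and the five short ‘posts’ are invented.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions.

    ASSUMPTION: raw co-occurrence counts stand in for a trained word
    embedding model; real models learn dense vectors instead.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy posts, not taken from the Reddit dataset.
posts = [
    "cats chase mice",
    "dogs chase mice",
    "cats eat fish",
    "dogs eat meat",
    "cars need fuel",
]

vecs = context_vectors(posts)
sim_cats_dogs = cosine(vecs["cats"], vecs["dogs"])  # words sharing contexts
sim_cats_cars = cosine(vecs["cats"], vecs["cars"])  # words sharing none
print(sim_cats_dogs, sim_cats_cars)
```

In this toy corpus, ‘cats’ and ‘dogs’ share most of their neighbouring words and so score as similar, while ‘cats’ and ‘cars’ share none. Read sociologically, such similarities are traces of how a community habitually groups and categorises things in its language.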
The next section of the book, Chapter 6, Symbolic Power, works through an example of how a well-established social theory can be transformed and adapted to enable operationalisations that are fit for social media datasets. The case in focus is Pierre Bourdieu’s theory of social practice (Bourdieu, 1977, 1984, 1992), in which he argued that the social status of an individual is the result of how a variety of resources are converted in a multitude of relational social fields. In his general theory, Bourdieu imagined society as a multidimensional space, where the resources of the individual – consciously and unconsciously – become tools for achieving status to the degree that they are recognised as important by social others. He conceptualised the resources in terms of different forms of ‘symbolic capital’: economic capital, social capital, cultural capital. In spite of being an anthropologist rather than a mathematician, Bourdieu even summarised his grand theory in terms of an equation: [(habitus) (capital)] + field = practice. Despite these spatial and mathematical metaphors, large-scale empirical explorations and validations of his influential theory have faced serious practical and computational challenges. This chapter’s case example makes use of a dataset of 1.7 million tweets matching the main hashtag for the 2018 Swedish general election (#val2018). Approaching the question of how power and influence are constituted in political social media discourse, the analysis builds on a conscious and quite far-reaching modification of Bourdieu’s taxonomy of capital forms, in order to make them measurable through social media data.
The seventh chapter, Theoretical I/O, gets more hands-on in terms of how a more generic analytical framework that combines interpretive sociology with data science can be developed. I revisit sociological methodologist Barney Glaser’s (1978) writings on theoretical sensitivity, and argue that his vision for the research process can be translated into the age of data science. I present a model for a research process that alternates between data and computation on the one side, and theory and interpretation on the other. The chapter also includes a concrete example of how to apply the approach. This is in the form of a case study that uses Marxist critical theory, together with the empirical case of the #deletefacebook movement on Twitter, in the wake of the Cambridge Analytica scandal in 2018. The case is used to explore and illustrate how the outlined approach can be realised in empirical and analytical practice.
The book ends with a concluding section in which I summarise and discuss the data theory approach at an overarching level.
1 Beyond Method
In light of the developments towards a datafication of society, there is a need to reinvent and adapt our research approaches in order to make them more relevant and useful. This demands a creative and somewhat anarchistic approach to existing theories and methods.
Sociologist John Law argues, while acknowledging that conventional research methods are indeed useful in some cases, that there is an urgent need to ‘remake social science in ways better equipped to deal with mess, confusion and relative disorder’ (Law, 2004, p. 11). The need to go beyond methods as we know them is underpinned by the fact that social science is not very good at understanding ‘things that are complex, diffuse and messy’. This is because the simple and clear descriptions that most conventional research methods aim for ‘don’t work if what they are describing is not itself very coherent’ (Law, 2004, p. 2). Especially in light of the high level of complexity of twenty-first-century networked society, it is imperative that we develop more ambivalent methodologies to account for our increasingly ambivalent