Data Theory. Simon Lindgren
data piñata: Big Data method that consists of whacking data with a stick and hopefully some insights will come out. [Example:] The Big Data Scientist made a Twitter data piñata and found that Saturdays are the weekdays with the most tweets linking to kitty pictures.
(Urban Dictionary, 2018)
Such strategies may be seen by some as unscientific, as they do not rely on actual questions about real problems, but on patterns that one stumbles across more or less randomly. Indeed, in the type of research that deals with solicited data, intently collected for certain research purposes, a data piñata approach would be odd. Why should we collect some random data, just to beat it with a stick to see what pops out? And, what type of data should that be? What methods or informants should be engaged, and how? In the case of register-based or database research, a piñata strategy might be closer at hand. And this is most definitely true in the case of the types of data that are enabled by people’s use of the internet and social media.
Census and survey researcher Kingsley Purdam and his data scientist colleague Mark Elliot aptly point out that data today is, to a lesser and lesser degree, ‘something we have’. Rather, ‘the reality and scale of the data transformation is that data is now something we are becoming immersed and embedded in’ (Purdam and Elliot, 2015, p. 26). Their notion of a data environment underlines that people today are at once generators of, and generated by, this new environment. ‘Instead of people being researched’, Purdam and Elliot (2015, p. 26) write, ‘they are the research’. Their point is that new data types have emerged – and are constantly emerging – that demand new flexible approaches. Doing digital social research, therefore, often entails discovering and experimenting with challenges and possibilities of ever-new types and combinations of information. Among these are not only social media data, but also data traces that are left, often unknowingly, through digital encounters. Manovich gives an explanation that is so to the point that it is worth citing at length:
In the twentieth century, the study of the social and the cultural relied on two types of data: ‘surface data’ about lots of people and ‘deep data’ about the few individuals or small groups. The first approach was used in all disciplines that adapted quantitative methods. The relevant fields include quantitative schools of sociology, economics, political science, communication studies, and marketing research. The second approach was used in humanities fields such as literary studies, art history, film studies, and history. It was also used in qualitative schools in psychology, sociology, anthropology, and ethnography. […] In between these two methodologies of surface data and deep data were statistics and the concept of sampling. By carefully choosing her sample, a researcher could expand certain types of data about the few into the knowledge about the many. […] The rise of social media, along with new computational tools that can process massive amounts of data, makes possible a fundamentally new approach to the study of human beings and society. We no longer have to choose between data size and data depth.
(Manovich, 2012, pp. 461–3)
Going back to 1978 and Glaser’s book on Theoretical Sensitivity, we can find some useful pointers on how to see the research process – beyond ‘quantitative’ and ‘qualitative’. The first step, for Glaser (1978, p. 3), is ‘to enter the research setting with as few predetermined ideas as possible’, to ‘remain open to what is actually happening’. The goal is then to alternate between having an open mind – working inductively, allowing an understanding of the research object to emerge gradually – and testing the emerging ideas as one goes along – working deductively, trying to verify or falsify the developing interpretations. So, we can, quite mindlessly, beat on the piñata for a little while to see what jumps out. Then we try to make sense of the things that emerged, and then beat some more to see what the new stuff that is popping out adds to or removes from our present analysis.
Using Glaser’s approach, then, means being truly data-driven. He argues that the overarching question that must continually be posed in any research is: ‘What is this data a study of?’ (Glaser, 1978, p. 57). Most of the time, research projects start off with a clear idea of what to study. It would not make sense to be completely oblivious as to the aims of one’s work. But still, Glaser argues, constantly repeating and renewing the question of what the data is actually about allows other ideas or findings either to take their place alongside the initially intended ones, or even to replace them completely. The point of the question is that it ‘continually reminds the researcher that his original intents on what he thought he was going to study just might not be; and in our experience it usually is not’ (Glaser, 1978, p. 57). The other important question for empirical research is: ‘What is actually happening in the data?’ The flexibility and inductiveness of the approach are meant to get at ‘what is actually going on’ in the area that is studied (Glaser and Strauss, 1967, p. 239).
My point here is that being data-driven, as is often the case when working with big data, is not (only) a new ill, caused by the datafication of society and the fascination with huge datasets. Used in the right way, a data-driven approach – a data piñata – can be truly useful in getting to know more about what goes on, what social and cultural processes may be at work, in contexts and behaviours that are still largely unknown to us. From that perspective, not really knowing what we are looking for, and why, can be a means to tread new ground, veering off the well-trodden paths, to get lost to find our way. If we don’t even know what is going on, maybe beating that piñata with a stick isn’t such a bad idea? The new data science opportunities and tools, in combination with social theory, have huge potential to help decode the deeper meanings of society and sociality today.
Breaking things to move forward
Finding good solutions – rather than adhering to rules – should be the end goal of any analytical strategy. This draws on Feyerabend’s idea that anarchism in science, rather than ‘law-and-order science’, is what will help achieve progress. And, as for the risk that such an approach will lead to an unproductive situation where anything goes, we must simply trust in our own ability to think in structured ways even without following rigid rules dogmatically:
There is no need to fear that the diminished concern for law and order in science and society that characterizes an anarchism of this kind will lead to chaos. The human nervous system is too well organized for that.
(Feyerabend, 1975, p. 13)
In order to think creatively and freely in relation to existing approaches, we must allow ourselves not to think so much about which theoretical perspectives have been conventionally agreed to be compatible with one another, or about whether it is officially correct to mix certain methods together or not. In that sense, the approach that I am proposing can be metaphorically understood as a form of hacking. Because, in spite of its popular reputation to the contrary, hacking is not (only) about breaking the law through forms of electronic vandalism. As argued by cryptologist Jon Erickson (2008), hacking can in fact even be more about adhering to rules than about breaking them. The goal of hacking is to come up with ways of using, or exploiting, the structures and resources that are in operation in any given situation in ways that may be overlooked or unintended. Hacking is about applying existing tools in smart and innovative ways to solve problems. Erickson writes that:
hacked solutions follow the rules of the system, but they use those rules in counterintuitive ways. This gives hackers their edge, allowing them to solve problems in ways unimaginable for those confined to conventional thinking and methodologies.
(Erickson, 2008, p. 16)
Datafication presents us with a new data environment – with data traces, data fragments, and unsolicited data – that offers the opportunity to think in new ways about research in the ‘spirit of hacking’, aiming to surmount ‘conventional boundaries and restrictions’ for the goal of ‘better understanding