Innovations in Digital Research Methods. Various authors


      Peter Halfpenny

      Rob Procter

      1.1 Introduction

      The dramatic increase over the last two decades or so in computing power, in wired and wireless connectivity, and in the availability of data has affected all aspects of our lives. Our aim in this book is to provide an accessible introduction to how social science researchers are harnessing innovations in digital technologies to transform their research methods. In this chapter we provide an overview of how and why e-Research methods have emerged, including an account of the drivers that have motivated their development and the barriers to their successful adoption. The chapters that follow examine how innovations in digital technologies are enabling the emergence of more powerful research infrastructure, services and tools, and how social science researchers are exploiting them.

      1.1.1 Digital Data

      As everyone exposed to the Internet is aware, the amount of digital data available is expanding very rapidly, both through the digitization of past records and by the accretion of ‘born digital’ materials that are in machine-readable form from the outset. The digital universe – the data we create and copy annually – is estimated to be doubling in size every two years and projected to reach 44 trillion gigabytes by 2020 (where a trillion is a million million, or 10^12) (IDC, 2014). For social scientists, the prediction that more data will be generated in the next five years than in the entire history of human endeavour is both an opportunity and a challenge.
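The scale of these figures can be made concrete with a little arithmetic. The sketch below is illustrative only: it converts 44 trillion gigabytes into zettabytes and back-projects earlier volumes, assuming strict doubling every two years, which simplifies the estimate quoted above. The function and constant names are hypothetical and do not come from IDC.

```python
# Illustrative arithmetic only. Names are hypothetical, and strict
# doubling every two years is a simplifying assumption.

GIGABYTE = 10**9     # bytes, decimal (SI) convention
TRILLION = 10**12    # a million million
ZETTABYTE = 10**21   # bytes


def projected_size_gb(size_gb, from_year, to_year, doubling_years=2.0):
    """Scale a data volume between two years under steady doubling."""
    return size_gb * 2 ** ((to_year - from_year) / doubling_years)


# The 2020 projection quoted in the text, in gigabytes and in zettabytes.
size_2020_gb = 44 * TRILLION
size_2020_zb = size_2020_gb * GIGABYTE / ZETTABYTE

# Work backwards from 2020 in two-year steps.
for year in (2014, 2016, 2018, 2020):
    gb = projected_size_gb(size_2020_gb, 2020, year)
    print(year, f"{gb / TRILLION:.1f} trillion GB")
```

On these assumptions, 44 trillion gigabytes is 44 zettabytes, and each two-year step back halves the volume.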

      Today, vast amounts of data are generated as people go about their daily activities, both data that is deliberately produced and that which is generated by embedded systems. For example, use of public services is captured in administrative records; in the private sector, patterns of consumption of goods and services are captured in credit and debit card records; patterns of personal communications are captured in telephone records; patterns of movement are logged by sensors, such as traffic cameras, satellites and mobile phones; the movement of goods is increasingly tracked by devices such as radio-frequency identification (RFID) tags; and the advent of the ‘Social Web’ has led to an explosion of citizen-generated content in blogs and on social networking sites.

      Currently, these data sources are barely exploited for social research purposes. The potential benefits to researchers are enormous, offering opportunities to mount multidisciplinary investigations into major social and scientific issues on a hitherto unrealizable scale by marshalling artificially produced and naturally occurring ‘big data’ of multiple kinds from multiple sources. However, exploiting these digital data sources to their full research potential requires new mechanisms for ensuring secure and confidential access to sensitive data, and new analysis tools for mining, integrating, structuring and visualizing data from multiple sources.

      1.1.2 e-Infrastructure

      Since the beginning of the new millennium, a world-wide effort has been underway to create the research infrastructure and to develop the research methods that will be needed if the ‘data deluge’ is to be harnessed effectively for research. A new generation of distributed digital technologies is leading to the development of interoperable, scalable computational tools and services that increasingly make it possible for researchers to locate, access, share, aggregate, manipulate and visualize digital data seamlessly across the Internet on a scale that was unthinkable only a decade or so ago.

      e-Infrastructure comprises the information and communication technologies (ICTs) – the networked computing hardware and software – and the digital data that are deployed to support research. A very broad definition has been adopted by Research Councils UK (2014), which spells out more fully the components that are brought together:

      e-Infrastructure refers to a combination and interworking of digitally-based technology (hardware and software), resources (data, services, digital libraries), communications (protocols, access rights and networks), and the people and organisational structures needed to support modern, internationally leading collaborative research be it in the arts and humanities or the sciences.

      This definition highlights the complexity of e-Infrastructure and, correspondingly, the enormity of the socio-technical efforts required to efficiently integrate distributed computers, data, people and organizations in order to deliver tools and services that scientists can readily adopt to their advantage in pursuing their research. (In the US, the term cyberinfrastructure is more commonly used than e-Infrastructure.)

      e-Research is the generic term that has been coined for the innovations in research methods that are emerging to take advantage of this new and vastly more powerful e-Infrastructure. Similarly, e-Social Science is the research facilitated by the e-Infrastructure. The ‘e’ in all these terms is short for ‘electronic’, although it is sometimes rendered as ‘enhanced’.

      The scope of the book is the application of e-Research methods across the social sciences, including both quantitative and qualitative data collection and analysis. The aim is to introduce the reader to the application of innovative digital research methods throughout the research lifecycle, from resource discovery, through the collection, manipulation and analysis of data, to the presentation and publication of results.

      1.2 Background

      1.2.1 e-Science

      Over the period 2001 to 2006, the UK Government invested £213m in an e-Science programme (Hey and Trefethen, 2004). The overall aim of the programme was to invent and apply computer-enabled methods to ‘facilitate distributed global collaborations over the Internet, and the sharing of very large data collections, terascale computing resources and high performance visualizations’.[1] The funding was divided between a ‘core programme’, focused on developing the generic technologies needed to integrate different resources seamlessly across computer networks, and individual Research Council programmes specific to the disciplines they support. The Economic and Social Research Council (ESRC) allocation was £13.6m over the five years, with the major part of this investment devoted to setting up the National Centre for e-Social Science (NCeSS). The Centre had a distributed structure, with a coordinating Hub responsible for designing and managing the programme and eleven large three-year projects devoted to developing innovative tools and services and applying them in substantive fields of inquiry.

      The ambition of the overall e-Science programme was to promote the adoption of innovations in digital infrastructure to facilitate bigger and faster science, with collaborators worldwide addressing major research questions in new ways. The initial technical focus was grid computing, driven by a set of ‘middleware’ standards. These are the shared protocols required for the development of sophisticated software to enable large numbers of distributed and heterogeneous computer systems to be linked and inter-operate, thereby providing researchers with seamless, on-demand access to scalable processing power to handle very large-scale datasets, regardless of the location of the researchers or the data. This model of e-Infrastructure was particularly appropriate to particle physics and such challenges as weather prediction and earthquake modelling. Advances in these areas are dependent on collecting and marshalling data on a vast scale and having huge computing resources to analyse it, accessible by large networks of research teams distributed across the world.

      However, the grid computing blueprint for e-Infrastructure proved slow to mature and sometimes difficult to deploy in practice, and it did not always offer the most appropriate solutions to scientists’ requirements. Meanwhile, other technologies emerged and alternative solutions to the demand for scalable computing and data storage, such as cloud computing, became available. Alongside this was the flowering of the
