Innovations in Digital Research Methods. Group of authors

in England (LSYPE),28 which links annual survey data to data from the School Census29 (as discussed below).

      Data or record linkage can be as simple as matching record numbers across multiple sources, but it can also be probability based. Probabilistic linkage joins records on similar characteristics rather than on unique identifiers. Computational statistical techniques are used to optimize the record matching rules and to weight the different variables in the matching process. Data preparation is a key stage of this research design. Account needs to be taken of missing data and data entry errors, and quality assurance procedures need to be put in place. For further discussion see Herzog et al. (2007).
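      The weighting of variables in probabilistic matching can be illustrated with a minimal sketch in the style of the Fellegi-Sunter model, a standard approach to this problem. The field names, the m- and u-probabilities and the thresholds below are illustrative assumptions, not values taken from any of the studies cited here; in practice they would be estimated from the data.

```python
import math

# Assumed, illustrative parameters for each linking variable:
# m = P(field agrees | the two records belong to the same person)
# u = P(field agrees | the two records belong to different people)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postcode": (0.90, 0.02),
}


def normalise(value):
    """Basic data preparation: trim, lowercase, treat blanks as missing."""
    if value is None:
        return None
    text = str(value).strip().lower()
    return text or None


def match_weight(record_a, record_b, params=FIELD_PARAMS):
    """Sum log-likelihood-ratio weights over the linking variables."""
    total = 0.0
    for field, (m, u) in params.items():
        a = normalise(record_a.get(field))
        b = normalise(record_b.get(field))
        if a is None or b is None:
            continue  # missing data contributes no evidence either way
        if a == b:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def classify(weight, upper=8.0, lower=0.0):
    """Hypothetical thresholds; the middle band goes to clerical review."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


survey = {"surname": " Smith", "birth_year": 1990, "postcode": "M13 9PL"}
census = {"surname": "smith", "birth_year": "1990", "postcode": None}
print(classify(match_weight(survey, census)))
```

      Variables that rarely agree by chance, such as surname, carry more weight than variables that often do, and a missing value is simply skipped rather than counted as a disagreement.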

      It is argued that data linkage can be cost saving and enable analyses to be conducted that would otherwise not be possible or would involve further primary data gathering. Best practices for linking data and the research and ethical issues raised are slowly being developed. A key aspect of this is the terms of use of the different data sources. Some surveys now ask for the respondent’s permission for the anonymous use of their responses for the purposes of linking with other datasets. Examples include the National Survey for Wales and the Scottish Longitudinal Study. The UK’s Economic and Social Research Council is presently reviewing the area of data access and linkage as part of its Administrative Data Task Force (see Boyle, 2012).30 The International Health Data Linkage Network is a useful information resource on linked data.31 For further discussion see Gill (2001), Herzog et al. (2007), Mason and Shihfen (2008) and Chapter 3 in this volume.

      2.2.4 Freedom of Information Requests for Social Research

      In the UK, legislation has also made public sector information increasingly available for transparency and accountability purposes and potentially for social science research. Under the Freedom of Information Act 2000 (FOI), requests for detailed records of what we term consequential data held by public bodies can be made. Unless there is good reason not to, the organization holding the data must provide the information within 20 working days.32 Accepted reasons for refusal include cost, whether the request is vexatious, and whether it would prejudice a criminal investigation. The legislation has been widely used to examine transparency in government. Thousands of requests have been made since the introduction of the act, including many in areas that social science research has a track record of examining, such as government decision-making and public spending. Access to this type of data has facilitated research breakthroughs in these areas including, notably: information on MPs’ expense claims, records of donations to political parties, extent of care home abuse allegations, detention of children in police cells, links between police forces and commercial companies, police work force demographics and gambling spending levels. However, as reported in Lee (2005), the majority of such requests are not for what might be considered standard social science research purposes. Nevertheless, some examples in the UK context include: local authority data on business cases for new schools (Khadaroo, 2008), Ministry of Defence medical data (Seal, 2006), Department of Health data on drug addiction policy (Mold and Berridge, 2007) and police force crime data (Hutchings et al., 2006). It is notable that as of 2013 new regulations relating to open data rights require data released under FOI requests to be prepared in reusable formats, and that the regulations also allow for the data to be used commercially.33

      2.2.5 Commercial Data Sources and Providers

      In parallel to these developments, commercial data companies are increasingly providing highly detailed, individual-level information products combining different types of data, including intentional, consequential, trace and synthetic data. The information can include such details as: name, address, full postcode, age, gender, income, occupation, number of children, household income, house type, tenure, education, consumption, length of residence, car ownership, insurance packages, ownership of ICT products, holidays, smoking, leisure activities and social attitudes34 (Purdam et al., 2004).

      Such data is compiled from different sources, including: surveys; warranty forms where citizens agree to the shared use of their details; public records; administrative records such as the Electoral Register and house sale information; and consumption records. Whilst some of this information may be considered personal, either it has already been in the public domain in some form or permission for its use has been given at origin (see Elliot et al., 2013).

      Where information is missing in these commercial data products, it is often imputed or simulated from other data subjects with similar characteristics. Attitude profiling is also used where demographic information is missing. This involves profiling individuals by combining responses to multiple attitudinal questions. These imputation processes can lead to questions of data accuracy. However, there is only limited research in this area. These commercial data sources are available with short lead times and techniques have been developed to combine different types of data often involving large numbers of individuals and variables. Individual record data matched to postcodes and/or individual names can be purchased at relatively low cost.
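      The idea of imputing a missing value from data subjects with similar characteristics can be sketched as follows. This is a simple donor-based illustration only: the function name, the field names and the use of the donor median are assumptions made for the example, and the methods actually used by commercial suppliers are not documented here.

```python
from statistics import median


def impute_from_similar(records, target, match_on):
    """Fill missing `target` values using the median among records that
    share the same values on every variable listed in `match_on`."""
    # Build donor pools keyed by the combination of matching variables.
    donors = {}
    for rec in records:
        if rec.get(target) is not None:
            key = tuple(rec.get(field) for field in match_on)
            donors.setdefault(key, []).append(rec[target])

    result = []
    for rec in records:
        filled = dict(rec)          # leave the input records untouched
        filled["imputed"] = False   # flag so analysts can audit accuracy
        if filled.get(target) is None:
            key = tuple(rec.get(field) for field in match_on)
            if key in donors:
                filled[target] = median(donors[key])
                filled["imputed"] = True
        result.append(filled)
    return result


# Hypothetical example data
people = [
    {"age_band": "30-39", "tenure": "owner", "income": 30000},
    {"age_band": "30-39", "tenure": "owner", "income": 34000},
    {"age_band": "30-39", "tenure": "owner", "income": None},
    {"age_band": "60-69", "tenure": "renter", "income": None},
]
for row in impute_from_similar(people, "income", ["age_band", "tenure"]):
    print(row)
```

      Flagging each imputed value keeps observed and simulated information distinguishable, which matters given the accuracy questions noted above. A record with no similar donor is left missing rather than filled with a guess.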

      Commercial data product suppliers are also now providing access to social media data including Twitter posts and analysis of such sources as YouTube, long-form blogs and Facebook, as well as data from online game playing. It is to these types of data we now turn, including what we consider consequential, self-published and trace data.

      2.2.6 New Data Types and Approaches

      Social media is increasingly seen as an invaluable source of data for social research. Research use of social media data might involve the textual analysis of micro-blogs such as Twitter to code for attitudes (see Chapter 8), explore networks (see Chapter 10) and contextual patterns, and to track movements using user name, time of posting, geography and network links (see Chapter 9). An example of this, which we consider in more detail in Chapter 3, was the collection and coding of 2.6 million Twitter postings during the civil disturbances in England in 2011. The data was used to examine patterns of communication during the riots and to carry out a content analysis of rumour patterns (see Lewis et al., 2011; Procter et al., 2013a; 2013b). A second example of social media data use involves a study of videos posted on YouTube in response to the release of a film criticizing Islam. The analysis involved examining the nature and content of comments and the links to other uploaded videos and postings to map the scale of the protests and the nature of the dialogue (Van Zoonen et al., 2011). Other studies have examined the nature of political protest by examining online postings and discussions; see, for example, Bowman-Grieve and Conway’s (2012) research into dissident Irish Republicanism.

      Social media data can also be used in a more exploratory way for social science research. For example, purposive sampling techniques can support the development of research ideas and the testing of concepts. A recent example of such an approach, entitled The Everyday Sexism Project, involved the development of a website where the public were invited to report experiences of sexism (Bates, 2014).35 The data has strengths and weaknesses. It provides an insight into, and examples of, reported sexual harassment. However, the sample is limited. There is no verification
