Administrative Records for Survey Methodology. Group of authors


or update the Business Register. As a noteworthy special case, one may include here population size estimation based on Census and Census Coverage Surveys (Nirel and Glickman 2009).

      Survey evaluation covers the use of register data for checking, validating, or assessing survey data, whether collected in a sample survey or a census. This may be done at both individual and aggregate levels. Conversely, using survey estimates for external validation of register-based statistics has been a natural approach from early on (Myrskyla 1991). A quality survey in a census year is another common approach in Scandinavia (Axelson et al. 2020); it is usually directed not at the population coverage errors of the Central Population Register in those countries, but at the various classification and measurement errors in the register data. Or, as mentioned above, survey data are commonly used implicitly to define the processing rules or to assess the accuracy of the register data.

      In summary, one can speak of a multisource data perspective for combining register and survey data on at least two different levels. In the wider sense, the uses of both register and survey data can be characterized equally in four broad categories: (i) single-source estimation, (ii) multisource estimation, (iii) frames, and (iv) evaluation. Both can be treated as statistical data and used as such. In a narrower sense, one can greatly extend the scope of “indirect estimation” under the multisource data perspective, where register and survey data may each comprise part of the inputs on an equal footing, provided the proxy variables are present. Indirect estimation will be discussed in more detail in Section 1.3. But first we shall explain below what we mean by proxy variables.

      1.1.2 Concept of Proxy Variable

      Zhang (2015a) defines a proxy variable as one that is similar in definition and has the same support as the target variable. It follows that one can regard two variables as proxy to each other, without having to specify one of them to be the target (or ideal) measure. Variables such as age, sex, education, and income can be useful auxiliary variables, but they are not proxy variables for the binary International Labour Organization (ILO) unemployment status. In particular, sex is not a proxy despite being binary and thus having the same support as the unemployment status, because the two do not have similar definitions. The binary register-based job-seeker status is a proxy, and the ILO unemployment status does not have to be the ideal measure for every conceivable purpose. But the job-seeker status is not a proxy variable for the activity status defined as (employed, unemployed, and inactive) because the two have different support.
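      The support condition in this definition can be checked mechanically, whereas similarity of definition remains a subject-matter judgment. The following is a minimal sketch of the support check only; the variable names and the 0/1 codings are hypothetical and chosen purely for illustration.

```python
# Hypothetical declared supports (sets of admissible values) for the
# variables discussed in the text. The codings are illustrative assumptions.
SUPPORT = {
    "ilo_unemployed": {0, 1},
    "register_job_seeker": {0, 1},
    "sex": {0, 1},
    "activity_status": {"employed", "unemployed", "inactive"},
}


def same_support(var_a: str, var_b: str) -> bool:
    """Return True if the two variables take values in the same set.

    This covers only the support half of the proxy definition; whether
    two variables are similar in definition cannot be decided by code.
    """
    return SUPPORT[var_a] == SUPPORT[var_b]


# Same support and similar definition: a proxy candidate.
job_seeker_ok = same_support("register_job_seeker", "ilo_unemployed")

# Same support, but the definitions are unrelated, so sex is not a proxy.
sex_ok = same_support("sex", "ilo_unemployed")

# Different support: job-seeker status cannot be a proxy for activity status.
activity_ok = same_support("register_job_seeker", "activity_status")
```

      Here `sex` passes the support check just as the job-seeker status does, which illustrates why equal support is necessary but not sufficient for a variable to count as a proxy.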

      Proxy variables can arise from survey data. For example, indirect interview yields proxy measures (Thomsen and Villund 2011), where household members respond on behalf of the absentees. Data collected in different modes can be proxy to each other. A variable collected in a census can be proxy to the same variable or a similarly defined one in the postcensal years. Synthetic datasets released for research can contain proxy variables for the target measures, based on which the synthetic ones are modeled and generated. Register data are perhaps the richest source of proxy variables. With register data it is often possible to have both complete coverage and concurrency, or nearly so. Common examples of register proxy variables include economic activity status, education level, income, family and housing condition, etc. in social statistics, and value-added tax (VAT) based turnover, export and import, house price, animal holding, fishing and hunting figures, arable soils, vegetation, etc. in economic and environmental statistics.

      1.2.1 Representation

      We start with the Representation side in Figure 1.1, which concerns the target population and units. Let us consider coverage error first. For instance, one may have a Population Register that is not sufficiently accurate to allow for direct tabulation of census-like population counts at detailed aggregation levels, so that Population Coverage Surveys are carried out in order to obtain the desired population estimates. The Population Register and Coverage Survey enumerations are proxies of the true population enumeration. This is the situation in Switzerland 2000 (Renaud 2007) and Israel 2008 (Nirel and Glickman 2009). Other instances may involve one or several register enumerations, Census enumeration and Census Coverage Survey enumeration. Capture–recapture methodology is a commonly used estimation approach that combines two or more proxy enumerations subject to under-counts (Fienberg 1972; Wolter 1986; Hogan 1993). Adjustment of erroneous over-counts has attracted increasing attention recently, in situations where one does not have a Population Register and over-coverage errors are found to be large in the available register enumerations (ONS 2013). See, e.g., Zhang (2015b) for an extension of the capture–recapture modeling approach, Zhang and Dunne (2017) for trimmed dual-system estimation, and Di Cecco et al. (2018) for a latent class modeling approach.
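      As an illustration of how two under-counting enumerations can be combined, the following is a minimal sketch of the classical dual-system (Lincoln-Petersen) estimator, which is the simplest capture–recapture estimator and not necessarily the specification used in any of the works cited above. It assumes independent enumerations, perfect linkage, and no over-coverage; the identifiers and counts are invented for the example.

```python
def dual_system_estimate(n_first: int, n_second: int, n_matched: int) -> float:
    """Classical dual-system estimate of the population size.

    n_first   : units counted in the first enumeration (e.g. a register)
    n_second  : units counted in the second enumeration (e.g. a coverage survey)
    n_matched : units found in both enumerations after linkage
    """
    if n_matched <= 0:
        raise ValueError("at least one matched unit is needed")
    if n_matched > min(n_first, n_second):
        raise ValueError("matches cannot exceed either enumeration")
    return n_first * n_second / n_matched


# Hypothetical linked identifiers; both lists miss some of the population.
register_ids = {"a", "b", "c", "d", "e", "f"}
survey_ids = {"d", "e", "f", "g"}
matched_ids = register_ids & survey_ids  # units captured by both sources

estimate = dual_system_estimate(
    len(register_ids), len(survey_ids), len(matched_ids)
)
```

      With 6 register units, 4 survey units, and 3 matches, the estimate is 6 × 4 / 3 = 8, which exceeds the 7 distinct units observed across the two lists. When over-coverage is present, as discussed above, this simple estimator is biased upward, which is what motivates the extensions cited in the text.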
