Administrative Records for Survey Methodology. Group of authors
2.5 Conclusions
The goal of this chapter has been to illustrate how confidentiality protection methods can be and have been applied to linked administrative data. Our examples provide a guide to best practices for data custodians endeavoring to walk the fine line between making data accessible and protecting individual privacy and confidentiality. Our examples also illustrate different paradigms of protection, ranging from the more traditional approach of physical security to more modern formal privacy systems and the provision of synthetic data.
In concluding, we note that from a theoretical perspective, there does not appear to be a clear distinction between the threats to confidentiality in linked data relative to unlinked data, or in survey data relative to administrative data. Richly detailed data pose disclosure risks, irrespective of whether that richness is inherent in the data design or comes from linking variables from multiple sources. Likewise, there are no special methods to protect confidentiality in linked versus unlinked data. Any data with a network, relational, panel, or hierarchical structure pose special challenges for data providers who must protect confidentiality while preserving analytical validity. Our example of the QWI shows one way this challenge has been successfully managed in a linked data setting, but the same tools could be effective when applied to the Quarterly Census of Employment and Wages (QCEW), which uses the same frame but does not involve worker-firm linkages.
From a legal perspective, however, linking two datasets can change the nature of confidentiality protection in practical ways. Any output must conform to the strongest privacy protections required by any of the linked datasets. For example, when the LEHD program links SSA data on individuals to IRS data on firms, any downstream research must comply with the confidentiality demands of all three agencies (SSA, IRS, and the Census Bureau). Likewise, the data must conform to the U.S. Census Bureau publication thresholds for data involving individuals and firms. Hence, linking data can produce a maze of confidentiality requirements that are difficult to articulate, comply with, and monitor. Harmonizing or standardizing such requirements and practices across data providers, both public and private, and across jurisdictions would be helpful. Privacy and confidentiality issues also invite updated and continuing research on the demand for privacy from citizens and businesses, as well as the social benefit that arises from the dissemination of data.
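The rule that linked output must satisfy the strictest requirement among its sources can be sketched in a few lines. This is only an illustration: the `SourceRule` type, the source names, and the threshold values are hypothetical and do not reflect any agency's actual publication rules. It assumes each source imposes a single minimum cell count, whereas real requirements are usually more varied.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SourceRule:
    """Hypothetical disclosure rule attached to one linked source."""
    source: str
    min_cell_count: int  # smallest cell size this source allows to be published


def effective_min_cell_count(rules: Sequence[SourceRule]) -> int:
    """Linked output must satisfy every source, so the largest minimum applies."""
    if not rules:
        raise ValueError("at least one source rule is required")
    return max(rule.min_cell_count for rule in rules)


def publishable(cell_count: int, rules: Sequence[SourceRule]) -> bool:
    """A cell may be released only if it meets the strictest threshold."""
    return cell_count >= effective_min_cell_count(rules)


# Illustrative numbers only; these are not actual agency thresholds.
rules = [
    SourceRule("source_a", 3),
    SourceRule("source_b", 10),
    SourceRule("source_c", 5),
]

print(effective_min_cell_count(rules))
print(publishable(9, rules), publishable(10, rules))
```

Taking the maximum over per-source minimums is the simplest way to express "strongest protection wins". Requirements that differ in kind rather than degree would have to be checked one by one instead.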
2.A Appendix: Technical Terms and Acronyms
ACS – American Community Survey, a large survey conducted continuously by the U.S. Census Bureau, on topics such as jobs and occupations, educational attainment, veterans, housing characteristics, and several other topics (https://www.census.gov/programs-surveys/acs/)
BDS – Business Dynamics Statistics, produced by the U.S. Census Bureau, see https://www.census.gov/programs-surveys/bds.html for more details.
CBP – County Business Patterns, produced by the U.S. Census Bureau, see www.census.gov/programs-surveys/cbp.html for more details.
COEP – Canadian Out-of-Employment Panel, a survey initially conducted by McMaster University in Canada, subsequently taken over by Statistics Canada (Browning, Jones, and Kuhn 1995)
COMPUSTAT – a commercial database maintained by Standard and Poor’s, with information on companies in the United States and around the world (http://www.compustat.com/).
HRS – Health and Retirement Study, a long-running survey on aging in the United States population, run by the Institute for Social Research at the University of Michigan (http://hrsonline.isr.umich.edu/)
LEHD – Longitudinal Employer-Household Dynamics Program at the U.S. Census Bureau, which links data provided by 51 state administrations to data from federal agencies and surveys (https://lehd.ces.census.gov/)
LODES – LEHD Origin-Destination Employment Statistics describe the geographic distribution of jobs according to the place of employment and the place of worker residence, in part through the flagship webapp OnTheMap (https://onthemap.ces.census.gov/)
QWI – Quarterly Workforce Indicators, a set of local statistics of employment and earnings, produced by the Census Bureau’s LEHD program (https://lehd.ces.census.gov/data/)
SIPP – Survey of Income and Program Participation is conducted by the U.S. Census Bureau on topics such as economic well-being, health insurance, and food security (https://www.census.gov/sipp/).
SSB – the SIPP Synthetic Beta File, also known as “SIPP/SSA/IRS Public Use File”
2.A.1 Other Abbreviations
ABS – Australian Bureau of Statistics, the Australian NSO (http://abs.gov.au/)
AEA – American Economic Association (https://www.aeaweb.org)
ASA – American Statistical Association (https://www.amstat.org)
BLS – Bureau of Labor Statistics, the NSO in the United States providing data on “labor market activity, working conditions, and price changes in the economy.” (https://bls.gov)
CASD – Centre d’accès sécurisé distant aux données, the French remote access system to most administrative data files (https://casd.eu)
Census Bureau – the largest statistical agency in the United States (https://census.gov)
CMS – Centers for Medicare & Medicaid Services, which administers US government health programs such as Medicare, Medicaid, and others (https://cms.gov/)
EIA – Energy Information Administration, collecting and disseminating information on energy generation and consumption in the United States (https://eia.gov).
FICA – Federal Insurance Contributions Act, the law regulating the system of social security benefits in the United States
IAB – Institute for Employment Research at the German Ministry of Labor (http://iab.de/en/iab-aktuell.aspx)
FSRDC