Spatial Regression Models for the Social Sciences. Jun Zhu

Advanced Quantitative Techniques in the Social Sciences

models, and explains spatial data and enables us to make inferences and predictions.

      Geographic analysis (also called geographic information analysis or spatial information analysis) examines the locations, attributes, and feature relations of spatial data, and it extracts or creates new information from those data (O’Sullivan & Unwin, 2010). Examples include spatial overlay, spatial interpolation, network analysis, three-dimensional analysis, geocoding, and terrain analysis, typically carried out in GIS software (e.g., ESRI products) that works with GIS data and remote sensing images. Geographic analysis methods and tools have been developed mostly by geographers, geologists, and environmental scientists and have been used by researchers from a wide range of fields.

      Geographic analysis extracts or creates new information from spatial data and examines spatial data locations, attributes, and feature relations.

      1.2.2 Four Types of Spatial Data Analysis

      There are four types of spatial data analysis as categorized in existing spatial statistics and spatial econometrics literature (see, e.g., Bailey & Gatrell, 1995; Cressie, 1993; Schabenberger & Gotway, 2005; Waller & Gotway, 2004):

       spatial point pattern analysis,

       areal data analysis,

       geostatistics, and

       spatial interaction data analysis.

      Each of these types of analysis has its own set of objectives and approaches.

      Spatial point patterns (or spatial point processes) consist of the locations of events occurring in a spatial domain of interest (see, e.g., Baddeley, Rubak, & Turner, 2015; Cressie, 1993; Møller & Waagepetersen, 2003). A common goal of spatial point pattern analysis is to detect or quantify spatial patterns, either as regularity or clustering (that is, as deviations from randomness) or in relation to covariates. For example, disease mapping, for which the data often consist of spatially referenced locations of disease occurrences, focuses on describing and analyzing geographic variation in a disease (such as randomness, regularity, and clustering) and seeks explanations in demographic, environmental, behavioral, socioeconomic, genetic, and infectious risk factors (Elliott & Wartenberg, 2004; Waller & Gotway, 2004).

      Spatial point patterns consist of event locations in a spatial domain of interest.
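
      As a rough illustration of quantifying clustering versus regularity, the sketch below computes a nearest-neighbor ratio (the Clark-Evans ratio, which the text above does not name and which is used here only as one simple summary). Values below 1 suggest clustering, values above 1 suggest regularity, and values near 1 are consistent with randomness. The point coordinates and the 10 x 10 study region are hypothetical, and no edge correction is applied.

```python
import math

def nearest_neighbor_ratio(points, area):
    """Clark-Evans ratio: observed mean nearest-neighbor distance divided by
    the mean expected under complete spatial randomness (no edge correction)."""
    n = len(points)
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    # Expected mean nearest-neighbor distance for intensity n / area.
    expected = 1.0 / (2.0 * math.sqrt(n / area))
    return observed / expected

# Hypothetical event locations in a 10 x 10 study region.
clustered = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (8.0, 8.0), (8.1, 8.2), (7.9, 7.8)]
regular = [(x + 0.5, y + 0.5) for x in range(0, 10, 2) for y in range(0, 10, 2)]

r_clustered = nearest_neighbor_ratio(clustered, area=100.0)
r_regular = nearest_neighbor_ratio(regular, area=100.0)
print(f"clustered: {r_clustered:.2f}, regular: {r_regular:.2f}")
```

      In practice, a formal analysis would also account for edge effects and sampling variability, which this sketch ignores.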

      Areal data refer to spatial data observed over regular grid cells (or pixels) as seen in remotely sensed data or spatial data aggregated to irregular areal regions such as counties and census tracts; such data are often referred to as lattice data and are sometimes referred to as regional data (see, e.g., Schabenberger & Gotway, 2005; Waller & Gotway, 2004). Areal data analysis aims to quantify the spatial pattern of an attribute on a spatial lattice or region (regular or irregular) through a specific neighborhood structure and examines the relations between the attribute and the potential explanatory variables while accounting for spatial effects. Spatial regression modeling is a common approach used in areal data analysis. For the purposes of this book, the term areal data analysis is used.

      Areal data are spatial data observed over regular grid cells or aggregated to irregular areal regions.
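
      To make the idea of a neighborhood structure concrete, the following minimal sketch builds rook (shared-edge) neighbors on a regular grid and computes Moran's I with binary weights. Moran's I is not introduced in this passage; it is used here as one widely used summary of spatial pattern in areal data. The 4 x 4 grid and both sets of attribute values are hypothetical.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each grid cell (r, c) to the cells that share an edge with it."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            neighbors[(r, c)] = [
                (i, j) for i, j in candidates if 0 <= i < n_rows and 0 <= j < n_cols
            ]
    return neighbors

def morans_i(values, neighbors):
    """Moran's I with binary weights: w_ij = 1 if j is a neighbor of i, else 0."""
    n = len(values)
    mean = sum(values.values()) / n
    dev = {cell: v - mean for cell, v in values.items()}
    # Sum of cross-products of deviations over all neighboring pairs.
    cross = sum(dev[i] * dev[j] for i in neighbors for j in neighbors[i])
    total_weight = sum(len(nbrs) for nbrs in neighbors.values())
    return (n / total_weight) * cross / sum(d * d for d in dev.values())

nbrs = rook_neighbors(4, 4)
# Hypothetical attributes: a smooth west-to-east gradient and a checkerboard.
gradient = {(r, c): float(c) for r in range(4) for c in range(4)}
checker = {(r, c): float((r + c) % 2) for r in range(4) for c in range(4)}

i_gradient = morans_i(gradient, nbrs)
i_checker = morans_i(checker, nbrs)
print(f"gradient: {i_gradient:.3f}, checkerboard: {i_checker:.3f}")
```

      The gradient gives a positive value (similar values sit next to each other), while the checkerboard gives a strongly negative one (every neighbor differs). Irregular regions such as counties would need neighbors derived from their boundaries rather than from grid indices.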

      Geostatistical data refer to spatial data sampled at point locations within a spatially continuous domain. The objectives of geostatistics are similar to those of areal data analysis, but geostatistics also aims to predict attribute values at locations that were not sampled (see, e.g., Cressie, 1993; Goodchild, 1992; Stein, 1999). Geostatistics is common in geology, soil science, and forest resource management research. For example, petroleum geologists use geostatistical methods to estimate hydrocarbon fuel distribution from a small number of hydrocarbon samples taken at known locations.

      Geostatistical data are spatial data sampled at point locations in continuous space.

      Two key differences distinguish geostatistics from areal data analysis:

       geostatistical data are geographically referenced to specific point locations while areal data are geographically referenced to areal regions, and

       geostatistics generally measures spatial dependence by distance-based functions while areal data analysis often uses neighborhood structures.
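
      The second difference can be illustrated with an empirical semivariogram, a standard distance-based function in geostatistics (the text above does not name a specific function, so this choice is an assumption made for illustration). The sketch groups sample pairs by separation distance and, for each group, takes half the mean squared difference in values. The transect samples are hypothetical.

```python
import math
from collections import defaultdict

def empirical_semivariogram(samples, bin_width):
    """Return {lag bin center: semivariance} from (x, y, value) samples.

    Each pair of samples is assigned to the lag bin whose center is nearest
    to the pair's separation distance; the semivariance of a bin is half the
    mean squared difference between the values of its pairs.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            xi, yi, zi = samples[i]
            xj, yj, zj = samples[j]
            lag_bin = round(math.hypot(xi - xj, yi - yj) / bin_width)
            sums[lag_bin] += (zi - zj) ** 2
            counts[lag_bin] += 1
    return {b * bin_width: sums[b] / (2 * counts[b]) for b in sorted(counts)}

# Hypothetical samples along a transect; values rise steadily with x.
samples = [(float(x), 0.0, 2.0 * x) for x in range(6)]
variogram = empirical_semivariogram(samples, bin_width=1.0)
for lag, gamma in variogram.items():
    print(f"lag {lag:.1f}: semivariance {gamma:.1f}")
```

      Here semivariance grows with distance, meaning nearby samples are more alike than distant ones. Dependence is described as a function of distance alone, with no list of neighbors required.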

      Spatial interaction data refer to the “flows” between origins and destinations (see, e.g., Bailey & Gatrell, 1995). Spatial interaction data analysis attempts to quantify the arrangement of flows and build models for origin and destination interactions in terms of the geographical accessibility of destinations versus origins as well as the “push factors” of origins and “pull factors” of destinations. Spatial interaction data analysis is often used in transportation planning, migration studies, and other research that has flow information.

      Spatial interaction data are the “flows” between origins and destinations.
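
      One simple way to express push factors, pull factors, and accessibility together is an unconstrained gravity model, sketched below. The text does not commit to a particular model, so the functional form, the scaling constant `k`, the distance-decay exponent `decay`, and all place names and numbers are assumptions chosen for illustration.

```python
def gravity_flow(push, pull, distance, k=1.0, decay=2.0):
    """Unconstrained gravity model: the predicted flow grows with the origin's
    push and the destination's pull and shrinks with distance ** decay."""
    return k * push * pull / distance ** decay

# Hypothetical origins (push = population) and destinations (pull = jobs).
origins = {"A": 1000.0, "B": 500.0}
destinations = {"X": 200.0, "Y": 400.0}
distances = {("A", "X"): 10.0, ("A", "Y"): 20.0, ("B", "X"): 5.0, ("B", "Y"): 10.0}

flows = {
    (o, d): gravity_flow(origins[o], destinations[d], distances[(o, d)])
    for o in origins
    for d in destinations
}
for (o, d), flow in flows.items():
    print(f"{o} -> {d}: {flow:.0f}")
```

      In this toy example the smaller origin B sends the largest predicted flow to X because the two are close together, which shows how accessibility can outweigh size. In applied work the parameters would be estimated from observed flows rather than fixed in advance.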

      In this book, we restrict our attention to areal data analysis, as it is currently the spatial data analysis approach most used in the social sciences. The other three approaches (spatial point pattern analysis, geostatistics, and spatial interaction data analysis) are also useful in social science studies, however. For example, geographers conduct demographic studies using geostatistics (e.g., Cowen & Jensen, 1998; Jensen et al., 1994; Langford, Maguire, & Unwin, 1991; Langford & Unwin, 1994; Mennis, 2003), and epidemiological and social network researchers use spatial point pattern analysis and spatial interaction data analysis, respectively.

      Areal data analysis is the focus of this book because it is the spatial approach currently most used in the social sciences.

      1.3 Introduction to the Data Example

      As noted in the Preface, the goal of this book is to help social scientists learn practical and useful statistical methods for spatial regression with relative ease. Our approach is to use concrete social data examples and in-depth analyses to illustrate the statistical concepts, models, and methods while keeping statistical formulas and proofs to a minimum. No background in mathematical statistics is assumed.

      For ease of presentation, for most methods discussed in this book, we focus on one case study with one primary data example for addressing specific research questions rather than different studies with different data sets, variables, and/or research questions. Readers are encouraged to think about how their own data could be analyzed to address their research questions while reading our data analyses.

      The data example used throughout this book is a possible template for readers to consider how their own data can be analyzed to address their research questions.

      In the primary data example for this book, the study area is the state of Wisconsin in the United States. We illustrate the use of spatial regression models and methods by studying population change as the response variable in relation to a variety of factors, across space and over time, at the minor civil division (MCD) level. In this book, population change refers specifically to a change in population size; that is, the outcome can be either population growth or population decline. Population change is a familiar subject to most social scientists and is considered an essential component of many social science disciplines, which makes this data example accessible to a broad readership.

      In the following subsections, we first review why and how population change is seen as a spatial phenomenon
