Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis

Statistics, in large part, is a study of such distributions.

[Figure: Schematic illustration of the pilot criterion that must be met for any pilot to be permitted to fly a plane.]

      Perhaps most pervasive in the social science literature is the implicit belief, held by many, that methods such as regression and the analysis of covariance allow one to “control” variables that would otherwise not be controllable in a nonexperimental design. As is emphasized throughout this book, statistical methods, whatever their kind, do not provide a means of controlling variables, or of “holding variables constant,” in any literal sense. To secure effects of that sort, one usually needs a strong, rigorous experimental design.

      It is true, however, that statistical methods do afford a way, in some sense, of estimating what might have been had controls been put into place. For instance, if we analyze the correlation between weight and height, it may make sense to hold a factor such as age “constant.” That is, we may wish to partial out age. However, partialling out the variability due to age in the bivariate correlation is not equivalent to actually controlling for age. The truth of the matter is that such statistical control tells us nothing about what would actually be the case had we been able to truly control age, or any other factor. As will be elaborated in Chapter 8 on multiple regression, statistical control is by no means a sufficient “proxy” for experimental control. Students and researchers must keep this distinction in mind before they throw variables into a statistical model and employ words like “control” (or other power and action words) when interpreting effects. If you want to truly control variables, to actually hold them constant, you usually have to do experiments. Estimating parameters in a statistical model, confident that you have “controlled” for covariates, is simply not enough.
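      To make partialling concrete, the following is a minimal sketch in Python with hypothetical simulated data (the variable names and numeric values are invented for illustration, not taken from the text). It residualizes both weight and height on age and correlates the residuals, which is precisely the partial correlation; note that this arithmetic says nothing about what a true experimental manipulation of age would reveal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: age drives both height and weight,
# inducing a shared component in their correlation.
n = 200
age = rng.uniform(5, 18, n)
height = 80 + 6 * age + rng.normal(0, 8, n)
weight = 10 + 3 * age + rng.normal(0, 6, n)

def residualize(y, x):
    """Residuals of y after least-squares regression on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Zero-order (bivariate) correlation between weight and height
r_xy = np.corrcoef(weight, height)[0, 1]

# Partial correlation: correlate the residuals after removing age
r_xy_z = np.corrcoef(residualize(weight, age),
                     residualize(height, age))[0, 1]

print(f"zero-order r = {r_xy:.3f}, partial r (age removed) = {r_xy_z:.3f}")
```

      The sharp drop from the zero-order to the partial correlation quantifies the variance shared with age; it does not demonstrate what would happen were age actually held constant.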

      In the establishment of evidence, whether experimental or nonexperimental, it is helpful to consider the distinction between statistical and physical effects. To illustrate, consider a medical scientist who wishes to test the hypothesis that the more medication applied to a wound, the faster the wound heals. The statistical question of interest is: does the amount of medication predict the rate at which a wound heals? A useful statistical model might be a linear regression in which amount of medication is the predictor and rate of healing is the response. Of course, one does not “need” a regression analysis to “know” whether something is occurring. The investigator can simply observe whether the wound heals, and whether applying more or less medication speeds up or slows down the healing process. The statistical tool in this case is used simply to model the relationship, not to determine whether it exists. The variable in question is a physical, biological, “real” phenomenon; it exists independent of the statistical model, simply because we can see it. The estimation of a statistical model is not the same thing as the hypothesized underlying physical process it seeks to represent.
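      As a sketch of such a model, again in Python with hypothetical data (no dose or healing figures appear in the text), one can fit the simple linear regression and read the slope as the estimated change in healing rate per unit of medication:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: medication dose (predictor), healing rate (response)
dose = rng.uniform(0, 10, 50)
healing_rate = 2.0 + 0.5 * dose + rng.normal(0, 1, 50)

# Simple linear regression: healing_rate = intercept + slope * dose
# np.polyfit returns coefficients from highest degree down: [slope, intercept]
slope, intercept = np.polyfit(dose, healing_rate, 1)
print(f"estimated slope: {slope:.2f} (value used in simulation: 0.5)")
```

      Here the regression merely quantifies a relationship the investigator could, in principle, observe directly at the bedside.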

      In some areas of social science, however, the very observation of an effect cannot be realized without recourse to the statistics used to model the relationship. For instance, if I correlate self‐esteem with intelligence, am I modeling a relationship that I know exists separate from the statistical model, or is the statistical model my only recourse for claiming that the relationship exists in the first place? Because of mediating and moderating relationships in social statistics, an additional variable or two can drastically modify the coefficients in a model, to the point where predictors that had an effect before such inclusion no longer do afterward. As we will emphasize in our chapters on regression:

       When you change the model, you change parameter estimates, you change effects. You are never, ever, testing individual effects in the model. You are always testing the model, and hence the interpretation of parameter estimates must be within the context of the model.
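      A small simulation makes the point concrete. In the hypothetical Python sketch below (the variables and coefficients are invented for illustration), a third variable z drives both x and y; the estimated coefficient on x is substantial when z is omitted and collapses toward zero once z enters the model, even though the data never changed.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: z influences both x and y, so the coefficient
# on x depends heavily on whether z is included in the model.
n = 500
z = rng.normal(0, 1, n)
x = 0.8 * z + rng.normal(0, 0.6, n)
y = 2.0 * z + rng.normal(0, 1, n)  # x has no direct effect on y

def ols(y, *predictors):
    """Least-squares coefficient vector (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b1 = ols(y, x)       # model 1: y ~ x
b2 = ols(y, x, z)    # model 2: y ~ x + z
print(f"coef on x without z: {b1[1]:.2f}")  # substantial
print(f"coef on x with z:    {b2[1]:.2f}")  # near zero
```

      Neither estimate is “the” effect of x; each is an effect within a particular model.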

      In this day and age of extraordinary computing power, the likes of which will probably seem laughable even a decade from the date of publication of this book, one can, with a few clicks of the mouse and a software manual, obtain a principal components analysis, factor analysis, discriminant analysis, multiple regression, and a host of other relatively advanced statistical techniques in a matter of seconds. The advance of computers, and especially of easy‐to‐use software programs, has made performing statistical analyses seem quite easy, because even a novice can obtain output from a statistical procedure relatively quickly. One consequence of this, however, is that a misunderstanding appears to have arisen in some circles that “applied statistics” somehow equates with “statistics without mathematics” or, even worse, “statistics via software.”

      The word “applied” in applied statistics should not be understood to necessarily imply the use of computers. What “applied” should mean is that the focus of the writing is on how to use statistics in the context of scientific investigation, often with demonstrations on real or hypothetical data. Whether those data are analyzed “by hand” or through the use of software does not make one approach more applied than the other. If analyzed via computer, the approach is more computational than the by‐hand approach, but not more applied. Indeed, there is a whole field of study known as computational statistics that features a variety of software
