Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis

be considered credible, however, when empirical reality and the theory coincide (see Figure 1.2). The fit may not be perfect, and seldom if ever is, but when the rational coincides well with the empirical, the credibility of the idea is at least tentatively assured, until new evidence potentially debunks it (e.g., the fall of Newtonian physics).

      We must also ensure that our theories are not overly convenient narratives fitted to data. If you have ever witnessed a sporting event where the deciding point came from the lucky bounce of a puck in hockey or the breezy push of a tennis ball in midair, only to hear post‐match commentators laud the winning team or individual as suddenly so much better than the loser, then you know what “convenient narratives” are all about. We must be careful not to exaggerate how well our theory fits data simply because a few data points went “our way.” George Box once said that all models are wrong but some are useful. In any scientific endeavor, guard against falling in love with your theory or otherwise exaggerating it far beyond what the data suggest. Otherwise, it is no longer a legitimate theory, but simply your brand, more a product of subjective bias and “career‐building” than anything scientific. After 20 years of advocating a theory, is the researcher you are speaking to really prepared to “accept” evidence that contradicts it? Such researchers have a great deal at stake in the theory; their whole career may have been built upon it. Are they really willing to accept its “defeat”? Indeed, I believe one reason economic predictions, for instance, are often looked upon with suspicion is that economists, like psychologists (and theoretical physicists, for that matter), are far too quick to advance theories as though they were near facts. “Sexy theories” sound great and may be marketable to uncritical consumers and media (make an outlandish claim on cable and you'll be a hero!), but to good scientists, theories are only ever as good as the data that exist to support them. Science is exciting, to be sure, but should not be overly speculative. If you are looking for fireworks, then you are best advised to choose a field other than science.

      The word “model” is perhaps the most popular word featured in textbooks, tutorials, and lectures having anything to do with the application of quantitative methods. Defining just what a model is in statistics can be a bit challenging. We discuss the concept by referring to Everitt's definition:

      A description of the assumed structure of a set of observations that can range from a fairly imprecise verbal account to, more usually, a formalized mathematical expression of the process assumed to have generated the observed data. The purpose of such a description is to aid in understanding the data.

      (Everitt, 2002, p. 247)

      Source: Diamond et al. (2007). Licensed under CC BY 3.0.

      The curve is an inverted “U” shape (an approximate parabola) that provides a useful model relating these two attributes (i.e., performance and arousal). If one exhibits very low arousal, performance will be minimal. If one exhibits a very high degree of arousal, performance will likely also suffer. However, if one exhibits a moderate degree of arousal, performance will likely be optimal. The model in this case, as in most cases, does not account for all the data one might collect. The extent to which it accounts for most of the data is the extent to which the model may be, in general, deemed “useful.” The usefulness of a model is also enhanced if it can make accurate predictions of future behavior.
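
      The inverted “U” described above can be sketched as a simple quadratic function. This is only a minimal illustration, not the model fitted by Diamond et al.: the function name `inverted_u`, the 0 to 10 arousal scale, and the parameter values (`peak_arousal`, `peak_performance`, `curvature`) are all assumptions chosen for the example.

```python
def inverted_u(arousal, peak_arousal=5.0, peak_performance=10.0, curvature=0.4):
    """Hypothetical parabola: performance is highest at moderate arousal.

    All parameter values are illustrative assumptions, not estimates
    from any data set.
    """
    return peak_performance - curvature * (arousal - peak_arousal) ** 2


if __name__ == "__main__":
    # Evaluate the model at low, moderate, and high arousal on an
    # assumed 0-10 scale.
    for arousal in (0.0, 2.5, 5.0, 7.5, 10.0):
        print(f"arousal={arousal:4.1f}  predicted performance={inverted_u(arousal):5.2f}")
```

      Under these assumed values, predicted performance is lowest at the two extremes and highest at the midpoint, which is the qualitative pattern the curve is meant to summarize. Real observations would scatter around such a curve rather than fall exactly on it.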

      Why did George Box say that all models are wrong but some are useful? The reason is that even if we obtain a perfectly fitting model, there is nothing to say that this is the only model that will account for the observed data. Some, such as Fox (1997), even encourage divorcing statistical modeling from the idea of accounting for deterministic processes. In discussing the determinants of one's income, for instance, Fox remarks:

      I believe that a statistical model cannot, and is not literally meant to, capture the social process by which incomes are “determined” … No regression model, not even one including a residual, can reproduce this process … The unfortunate tendency to reify statistical models – to forget that they are descriptive summaries, not literal accounts of social processes – can only serve to discredit quantitative data analysis in the social sciences. (p. 5)
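
      The earlier point that a perfectly fitting model need not be the only one can be made concrete with a small sketch. The three observations and both functions below are invented purely for illustration; they are not drawn from Box, Fox, or any data set.

```python
# Three made-up observations (x, y), used only for illustration.
observed = [(0, 0), (1, 1), (2, 4)]


def model_a(x):
    """Hypothetical quadratic model."""
    return x ** 2


def model_b(x):
    """Hypothetical cubic model; the extra term is zero at x = 0, 1, 2."""
    return x ** 2 + x * (x - 1) * (x - 2)


def fits_perfectly(model, data):
    """Return True if the model reproduces every observed y exactly."""
    return all(model(x) == y for x, y in data)


if __name__ == "__main__":
    print("model_a fits perfectly:", fits_perfectly(model_a, observed))
    print("model_b fits perfectly:", fits_perfectly(model_b, observed))
    # The two models agree on the observed data but disagree elsewhere.
    print("predictions at x = 3:", model_a(3), "vs", model_b(3))
```

      Both functions reproduce the observed points without error, yet they give different predictions for an unobserved value. A perfect fit alone therefore cannot tell us which, if either, reflects the process that generated the data.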
