Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis
The standardized covariance is known as the Pearson product‐moment correlation coefficient, or simply r, which is a biased estimator of its population counterpart, ρxy, except when ρxy is exactly equal to 0. The bias of the estimator r can be reduced by applying an adjustment found in Rencher (1998, p. 6), originally proposed by Olkin and Pratt (1958).
Because the correlation coefficient is standardized, we can place lower and upper bounds on it. The minimum the correlation can be for any set of data is −1.0, representing a perfect negative relationship. The maximum the correlation can be is +1.0, representing a perfect positive relationship. A correlation of 0 represents the absence of a linear relationship. For further discussion on how the Pearson correlation can be a biased estimate under conditions of nonnormality (and potential solutions), see Bishara and Hittner (2015).
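As a minimal sketch of this standardization, the following R code computes r as the covariance divided by the product of the two standard deviations and compares it with the built-in cor function. The data are simulated purely for illustration (they are not Galton's data), and the variable names are hypothetical.

```r
# Simulated data for illustration only (assumption: not from the text)
set.seed(1)
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)

# Pearson r: covariance standardized by the product of the standard deviations
r_by_hand <- cov(x, y) / (sd(x) * sd(y))

# cor() computes the same quantity directly
r_builtin <- cor(x, y)
print(c(by_hand = r_by_hand, builtin = r_builtin))

# Perfect linear relationships attain the bounds of the coefficient
r_upper <- cor(x,  2 * x + 3)   # perfect positive relationship
r_lower <- cor(x, -2 * x + 3)   # perfect negative relationship
print(c(upper = r_upper, lower = r_lower))
```

The two computations should agree up to floating-point error, and the perfectly linear cases should return +1 and −1, respectively.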
One can gain an appreciation for the upper and lower bound of r by considering the fact that the numerator, which is an average cross‐product, is being divided by another product, that of the standard deviations of each variable. The denominator thus can be conceptualized to represent the total amount of cross‐product variation possible, that is, the “base,” whereas the numerator represents the total amount of cross‐product variation actually existing between the variables because of a linear relationship. The extent to which covxy accounts for all of the possible “cross‐variation” in the denominator is the extent to which r approaches 1 in absolute value.
It is important to emphasize that a correlation of 0 does not necessarily represent the absence of a relationship. What it does represent is the absence of a linear one. Neither the covariance nor Pearson's r captures nonlinear relationships, and so it is possible to have very strong relations in a sample or population yet still obtain very low values (even zero) for the covariance or Pearson's r. Always plot your data to see what is going on before drawing any conclusions. Correlation coefficients should never be presented without an accompanying plot to characterize the form of the relationship.
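A small illustration of this point, using constructed values rather than real data: below, y is completely determined by x through a quadratic function, yet the Pearson correlation is essentially zero because the relationship has no linear component.

```r
# Constructed example (assumption: values chosen only for illustration)
x <- -10:10          # symmetric about zero
y <- x^2             # y is an exact, but nonlinear, function of x

# Pearson r detects only linear association, so it is essentially 0 here
r_quadratic <- cor(x, y)
print(r_quadratic)

# Restricting attention to x >= 0 makes the relation monotone, and r becomes large
r_right_half <- cor(x[x >= 0], y[x >= 0])
print(r_right_half)
```

A scatterplot of y against x (for example, plot(x, y)) would reveal the parabola immediately, which is why the coefficient should be read alongside a plot.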
We compute the Pearson correlation coefficient on Galton's data between child and parent:
> cor(child, parent)
[1] 0.4587624
We can test it for statistical significance by using the cor.test function:
> cor.test(child, parent)

        Pearson's product-moment correlation

data:  child and parent
t = 15.7111, df = 926, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.4064067 0.5081153
sample estimates:
      cor
0.4587624
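The t statistic and confidence interval in this output can be reproduced by hand from r and the sample size. The sketch below assumes n = 928 (inferred from df = n − 2 = 926) and uses the Fisher z transformation for the interval, which is the approach cor.test takes for the Pearson coefficient. The variable names are illustrative.

```r
r <- 0.4587624   # sample correlation reported in the output
n <- 928         # assumption: inferred from df = n - 2 = 926

# t statistic for H0: rho = 0, and its two-sided p-value
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% confidence interval via the Fisher z transformation
z    <- atanh(r)                 # Fisher z of r
se_z <- 1 / sqrt(n - 3)          # approximate standard error of z
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(t_stat)
print(p_value)
print(ci)
```

The printed values should agree closely with the t statistic and interval limits shown in the cor.test output; small discrepancies can arise because r is entered here to only seven decimal places.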
We can see that the observed t is statistically significant (p < 2.2e-16). The computed 95% confidence interval has limits of approximately 0.41 and 0.51, indicating that we can be 95% confident that the true parameter lies between these limits. Using the package ggplot2 (Wickham, 2009), we plot the relationship between parent and child (with a smoother):
> library(ggplot2)
> qplot(child, parent, data = Galton, geom = c("point", "smooth"))
One drawback of such a simple plot is that the frequency of data points in the bivariate space cannot be known by inspection of the plot alone. Jittering is a technique that allows one to visualize the density of points at each parent–child pairing. By jittering, we can see where most of the data fall in the parent–child scatterplot (i.e., points are concentrated toward the center of the plot):
> qplot(child, parent, geom = "jitter")
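To see why jittering helps, consider the following sketch in base R. It uses simulated heights rounded to the nearest inch (a hypothetical stand-in, not Galton's data), so that many observations share the same parent–child pairing and are overplotted. The base function jitter adds a small amount of random noise so that coincident points separate.

```r
# Simulated, rounded heights (assumption: illustrative values only)
set.seed(2)
n <- 300
parent_ht <- round(rnorm(n, mean = 68, sd = 1.8))
child_ht  <- round(0.6 * (parent_ht - 68) + 68 + rnorm(n, sd = 2))

# Without jittering, many observations occupy the same plotting position
pairs_raw      <- data.frame(child_ht, parent_ht)
n_distinct_raw <- nrow(unique(pairs_raw))

# jitter() perturbs each value slightly, so the points no longer coincide
pairs_jit      <- data.frame(child_ht  = jitter(child_ht),
                             parent_ht = jitter(parent_ht))
n_distinct_jit <- nrow(unique(pairs_jit))

print(c(observations = n, raw = n_distinct_raw, jittered = n_distinct_jit))

# plot(pairs_jit) would draw the jittered scatterplot
```

The count of distinct plotting positions is far smaller than the number of observations before jittering, which is exactly the information an unjittered scatterplot hides.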
2.17 PSYCHOMETRIC VALIDITY, RELIABILITY: A COMMON USE OF CORRELATION COEFFICIENTS
Correlation coefficients, specifically the Pearson correlation, are employed in virtually all fields of study, and without the invention or discovery of correlation, most modern‐day statistics would simply not exist. This is especially true for the field of psychometrics, which is the science that deals with the measurement of psychological qualities such as intelligence, self‐esteem, and motivation, among others. Psychometrics features the development of psychometric tests purported to measure the construct of interest. For an excellent general introduction to psychometrics, consult McDonald (1999).
When developing psychometric instruments, two statistical characteristics of these tests are especially important: (1) validity, and (2) reliability. Validity of a test takes many forms, including face validity, criterion validity, and most notably, construct validity. Construct validity attempts to assess whether a purported psychometric test actually measures what it was designed to measure, and one way of evaluating construct validity is to correlate the newly developed measure with an existing measure that is already known to successfully measure the construct.
For example, in the area of depression assessment, the Beck Depression Inventory (BDI) is a popular self‐report measure often used in evaluating one's level or symptoms of depression. Now, if we were to develop a new test, in order to learn whether that new test measures something called “depression,” we may wish to compute a Pearson correlation of that measure with the BDI. To the extent that the correlation is relatively high, we might tentatively conclude that the new measure is assessing the same (or at least a similar) construct as that of the BDI. Not surprisingly, in this context such correlations often go by the name of validities in the psychometric literature. If a test lacks construct validity, then there is little guarantee that it is measuring the construct under investigation. Fields such as psychology depend on such construct validation to gain some sense of certainty that their measures are tapping into what they are most interested in. Clinical psychology, especially, depends on the strength of such things as construct validity to be confident that its diagnostic tests are measuring what they are thought to measure. Without psychometrics, clinical testing in this way would be no more advanced than the folk or “pop” psychology tests we often find on the internet, which are usually wholly unscientific.
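The computation itself is a single call to cor. The sketch below uses simulated scores only: established stands in for scores on an existing instrument, and new_measure is constructed to track the same underlying scores plus measurement error. None of these values are real BDI data, and both names are hypothetical.

```r
# Simulated scores (assumption: illustrative only, not real test data)
set.seed(3)
n <- 200
established <- rnorm(n, mean = 15, sd = 8)                # existing measure
new_measure <- established + rnorm(n, mean = 0, sd = 4)   # new measure with error

# The "validity" is the Pearson correlation between the two measures
validity <- cor(new_measure, established)
print(validity)
```

Because the simulated new measure was built from the established one, the correlation is high by construction. With real instruments, the size of this coefficient is an empirical question.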
The second area of concern, that of reliability, is just as important. Two popular and commonly used forms of reliability in psychometrics are those of test–retest and internal consistency reliability. Test–retest reliability evaluates the consistency of test scores across one or more measurement time points. For example, if I measured your IQ today, and the test was worth