Data Science in Theory and Practice. Maria Cristina Mariani
The notation
Example 3.3 Consider the following data matrix introduced in Example 3.1:
Each receipt yields a pair of measurements: total dollar sales and number of movies sold. Since there are three receipts, we have three observations on each variable. We find the sample variances and covariance
Therefore,
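The calculation in this example can be sketched in a few lines of Python. The data matrix of Example 3.1 is not reproduced here, so the three receipts below are hypothetical values chosen only for illustration, and the divisor n - 1 is an assumption (some texts divide by n instead).

```python
import numpy as np

# Hypothetical receipts (NOT the data of Example 3.1).
# Each row is one receipt: [total dollar sales, number of movies sold].
X = np.array([[40.0, 4.0],
              [50.0, 5.0],
              [45.0, 3.0]])

n = X.shape[0]              # number of observations (receipts)
x_bar = X.mean(axis=0)      # sample mean of each variable
D = X - x_bar               # deviations from the means

# Sample covariance matrix; the divisor n - 1 is an assumption.
# Diagonal entries are the sample variances, off-diagonal the covariance.
S = D.T @ D / (n - 1)

print(x_bar)
print(S)
```

With these made-up numbers the sample variances are 25 and 1 and the sample covariance is 2.5. The same matrix is returned by `np.cov(X, rowvar=False)`, which also divides by n - 1 by default.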
3.5 Correlation Matrices
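A correlation matrix is a table of the correlation coefficients between pairs of variables. Correlation measures whether, and how strongly, two variables are linearly related. The sample correlation between the $i$th and $j$th variables is defined as
$$ r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}}\,\sqrt{s_{jj}}} \tag{3.6} $$
where $s_{ij}$ is the sample covariance between variables $i$ and $j$, and $s_{ii}$ and $s_{jj}$ are their sample variances. Substituting the definitions of these quantities, in which the common divisor cancels, gives
$$ r_{ij} = \frac{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)^2}\,\sqrt{\sum_{k=1}^{n} (x_{kj} - \bar{x}_j)^2}} \tag{3.7} $$
for $i = 1, \dots, p$ and $j = 1, \dots, p$, where $x_{ki}$ is the $k$th of $n$ observations on variable $i$, $\bar{x}_i$ is its sample mean, and $p$ is the number of variables.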
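As a minimal sketch, the sample correlation coefficient of (3.6) and (3.7) can be computed directly from sums of deviations. The function name `sample_corr` and the data are hypothetical and serve only as an illustration. The result is checked against NumPy's `np.corrcoef`.

```python
import numpy as np

def sample_corr(x, y):
    """Sample correlation: cross-deviation sum over the product of
    the square roots of the two squared-deviation sums."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return np.sum(dx * dy) / (np.sqrt(np.sum(dx**2)) * np.sqrt(np.sum(dy**2)))

# Hypothetical observations on two variables (illustration only).
sales = [40.0, 50.0, 45.0]
movies = [4.0, 5.0, 3.0]

r = sample_corr(sales, movies)
print(r)                                  # correlation from the sums
print(np.corrcoef(sales, movies)[0, 1])   # NumPy's value for comparison
```

Because the divisor of the covariance and of the variances cancels in the ratio, the result does not depend on whether n or n - 1 is used.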
The sample correlation coefficient measures the linear association between two variables and does not depend on the units of measurement: the units appear in both the numerator and the denominator, so they cancel. For $p$ variables, the sample correlation matrix is analogous to the covariance matrix, with correlations in place of covariances:
$$ \mathbf{R} = \begin{bmatrix} 1 & r_{12} & \cdots & r_{1p} \\ r_{21} & 1 & \cdots & r_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ r_{p1} & r_{p2} & \cdots & 1 \end{bmatrix} \tag{3.8} $$
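The following sketch builds a sample correlation matrix from a covariance matrix and illustrates that it is unaffected by a change of units. The helper name `correlation_matrix` and the data are hypothetical. Converting the first column from dollars to cents is just one example of a rescaling.

```python
import numpy as np

def correlation_matrix(X):
    """Divide each covariance s_ij by sqrt(s_ii) * sqrt(s_jj)."""
    S = np.cov(X, rowvar=False)      # columns are variables
    d = np.sqrt(np.diag(S))          # sample standard deviations
    return S / np.outer(d, d)

# Hypothetical data (illustration only): rows are observations,
# columns are [sales in dollars, number of movies sold].
X = np.array([[40.0, 4.0],
              [50.0, 5.0],
              [45.0, 3.0]])

R = correlation_matrix(X)

# Express sales in cents instead of dollars; the counts are unchanged.
X_cents = X * np.array([100.0, 1.0])
R_cents = correlation_matrix(X_cents)

print(R)
print(np.allclose(R, R_cents))  # the change of units leaves R unchanged
```

The covariance matrix of `X_cents` differs from that of `X` by factors of 100 and 10,000, while the correlation matrix is identical, with ones on the diagonal.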