Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen


$$r_{12} = \frac{s_{12}}{s_1 s_2} = \frac{\sum_{i=1}^{n}(x_{i1}-\bar{x}_1)(x_{i2}-\bar{x}_2)}{\sqrt{\sum_{i=1}^{n}(x_{i1}-\bar{x}_1)^2}\,\sqrt{\sum_{i=1}^{n}(x_{i2}-\bar{x}_2)^2}}, \tag{2.4}$$

      where $s_1$ and $s_2$ are the sample standard deviations of $x_1$ and $x_2$, respectively. The sample correlation ranges between −1 and 1, with values close to 1, −1, and 0 indicating a strong positive linear association, a strong negative linear association, and no linear association, respectively.
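      The three reference cases can be illustrated with a minimal R sketch. The vectors below are made-up values chosen for illustration, not data from the book, and the variable names are hypothetical.

```r
# Illustrative (made-up) data: x is symmetric around zero.
x <- -5:5

# Exact increasing linear relationship: correlation of 1.
r.pos <- cor(x, 2 * x + 1)

# Exact decreasing linear relationship: correlation of -1.
r.neg <- cor(x, -3 * x)

# y = x^2 depends on x, but not linearly; for this symmetric x
# the sample correlation is zero.
r.zero <- cor(x, x^2)

print(c(r.pos = r.pos, r.neg = r.neg, r.zero = r.zero))
```

      The last case is a reminder that a correlation near 0 rules out a linear association only; it does not by itself show that the two variables are unrelated.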

x1      x2      x3
3515    190.9   70.3
2300    168.7   64.0
2800    168.9   65.0
2122    166.3   64.4
2293    169.1   66.0
2765    176.8   64.8
2275    171.7   65.5
1890    159.1   64.2
2926    173.2   66.3
1909    158.8   63.6

x1 = curb.weight, x2 = length, x3 = width

      To obtain the sample covariance for the variables curb.weight and length in the data set in Table 2.1, we first calculate the sample means $\bar{x}_1$ and $\bar{x}_2$ and the sum of cross products $\sum_{i=1}^{n} x_{i1}x_{i2}$. The calculations that follow also use the computational form of the sample variance,

$$s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2}{n-1}.$$

      For the ten observations, $\bar{x}_1 = 2479.5$, $\bar{x}_2 = 170.35$, and

$$\sum_{i=1}^{n} x_{i1}x_{i2} = (3515)(190.9) + (2300)(168.7) + \cdots + (1909)(158.8) = 4262679.$$

      By (2.2), the sample covariance of the two variables can be obtained as

$$\begin{aligned}
s_{12} &= \frac{\sum_{i=1}^{n} x_{i1}x_{i2} - n\bar{x}_1\bar{x}_2}{n-1} \\
       &= \frac{4262679 - (10)(2479.5)(170.35)}{9} = 4316.8.
\end{aligned}$$

      Similarly, the sample variances of the two variables are

$$\begin{aligned}
s_1^2 &= \frac{\sum_{i=1}^{n} x_{i1}^2 - n\bar{x}_1^2}{n-1} = \frac{63\,844\,665 - (10)(2479.5)^2}{9} = 262\,829.2, \\
s_2^2 &= \frac{\sum_{i=1}^{n} x_{i2}^2 - n\bar{x}_2^2}{n-1} = \frac{290\,947.8 - (10)(170.35)^2}{9} = 84.07.
\end{aligned}$$

      By (2.4), we have

$$r_{12} = \frac{s_{12}}{s_1 s_2} = \frac{4316.8}{\sqrt{262829.2}\,\sqrt{84.07}} = 0.918,$$

      which is close to 1 and corresponds to a strong positive linear association between the curb weight and length of cars.
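      The hand calculation can be reproduced with a short R sketch. It assumes the ten observations of x1 and x2 listed earlier are typed in directly. The vector names `curb.weight` and `car.length` are chosen here for illustration (`car.length` avoids masking R's built-in `length()`), and the built-in `cov()`, `var()`, and `cor()` serve only as a cross-check.

```r
# Ten observations of x1 (curb.weight) and x2 (length) from the table.
curb.weight <- c(3515, 2300, 2800, 2122, 2293, 2765, 2275, 1890, 2926, 1909)
car.length  <- c(190.9, 168.7, 168.9, 166.3, 169.1, 176.8, 171.7, 159.1, 173.2, 158.8)

n <- length(curb.weight)

# Sample means and the sum of cross products.
xbar1 <- mean(curb.weight)
xbar2 <- mean(car.length)
sum.x1x2 <- sum(curb.weight * car.length)

# Computational forms of the sample covariance and sample variances.
s12   <- (sum.x1x2 - n * xbar1 * xbar2) / (n - 1)
s1.sq <- (sum(curb.weight^2) - n * xbar1^2) / (n - 1)
s2.sq <- (sum(car.length^2) - n * xbar2^2) / (n - 1)

# Sample correlation as covariance over the product of standard deviations.
r12 <- s12 / (sqrt(s1.sq) * sqrt(s2.sq))

# Cross-check against R's built-in functions.
stopifnot(
  isTRUE(all.equal(s12, cov(curb.weight, car.length))),
  isTRUE(all.equal(s1.sq, var(curb.weight))),
  isTRUE(all.equal(s2.sq, var(car.length))),
  isTRUE(all.equal(r12, cor(curb.weight, car.length)))
)

print(round(c(s12 = s12, s1.sq = s1.sq, s2.sq = s2.sq, r12 = r12), 3))
```

      Small differences from the printed values in the last digit are expected, because the worked example rounds intermediate quantities.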

      Example 2.3 In R, the sample mean, variance, covariance, and correlation can be found using the functions mean(), var(), cov(), and cor(), respectively. For example, the following R code finds the sample mean and sample variance of curb.weight, and the sample covariance and correlation between curb.weight and length, in the auto.spec data set.

      mean(auto.spec.df$curb.weight)
      var(auto.spec.df$curb.weight)
      with(auto.spec.df, cov(curb.weight, length))
      with(auto.spec.df, cor(curb.weight, length))

      > mean(auto.spec.df$curb.weight)
      [1] 2555.566
      > var(auto.spec.df$curb.weight)
      [1] 271107.9
      > with(auto.spec.df, cov(curb.weight, length))
      [1] 5638.336
      > with(auto.spec.df, cor(curb.weight, length))
      [1] 0.8777285

      2.2.2 Sample Mean Vector and Sample Covariance Matrix

      A multivariate data set consists of n observations collected from n items or units and each observation contains measurements on p variables, x1, x2,…, xp. The measurement vector for the ith observation is denoted by

$$\mathbf{x}_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{ip} \end{pmatrix}.$$
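      As a rough sketch of how this notation maps onto R, the ten observations of curb.weight, length, and width listed earlier can be stored as an n × p matrix with one row per observation. The object names `X` and `x.i` are hypothetical, and storing observations as rows is an assumption about layout, not something the text prescribes.

```r
# Data matrix: one row per observation, one column per variable.
X <- cbind(
  curb.weight = c(3515, 2300, 2800, 2122, 2293, 2765, 2275, 1890, 2926, 1909),
  length      = c(190.9, 168.7, 168.9, 166.3, 169.1, 176.8, 171.7, 159.1, 173.2, 158.8),
  width       = c(70.3, 64.0, 65.0, 64.4, 66.0, 64.8, 65.5, 64.2, 66.3, 63.6)
)

n <- nrow(X)  # number of observations
p <- ncol(X)  # number of variables

# Measurement vector for the i-th observation, written as a p x 1 column.
i <- 1
x.i <- matrix(X[i, ], ncol = 1, dimnames = list(colnames(X), NULL))

print(x.i)
```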

      The sample mean vector is the vector of sample means for the p variables, which is defined
