Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen
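where $c^T = (c_1 \; c_2 \; \ldots \; c_p)$. It can be seen that the sample mean of z is
$$\bar{z} = c_1\bar{x}_1 + c_2\bar{x}_2 + \cdots + c_p\bar{x}_p = c^T\bar{x}. \tag{2.7}$$
The sample variance of z can be found as
$$s_z^2 = c^T S c. \tag{2.8}$$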
Because a sample variance is always non-negative, for any $c \in \mathbb{R}^p$ we have $c^T S c \ge 0$ from (2.8). Therefore, the sample covariance matrix S is always a positive semidefinite matrix.
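The relations (2.7) and (2.8), and the positive semidefiniteness of S, can be checked numerically. The following R code is a minimal sketch under stated assumptions: the data matrix X and the coefficient vector c_vec are made-up illustrative values, not data from the book.

```r
# Illustrative data: n = 5 observations on p = 3 variables (made-up values).
X <- matrix(c(2.1, 3.4, 1.0,
              1.8, 2.9, 1.5,
              2.5, 3.1, 0.7,
              3.0, 4.2, 1.2,
              2.2, 3.8, 0.9),
            ncol = 3, byrow = TRUE)
c_vec <- c(0.5, -1.0, 2.0)    # arbitrary coefficient vector c (assumption)

z <- as.vector(X %*% c_vec)   # z_i = c^T x_i for each observation

x_bar <- colMeans(X)          # sample mean vector
S <- cov(X)                   # sample covariance matrix (divisor n - 1)

mean_from_formula <- sum(c_vec * x_bar)                    # c^T x_bar, as in (2.7)
var_from_formula  <- as.numeric(t(c_vec) %*% S %*% c_vec)  # c^T S c, as in (2.8)

# Compare with the mean and variance computed directly from z.
mean_check <- isTRUE(all.equal(mean(z), mean_from_formula))
var_check  <- isTRUE(all.equal(var(z), var_from_formula))

# Positive semidefiniteness: every eigenvalue of S is >= 0, up to rounding error.
eigenvalues <- eigen(S, symmetric = TRUE)$values
psd_check <- all(eigenvalues >= -1e-10)

print(c(mean_check = mean_check, var_check = var_check, psd_check = psd_check))
```

The small negative tolerance in the eigenvalue check only guards against floating-point rounding; in exact arithmetic the eigenvalues of S are never negative.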
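In general, if we have q linear combinations of $x_1, x_2, \ldots, x_p$ defined by
$$z_j = c_{j1}x_1 + c_{j2}x_2 + \cdots + c_{jp}x_p, \quad j = 1, 2, \ldots, q,$$
or in matrix notation,
$$z = Cx,$$
where $z = (z_1 \; z_2 \; \ldots \; z_q)^T$ and C is the q × p matrix with entries $c_{jk}$, then the sample mean vector and sample covariance matrix of z are given by
$$\bar{z} = C\bar{x}, \tag{2.9}$$
$$S_z = C S C^T. \tag{2.10}$$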
Obviously, (2.9) and (2.10) are generalizations of (2.7) and (2.8), respectively.
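The matrix versions (2.9) and (2.10) can be checked in the same way. In the sketch below, the data matrix X and the q × p coefficient matrix C (here q = 2, p = 3) are illustrative assumptions chosen only for demonstration.

```r
# Illustrative data: n = 5 observations on p = 3 variables (made-up values).
X <- matrix(c(2.1, 3.4, 1.0,
              1.8, 2.9, 1.5,
              2.5, 3.1, 0.7,
              3.0, 4.2, 1.2,
              2.2, 3.8, 0.9),
            ncol = 3, byrow = TRUE)

# q = 2 linear combinations: z1 = x1 + x2 + x3 and z2 = x1 - x2 (assumption).
C <- matrix(c(1,  1, 1,
              1, -1, 0),
            nrow = 2, byrow = TRUE)

# Each row of Z is (C x_i)^T, so Z = X C^T is an n x q matrix.
Z <- X %*% t(C)

z_bar_formula <- as.vector(C %*% colMeans(X))  # C x_bar, as in (2.9)
S_z_formula   <- C %*% cov(X) %*% t(C)         # C S C^T, as in (2.10)

# Compare with the mean vector and covariance matrix computed directly from Z.
mean_check <- isTRUE(all.equal(colMeans(Z), z_bar_formula))
cov_check  <- isTRUE(all.equal(cov(Z), S_z_formula, check.attributes = FALSE))

print(c(mean_check = mean_check, cov_check = cov_check))
```

Setting q = 1, so that C has a single row $c^T$, reduces both checks to the scalar case of (2.7) and (2.8).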
Example 2.5 For the auto.spec data set, using the mean() function of R, the sample means of the variables city.mpg and highway.mpg can be found as 25.22 and 30.75, respectively. Suppose we are interested in the overall MPG of a car, denoted by z, defined as the following weighted average of x1 = city.mpg and x2 = highway.mpg:
$$z = 0.4x_1 + 0.6x_2 = c^T x,$$
where $c = (0.4 \; 0.6)^T$. Then by (2.7) the sample mean of the overall MPG in the data set is
$$\bar{z} = c^T\bar{x} = 0.4 \times 25.22 + 0.6 \times 30.75 \approx 28.54.$$
To find the sample variance of z, we first obtain the sample covariance matrix for city.mpg and highway.mpg using the cov() function of R:
cov(auto.spec.df[, c("city.mpg", "highway.mpg")])  # sample covariance matrix
cor(auto.spec.df[, c("city.mpg", "highway.mpg")])  # sample correlation matrix
The function cor() calculates the sample correlation matrix. The output of the above R code gives the sample covariance matrix S of city.mpg and highway.mpg.
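By (2.8), the sample variance of z is
$$s_z^2 = c^T S c = 0.4^2 s_{11} + 2(0.4)(0.6)s_{12} + 0.6^2 s_{22},$$
where $s_{11}$ and $s_{22}$ are the sample variances of city.mpg and highway.mpg, and $s_{12}$ is their sample covariance.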
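The calculation in this example can be sketched in R as follows. Because the auto.spec data are not reproduced here, the data frame mpg_df below is a hypothetical stand-in with made-up values; only the weights 0.4 and 0.6 and the two reported sample means (25.22 and 30.75) come from the text. With the real data, auto.spec.df[, c("city.mpg", "highway.mpg")] would take the place of mpg_df.

```r
# Hypothetical stand-in for the two MPG columns (made-up values, NOT the book's data).
mpg_df <- data.frame(
  city.mpg    = c(21, 19, 24, 31, 27, 17, 38),
  highway.mpg = c(27, 26, 30, 37, 32, 22, 43)
)

w <- c(0.4, 0.6)          # weight vector c from the example

x_bar <- colMeans(mpg_df) # sample means of city.mpg and highway.mpg
S <- cov(mpg_df)          # 2 x 2 sample covariance matrix
R <- cor(mpg_df)          # 2 x 2 sample correlation matrix

z_mean <- sum(w * x_bar)                  # c^T x_bar, as in (2.7)
z_var  <- as.numeric(t(w) %*% S %*% w)    # c^T S c, as in (2.8)

# Direct computation of z for each car, for comparison.
z <- 0.4 * mpg_df$city.mpg + 0.6 * mpg_df$highway.mpg
direct_check <- isTRUE(all.equal(c(mean(z), var(z)), c(z_mean, z_var)))

# Mean of the overall MPG from the sample means reported in the text.
book_mean <- sum(w * c(25.22, 30.75))

print(direct_check)
print(book_mean)
```

The direct computation of z serves only as a cross-check; the point of (2.7) and (2.8) is that the mean and variance of z follow from the mean vector and S alone, without forming z for each car.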