Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen
As with the sample covariance, the value of the population covariance of two random variables depends on their scaling, which may simply reflect a difference in the measurement units of the variables. A scaling-independent measure of the degree of linear association between the random variables Xj and Xk is given by the population correlation:
$$\rho_{jk} = \frac{\operatorname{cov}(X_j, X_k)}{\sqrt{\operatorname{var}(X_j)}\,\sqrt{\operatorname{var}(X_k)}}$$
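It is clear that ρjk = ρkj, so the population correlation matrix of a random vector X is a symmetric matrix, defined as
$$\boldsymbol{\rho} = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{pmatrix}$$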
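As a minimal sketch of how a correlation matrix can be obtained from a covariance matrix, the Python snippet below divides each covariance by the product of the two corresponding standard deviations. The function name `correlation_matrix` and the numbers in `sigma` are illustrative assumptions and do not come from the book.

```python
import math

def correlation_matrix(cov):
    """Convert a covariance matrix (list of rows) into a correlation matrix."""
    p = len(cov)
    # Standard deviations are the square roots of the diagonal entries.
    sd = [math.sqrt(cov[j][j]) for j in range(p)]
    # Each correlation is the covariance divided by the two standard deviations.
    return [[cov[j][k] / (sd[j] * sd[k]) for k in range(p)] for j in range(p)]

# Hypothetical covariance matrix of (X1, X2), chosen only for illustration.
sigma = [[4.0, 3.0],
         [3.0, 9.0]]
rho = correlation_matrix(sigma)

# Rescaling X1 by a factor of 100 (say, metres to centimetres) changes
# the variance of X1 and its covariance with X2 ...
sigma_scaled = [[4.0 * 100**2, 3.0 * 100],
                [3.0 * 100, 9.0]]
# ... but the resulting correlation matrix is the same as before.
rho_scaled = correlation_matrix(sigma_scaled)
```

The comparison of `rho` and `rho_scaled` illustrates the scaling independence described above: the covariance changes with the measurement unit, while the correlation does not.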
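For univariate random variables X and Y and a constant c, we have E(X + Y) = E(X) + E(Y) and E(cX) = cE(X). Similarly, for random vectors X and Y and a constant matrix C, it can be seen that
$$E(\mathbf{X} + \mathbf{Y}) = E(\mathbf{X}) + E(\mathbf{Y}), \qquad E(\mathbf{C}\mathbf{X}) = \mathbf{C}\,E(\mathbf{X}).$$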
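The covariance matrix of Z = CX is
$$\operatorname{cov}(\mathbf{Z}) = \operatorname{cov}(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\Sigma}\mathbf{C}^{T}$$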
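The similarity between (3.2) and (2.10) is clear. When C is a row vector cᵀ = (c1, c2, …, cp), CX = cᵀX = c1X1 + … + cpXp, and
$$E(\mathbf{c}^{T}\mathbf{X}) = \mathbf{c}^{T}\boldsymbol{\mu}, \qquad \operatorname{var}(\mathbf{c}^{T}\mathbf{X}) = \mathbf{c}^{T}\boldsymbol{\Sigma}\mathbf{c},$$
where μ and Σ are the mean vector and covariance matrix of X.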
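The rules for the mean and covariance of a linear transformation CX can be checked numerically. The sketch below is only an illustration: the helpers `matmul` and `transpose`, and the values of `mu`, `sigma`, and `C`, are assumptions made up for this example rather than anything given in the book.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

# Hypothetical mean vector (stored as a column) and covariance matrix, p = 3.
mu = [[1.0], [2.0], [3.0]]
sigma = [[4.0, 1.0, 0.0],
         [1.0, 2.0, 0.5],
         [0.0, 0.5, 1.0]]

# A constant 2 x 3 matrix C, also made up for illustration.
C = [[1.0, -1.0, 0.0],
     [0.5, 0.5, 2.0]]

mean_z = matmul(C, mu)                          # mean of CX: C times mu
cov_z = matmul(matmul(C, sigma), transpose(C))  # covariance of CX: C Sigma C^T

# Special case: a single row vector c^T gives a scalar mean and variance.
c_t = [C[0]]                                    # c^T = (1, -1, 0)
mean_lin = matmul(c_t, mu)[0][0]                # c^T mu
var_lin = matmul(matmul(c_t, sigma), transpose(c_t))[0][0]  # c^T Sigma c
```

Here `c_t` picks out X1 − X2, so `var_lin` should agree with var(X1) + var(X2) − 2 cov(X1, X2) computed from the entries of `sigma`, and with the first diagonal entry of `cov_z`.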
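Let X1 and X2 denote two subvectors of X, i.e.,
$$\mathbf{X} = \begin{pmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{pmatrix}.$$
The covariance matrix of X can then be partitioned as
$$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{pmatrix},$$
where Σ11 = cov(X1) and Σ22 = cov(X2). The matrix Σ12 contains the covariance between each component of X1 and each component of X2. By the symmetry of Σ, we have
$$\boldsymbol{\Sigma}_{12} = \boldsymbol{\Sigma}_{21}^{T}.$$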
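To make the block structure concrete, the following sketch extracts the four blocks of a partitioned covariance matrix and checks the transpose relation between the off-diagonal blocks. The helper `block`, the matrix `sigma`, and the choice of which components go into each subvector are all hypothetical, chosen only for illustration.

```python
def block(m, rows, cols):
    """Extract the submatrix of m with the given row and column indices."""
    return [[m[i][j] for j in cols] for i in rows]

# Hypothetical covariance matrix of a 3-dimensional random vector.
sigma = [[4.0, 1.0, 0.0],
         [1.0, 2.0, 0.5],
         [0.0, 0.5, 1.0]]

# Assumed partition: the first subvector holds components 1 and 2,
# the second subvector holds component 3 (zero-based indices below).
idx1, idx2 = [0, 1], [2]

sigma11 = block(sigma, idx1, idx1)   # covariance of the first subvector, 2 x 2
sigma12 = block(sigma, idx1, idx2)   # cross-covariances, 2 x 1
sigma21 = block(sigma, idx2, idx1)   # cross-covariances, 1 x 2
sigma22 = block(sigma, idx2, idx2)   # covariance of the second subvector, 1 x 1

# Because sigma is symmetric, sigma12 equals the transpose of sigma21.
sigma21_t = [list(col) for col in zip(*sigma21)]
```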
3.2 Density Function and Properties of Multivariate Normal Distribution
The normal distribution is the most commonly used distribution for continuous random variables. Many statistical models and inference methods are based on the univariate or multivariate normal distribution. One advantage of the normal distribution is its mathematical tractability. More importantly, the normal distribution turns out to be a good approximation to the “true” population distribution for many sample statistics and real-world data. This is a consequence of the central limit theorem, which says that the sum of a large number of independent observations from any population with a common finite mean and variance approximately follows a normal distribution.
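The central limit theorem can be illustrated with a small simulation. The sketch below sums independent Uniform(0, 1) draws, standardizes the sums, and checks how often they fall within ±1.96, which would be about 95% for a standard normal variable. The choice of the uniform population, the sample size `n`, the number of replications `reps`, and the seed are assumptions made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

n = 50          # observations per sum (assumed "large enough" here)
reps = 20000    # number of simulated sums

# Uniform(0, 1) has mean 1/2 and variance 1/12, and is clearly not normal.
sums = [sum(random.random() for _ in range(n)) for _ in range(reps)]

# Standardize each sum using the theoretical mean n/2 and variance n/12.
mean_sum = n / 2
sd_sum = (n / 12) ** 0.5
z = [(s - mean_sum) / sd_sum for s in sums]

# Summary of the standardized sums: should be close to 0 and 1.
z_mean = statistics.fmean(z)
z_sd = statistics.pstdev(z)

# Fraction of standardized sums within +/-1.96 (about 0.95 under normality).
coverage = sum(abs(v) <= 1.96 for v in z) / reps
```

This is only an empirical check for one population and one sample size; it does not replace the theorem itself.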
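Recall that a univariate random variable X with mean μ and variance σ² is normally distributed, denoted by X ∼ N(μ, σ²), if it has the probability density function
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}, \qquad -\infty < x < \infty.$$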
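The univariate normal density can be evaluated directly from its definition. The sketch below writes it as a small function and compares one value against `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the parameter values are illustrative assumptions. Note that `normal_pdf` takes the variance, whereas `NormalDist` takes the standard deviation.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x; sigma2 is the variance, not the SD."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Hypothetical parameters chosen for illustration.
mu, sigma2 = 1.0, 4.0

# The density is largest at x = mu and symmetric about mu.
peak = normal_pdf(mu, mu, sigma2)
left = normal_pdf(mu - 1.5, mu, sigma2)
right = normal_pdf(mu + 1.5, mu, sigma2)

# Cross-check against the standard library (which expects the SD).
reference = NormalDist(mu, math.sqrt(sigma2)).pdf(mu)
```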
The multivariate normal distribution is an extension of the univariate normal distribution. If a p-dimensional random vector