Industrial Data Analytics for Diagnosis and Prognosis - Yong Chen


to the population mean, the population variance and covariance can be estimated by the sample variance and covariance introduced in Section 2.2. The sample variance and covariance are both random variables, and are unbiased estimators of the population variance and covariance. Consequently, the sample covariance matrix S is an unbiased estimator of the population covariance matrix Σ, that is, E(S) = Σ.
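The estimator referred to here can be sketched in a few lines. Section 2.2 is not reproduced in this excerpt, so the sketch assumes the usual unbiased form with divisor n − 1; the function names and the data are illustrative only.

```python
def sample_covariance(xs, ys):
    """Sample covariance of two equal-length samples.

    Assumption: divisor n - 1, the standard unbiased form.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_covariance_matrix(columns):
    """Sample covariance matrix S for p variables, each a list of n observations."""
    return [[sample_covariance(a, b) for b in columns] for a in columns]


# Hypothetical data: three observations of two variables.
x1 = [1.0, 2.0, 3.0]
x2 = [2.0, 4.0, 6.0]
S = sample_covariance_matrix([x1, x2])
print(S)
```

The diagonal entries of S are the sample variances and the off-diagonal entries are the sample covariances, so S is symmetric by construction.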

As with the sample covariance, the value of the population covariance of two random variables depends on their scaling, for example on the measurement units of the variables. A scaling-independent measure of the degree of linear association between the random variables X_j and X_k is given by the population correlation:

$\rho_{jk} = \frac{\sigma_{jk}}{\sqrt{\sigma_{jj}}\,\sqrt{\sigma_{kk}}}.$

Clearly ρ_jk = ρ_kj, so the population correlation matrix of a random vector X is a symmetric matrix, defined as

$\operatorname{cor}(\mathbf{X}) = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{pmatrix}.$
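As an illustration, the correlation matrix can be computed from a covariance matrix by dividing each entry σ_jk by the product of the two standard deviations. The following is a minimal sketch; the function name and the numbers are made up for the example.

```python
import math


def correlation_from_covariance(sigma):
    """Return the correlation matrix implied by a covariance matrix (list of lists)."""
    p = len(sigma)
    std = [math.sqrt(sigma[j][j]) for j in range(p)]
    return [[sigma[j][k] / (std[j] * std[k]) for k in range(p)] for j in range(p)]


# Hypothetical covariance matrix of (X1, X2).
sigma = [[4.0, 3.0],
         [3.0, 9.0]]
rho = correlation_from_covariance(sigma)

# Rescaling X1 by a factor of 10 (e.g. a change of units) multiplies
# sigma_11 by 100 and sigma_12 by 10, but leaves the correlation unchanged.
sigma_rescaled = [[400.0, 30.0],
                  [30.0, 9.0]]
rho_rescaled = correlation_from_covariance(sigma_rescaled)

print(rho[0][1], rho_rescaled[0][1])
```

Both calls give the same off-diagonal value, which is the scaling independence described above.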

For univariate variables X and Y and a constant c, we have E(X + Y) = E(X) + E(Y) and E(cX) = cE(X). Similarly, for random vectors X and Y and a constant matrix C, it can be seen that

$E(\mathbf{C}\mathbf{X}) = \mathbf{C}\,E(\mathbf{X}).$ (3.1)

The covariance matrix of Z = CX is

$\boldsymbol{\Sigma}_{\mathbf{Z}} = \operatorname{cov}(\mathbf{Z}) = \operatorname{cov}(\mathbf{C}\mathbf{X}) = \mathbf{C}\,\boldsymbol{\Sigma}_{\mathbf{X}}\,\mathbf{C}^T.$ (3.2)
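The two rules for Z = CX can be checked numerically. The sketch below uses plain Python lists; the helper names (`matmul`, `transpose`, `transform_mean`, `transform_covariance`) and the numbers are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def transform_mean(c, mu):
    """Mean vector of Z = CX, i.e. C mu."""
    return [sum(c_ij * m for c_ij, m in zip(row, mu)) for row in c]


def transform_covariance(c, sigma_x):
    """Covariance matrix of Z = CX, i.e. C Sigma_X C^T."""
    return matmul(matmul(c, sigma_x), transpose(c))


# Hypothetical mean vector and covariance matrix of X = (X1, X2).
mu = [1.0, 2.0]
sigma_x = [[4.0, 3.0],
           [3.0, 9.0]]

# Z1 = X1 + X2 and Z2 = X1 - X2.
C = [[1.0, 1.0],
     [1.0, -1.0]]

mu_z = transform_mean(C, mu)
sigma_z = transform_covariance(C, sigma_x)
print(mu_z)
print(sigma_z)
```

In this example the diagonal of the result holds var(X₁ + X₂) and var(X₁ − X₂), and the off-diagonal entry holds cov(X₁ + X₂, X₁ − X₂).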

Equation (3.2) closely parallels (2.10). When C is a row vector c^T = (c₁, c₂, …, c_p), CX = c^TX = c₁X₁ + … + c_pX_p and

$E(\mathbf{c}^T\mathbf{X}) = \mathbf{c}^T\boldsymbol{\mu}$ (3.3)

$\operatorname{var}(\mathbf{c}^T\mathbf{X}) = \mathbf{c}^T\boldsymbol{\Sigma}\,\mathbf{c}$ (3.4)

where μ and Σ are the mean vector and covariance matrix of X.
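As a small worked case of (3.3) and (3.4), take p = 2 and c = (1, −1)^T, so that c^TX = X₁ − X₂. The choice of c is only an illustration; here μ₁, μ₂ denote the components of μ and σ_jk the entries of Σ.

```latex
\begin{aligned}
E(X_1 - X_2) &= \mathbf{c}^T\boldsymbol{\mu} = \mu_1 - \mu_2, \\
\operatorname{var}(X_1 - X_2) &= \mathbf{c}^T\boldsymbol{\Sigma}\,\mathbf{c}
  = \sigma_{11} - \sigma_{12} - \sigma_{21} + \sigma_{22}
  = \sigma_{11} + \sigma_{22} - 2\sigma_{12}.
\end{aligned}
```

The last step uses the symmetry σ₁₂ = σ₂₁ of the covariance matrix.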

Let X₁ and X₂ denote two subvectors of X, i.e., $\mathbf{X} = \begin{pmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{pmatrix}$. The mean vector and the covariance matrix of X can be partitioned as

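$\boldsymbol{\mu} = E(\mathbf{X}) = \begin{pmatrix} E(\mathbf{X}_1) \\ E(\mathbf{X}_2) \end{pmatrix} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}$ (3.5)

$\boldsymbol{\Sigma} = \operatorname{cov}(\mathbf{X}) = \begin{pmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{pmatrix}$ (3.6)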

where Σ₁₁ = cov(X₁) and Σ₂₂ = cov(X₂). The matrix Σ₁₂ contains the covariance between each component of X₁ and each component of X₂. By the symmetry of Σ, we have Σ₂₁ = Σ₁₂^T.
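The block structure can be made concrete with a short sketch that slices a covariance matrix into its four blocks. It assumes X₁ holds the first q components of X; the function name `partition_covariance` and the matrix are illustrative only.

```python
def partition_covariance(sigma, q):
    """Split a p x p covariance matrix into blocks.

    Assumption: X1 is the first q components of X and X2 the remaining p - q.
    Returns (sigma_11, sigma_12, sigma_21, sigma_22).
    """
    sigma_11 = [row[:q] for row in sigma[:q]]
    sigma_12 = [row[q:] for row in sigma[:q]]
    sigma_21 = [row[:q] for row in sigma[q:]]
    sigma_22 = [row[q:] for row in sigma[q:]]
    return sigma_11, sigma_12, sigma_21, sigma_22


# Hypothetical covariance matrix of X = (X1, X2, X3), split after two components.
sigma = [[4.0, 3.0, 1.0],
         [3.0, 9.0, 2.0],
         [1.0, 2.0, 5.0]]
s11, s12, s21, s22 = partition_covariance(sigma, 2)

# Symmetry of sigma implies that sigma_21 is the transpose of sigma_12.
s12_transposed = [list(col) for col in zip(*s12)]
print(s21 == s12_transposed)
```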

3.2 Density Function and Properties of Multivariate Normal Distribution

The normal distribution is the most commonly used distribution for continuous random variables. Many statistical models and inference methods are based on the univariate or multivariate normal distribution. One advantage of the normal distribution is its mathematical tractability. More importantly, the normal distribution is a good approximation to the “true” population distribution for many sample statistics and real-world data because of the central limit theorem, which says that the sum of a large number of independent observations from a population with a common finite mean and variance approximately follows a normal distribution.
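The central limit theorem can be illustrated with a small simulation. The sketch below sums independent Uniform(0, 1) draws, standardizes the sums, and compares them with what a standard normal distribution would give. The choice of population, the sample sizes, and the seed are arbitrary assumptions for the illustration.

```python
import random
import statistics


def standardized_sums(n_terms, n_sums, seed=0):
    """Simulate n_sums standardized sums of n_terms Uniform(0, 1) draws."""
    rng = random.Random(seed)
    # A Uniform(0, 1) variable has mean 1/2 and variance 1/12, so the sum of
    # n_terms independent draws has mean n_terms/2 and variance n_terms/12.
    mean_sum = n_terms * 0.5
    sd_sum = (n_terms / 12.0) ** 0.5
    return [
        (sum(rng.random() for _ in range(n_terms)) - mean_sum) / sd_sum
        for _ in range(n_sums)
    ]


z = standardized_sums(n_terms=30, n_sums=20000)

# For a standard normal variable, about 68.3% of the mass lies within one
# standard deviation of the mean.
within_one_sd = sum(abs(v) <= 1.0 for v in z) / len(z)
print(statistics.mean(z), statistics.stdev(z), within_one_sd)
```

The printed mean and standard deviation should be close to 0 and 1, and the fraction within one standard deviation close to 0.68, even though the individual observations are uniform rather than normal.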

Recall that a univariate random variable X with mean μ and variance σ² is normally distributed, which is denoted by X ∼ N (μ, σ²), if it has the probability density function

$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty.$ (3.7)
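Equation (3.7) translates directly into code. The following minimal sketch evaluates the density and compares it with `statistics.NormalDist` from the Python standard library, which is parameterized by the standard deviation σ rather than the variance σ². The function name `normal_pdf` and the example values are illustrative.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x, following (3.7); sigma2 is the variance."""
    if sigma2 <= 0:
        raise ValueError("the variance sigma2 must be positive")
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# Hypothetical example: X ~ N(0, 4), evaluated at x = 1.
value = normal_pdf(1.0, mu=0.0, sigma2=4.0)

# NormalDist takes the standard deviation, so sigma = sqrt(4) = 2.
reference = NormalDist(mu=0.0, sigma=2.0).pdf(1.0)
print(value, reference)
```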

The multivariate normal distribution is an extension of the univariate normal distribution. If a p-dimensional random vector
