Read the book online - Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen. Mathematics. LiveLib

Industrial Data Analytics for Diagnosis and Prognosis - Yong Chen

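and (3.2), we have $E(\mathbf{CX}) = \mathbf{C}\boldsymbol{\mu}$ and $\operatorname{cov}(\mathbf{CX}) = \mathbf{C}\boldsymbol{\Sigma}\mathbf{C}^T$. So $\mathbf{CX} \sim N_q(\mathbf{C}\boldsymbol{\mu}, \mathbf{C}\boldsymbol{\Sigma}\mathbf{C}^T)$.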
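The linear transformation property can be checked numerically. The following is a minimal sketch using NumPy; the values of `mu`, `Sigma`, and `C` are hypothetical and chosen only for illustration. It compares the theoretical mean and covariance of $\mathbf{CX}$ with their empirical counterparts computed from simulated draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters of a 3-dimensional normal vector X (p = 3).
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])

# Hypothetical 2 x 3 constant matrix C, so that CX is 2-dimensional (q = 2).
C = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, -1.0]])

# Theoretical mean vector and covariance matrix of CX.
mean_CX = C @ mu
cov_CX = C @ Sigma @ C.T

# Empirical check: simulate X row by row and transform each draw by C.
X = rng.multivariate_normal(mu, Sigma, size=100_000)
Y = X @ C.T
emp_mean = Y.mean(axis=0)
emp_cov = np.cov(Y, rowvar=False)

print("theoretical mean:", mean_CX, " empirical mean:", emp_mean)
print("theoretical cov:\n", cov_CX, "\nempirical cov:\n", emp_cov)
```

With a large number of draws, the empirical quantities should agree with $\mathbf{C}\boldsymbol{\mu}$ and $\mathbf{C}\boldsymbol{\Sigma}\mathbf{C}^T$ up to sampling error.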

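Normality of subvectors. Let $\mathbf{X}_1 = (X_1, X_2, \ldots, X_q)$ be the subvector of the first $q$ elements of $\mathbf{X}$ and $\mathbf{X}_2 = (X_{q+1}, X_{q+2}, \ldots, X_p)$ be the subvector of the remaining $p - q$ elements of $\mathbf{X}$. From (3.5) and (3.6), $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ can be partitioned as

$\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}, \quad \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{pmatrix},$ (3.10)

where $\boldsymbol{\mu}_i$ and $\boldsymbol{\Sigma}_{ii}$ are the mean vector and covariance matrix of $\mathbf{X}_i$, for $i = 1, 2$. If $\mathbf{X}$ follows a multivariate normal distribution, we have the additional property that both $\mathbf{X}_1$ and $\mathbf{X}_2$ follow a multivariate normal distribution. That is, if $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then $\mathbf{X}_1 \sim N_q(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$ and $\mathbf{X}_2 \sim N_{p-q}(\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_{22})$. A special case of this property is that each element of $\mathbf{X}$ also follows a (univariate) normal distribution. That is, if $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then $X_j \sim N(\mu_j, \sigma_{jj})$, $j = 1, 2, \ldots, p$. The converse of this result is not true: if each element of a random vector $\mathbf{X}$ follows a univariate normal distribution, $\mathbf{X}$ may still not follow a multivariate normal distribution.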
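The partition of the mean vector and covariance matrix amounts to simple array slicing. The sketch below is illustrative only: the helper `partition_normal` and the numerical values are hypothetical. It extracts the blocks for the first $q$ elements and checks by simulation that the subvector of the first $q$ elements has mean and covariance close to the corresponding blocks.

```python
import numpy as np

def partition_normal(mu, Sigma, q):
    """Split mu and Sigma into blocks for the first q and the remaining p - q elements."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    mu1, mu2 = mu[:q], mu[q:]
    S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
    S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
    return (mu1, mu2), (S11, S12, S21, S22)

# Hypothetical parameters with p = 3; we take q = 2.
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
(mu1, mu2), (S11, S12, S21, S22) = partition_normal(mu, Sigma, q=2)

# Simulate X and keep only the first q columns, that is, the subvector X1.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mu, Sigma, size=100_000)
X1 = X[:, :2]
emp_mu1 = X1.mean(axis=0)
emp_S11 = np.cov(X1, rowvar=False)

print("mu1:", mu1, " empirical:", emp_mu1)
print("Sigma11:\n", S11, "\nempirical:\n", emp_S11)
```

The simulation only compares the first two moments; it does not by itself prove normality of the subvector, which follows from the property stated above.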

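Zero covariance implies independence. If $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ and $\mathbf{X}$ is split into the subvectors $\mathbf{X}_1$ and $\mathbf{X}_2$, the mean vector and covariance matrix of $\mathbf{X}$ can be partitioned as in (3.10). The subvectors $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent if and only if $\boldsymbol{\Sigma}_{12} = \mathbf{0}$. Specifically, for any two elements $X_i$ and $X_j$ of $\mathbf{X}$, $X_i$ and $X_j$ are independent if and only if $\sigma_{ij} = \operatorname{cov}(X_i, X_j) = 0$. Note that if $X_i$ and $X_j$ are independent, we still have $\operatorname{cov}(X_i, X_j) = 0$ even when they do not follow a joint normal distribution. Without joint normality, however, the converse is not necessarily true: $\operatorname{cov}(X_i, X_j) = 0$ does not guarantee that $X_i$ and $X_j$ are independent.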
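A standard counterexample, which is not taken from the text and is included here only as an illustration, makes the last point concrete. Let $X_1 \sim N(0, 1)$ and let $X_2 = S X_1$, where $S$ is a random sign independent of $X_1$. Each variable is univariate normal and their covariance is zero, yet they are clearly dependent because $|X_2| = |X_1|$, and they are not jointly normal because $X_1 + X_2$ equals zero with probability 1/2. A minimal simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

x1 = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)   # random sign, drawn independently of x1
x2 = s * x1                           # also N(0, 1) by symmetry

# The sample covariance between x1 and x2 is close to zero.
sample_cov = np.cov(x1, x2)[0, 1]

# Dependence: x2**2 equals x1**2, so the squares are perfectly correlated.
corr_sq = np.corrcoef(x1**2, x2**2)[0, 1]

# Not jointly normal: x1 + x2 is exactly zero whenever the sign is -1.
frac_zero = np.mean(x1 + x2 == 0.0)

print("sample covariance:", sample_cov)
print("correlation of squares:", corr_sq)
print("fraction of draws with x1 + x2 == 0:", frac_zero)
```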

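Conditional distributions are normal. Suppose $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ and the mean vector and covariance matrix of $\mathbf{X}$ are given by (3.10). If $\mathbf{X}_1$ and $\mathbf{X}_2$ are not independent, we have $\boldsymbol{\Sigma}_{12} \neq \mathbf{0}$ and the conditional distribution of $\mathbf{X}_1$ given $\mathbf{X}_2 = \mathbf{x}_2$ is multivariate normal with mean vector

$E(\mathbf{X}_1 \mid \mathbf{X}_2 = \mathbf{x}_2) = \boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2)$ (3.11)

and covariance matrix

$\operatorname{cov}(\mathbf{X}_1 \mid \mathbf{X}_2 = \mathbf{x}_2) = \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}.$ (3.12)

Note that the mean vector of the conditional distribution is a linear function of $\mathbf{x}_2$, whereas the covariance matrix of the conditional distribution does not depend on $\mathbf{x}_2$. If $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent, the conditional distribution of $\mathbf{X}_1$ given $\mathbf{X}_2 = \mathbf{x}_2$ is simply $N_q(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$, the unconditional distribution of $\mathbf{X}_1$.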
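The conditional mean and covariance can be computed with a few lines of linear algebra. The following is a sketch under stated assumptions: the helper name `conditional_normal` and the bivariate example are hypothetical, and the standard formulas for the conditional distribution of a partitioned normal vector are written out in the comments.

```python
import numpy as np

def conditional_normal(mu, Sigma, q, x2):
    """Mean and covariance of X1 | X2 = x2, where X1 holds the first q elements of X."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    mu1, mu2 = mu[:q], mu[q:]
    S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
    S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
    # Conditional mean: mu1 + S12 S22^{-1} (x2 - mu2).
    # Solving a linear system avoids forming the inverse of S22 explicitly.
    cond_mean = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)
    # Conditional covariance: S11 - S12 S22^{-1} S21 (does not involve x2).
    cond_cov = S11 - S12 @ np.linalg.solve(S22, S21)
    return cond_mean, cond_cov

# Hypothetical bivariate example with unit variances and correlation 0.8.
mu = [0.0, 0.0]
Sigma = [[1.0, 0.8],
         [0.8, 1.0]]

mean_a, cov_a = conditional_normal(mu, Sigma, q=1, x2=[1.0])
mean_b, cov_b = conditional_normal(mu, Sigma, q=1, x2=[-2.0])
print(mean_a, cov_a)
print(mean_b, cov_b)
```

In this example the conditional mean moves linearly with the observed value (0.8 times `x2`), while the conditional variance is 1 - 0.8² = 0.36 for both values of `x2`.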

3.3 Maximum Likelihood Estimation for Multivariate Normal Distributions

Suppose the population distribution is assumed to be multivariate normal with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. The parameters $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ can be estimated from a random sample of $n$ observations $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. A commonly used method for parameter estimation is maximum likelihood estimation (MLE), and the estimated parameter values are called the maximum likelihood estimates. The idea of maximum likelihood estimation is to find the $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ that maximize the joint density of the observations, which is called the likelihood function. For the multivariate normal distribution, the likelihood function is

$\begin{aligned} L(\boldsymbol{\mu}, \boldsymbol{\Sigma}; \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) &= \prod_{i=1}^{n} f(\mathbf{x}_i; \boldsymbol{\mu}, \boldsymbol{\Sigma}) \\ &= \prod_{i=1}^{n} \frac{1}{(2\pi)^{p/2} |\boldsymbol{\Sigma}|^{1/2}} e^{-(\mathbf{x}_i - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu})/2} \\ &= \frac{1}{(2\pi)^{np/2} |\boldsymbol{\Sigma}|^{n/2}} e^{-\sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu})/2}. \end{aligned}$ (3.13)

It is often easier to find the MLE by minimizing the negative log likelihood function, which is given by

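$l(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = -\ln L(\boldsymbol{\mu}, \boldsymbol{\Sigma}; \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) = \frac{np}{2} \ln(2\pi) + \frac{n}{2} \ln|\boldsymbol{\Sigma}| + \frac{1}{2} \sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu}).$ (3.14)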
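The negative log likelihood of a multivariate normal sample is straightforward to evaluate numerically. The sketch below is illustrative only: the function name `neg_log_likelihood` and the simulated data are hypothetical. It computes $\frac{np}{2}\ln(2\pi) + \frac{n}{2}\ln|\boldsymbol{\Sigma}| + \frac{1}{2}\sum_{i=1}^{n}(\mathbf{x}_i - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\mathbf{x}_i - \boldsymbol{\mu})$, which is the negative logarithm of the likelihood in (3.13).

```python
import numpy as np

def neg_log_likelihood(mu, Sigma, X):
    """Negative log likelihood of the rows of X under N_p(mu, Sigma)."""
    X = np.asarray(X, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    n, p = X.shape
    diff = X - mu                                   # rows are x_i - mu
    # slogdet returns (sign, log|Sigma|); the sign is 1 for a valid covariance matrix.
    _, logdet = np.linalg.slogdet(Sigma)
    # Sum over i of (x_i - mu)^T Sigma^{-1} (x_i - mu), without forming the inverse.
    solved = np.linalg.solve(Sigma, diff.T).T
    quad = np.sum(diff * solved)
    return 0.5 * (n * p * np.log(2.0 * np.pi) + n * logdet + quad)

# Hypothetical data: 50 draws from a bivariate normal distribution.
rng = np.random.default_rng(3)
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
X = rng.multivariate_normal([0.0, 1.0], Sigma, size=50)

xbar = X.mean(axis=0)
nll_at_mean = neg_log_likelihood(xbar, Sigma, X)
nll_shifted = neg_log_likelihood(xbar + 0.3, Sigma, X)
print(nll_at_mean, nll_shifted)
```

For a fixed covariance matrix, the value at the sample mean is smaller than the value at any other candidate mean vector, here one shifted by 0.3 in each coordinate.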

Taking the derivative of (3.14) with respect to μ, we have

$\frac{\partial}{\partial \boldsymbol{\mu}} l(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = -\sum_{i=1}^{n} \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu}).$ (3.15)

Setting the partial derivative in (3.15) to zero, the MLE of μ is obtained as

$\hat{\boldsymbol{\mu}} = \bar{\mathbf{x}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i,$ (3.16)

which is the sample mean vector of the data set $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. The derivation of the MLE of $\boldsymbol{\Sigma}$ is more involved and beyond the scope of this book. The result is given by

$\hat{\boldsymbol{\Sigma}} = \frac{1}{n} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T = \frac{n-1}{n} \mathbf{S},$ (3.17)

where S is the sample covariance matrix as given in (2.6). Since the MLE uses n instead of n – 1 in the denominator, it is a biased estimator. So the sample covariance matrix S is more commonly used to estimate Σ,
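The estimators in (3.16) and (3.17) can be computed directly with NumPy. The following is a minimal sketch on simulated data; the true parameter values and the sample size are hypothetical. It also confirms numerically that the MLE of the covariance matrix equals $(n - 1)/n$ times the sample covariance matrix.

```python
import numpy as np

# Hypothetical sample: n = 50 observations from a bivariate normal distribution.
rng = np.random.default_rng(2)
X = rng.multivariate_normal([0.0, 1.0], [[1.0, 0.5], [0.5, 2.0]], size=50)
n = X.shape[0]

# MLE of the mean vector: the sample mean vector.
mu_hat = X.mean(axis=0)

# MLE of the covariance matrix: divide the sum of outer products by n.
centered = X - mu_hat
Sigma_hat = centered.T @ centered / n

# Sample covariance matrix S: np.cov divides by n - 1 by default.
S = np.cov(X, rowvar=False)

# np.cov with bias=True divides by n and so reproduces the MLE.
Sigma_hat_np = np.cov(X, rowvar=False, bias=True)

print("mu_hat:", mu_hat)
print("Sigma_hat:\n", Sigma_hat)
print("(n - 1) / n * S:\n", (n - 1) / n * S)
```

The difference between the two estimators shrinks as $n$ grows, since the factor $(n - 1)/n$ approaches 1.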
