Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen
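Let X₁, X₂, …, Xₙ denote a random sample from a p-dimensional multivariate normal population. The test statistic in (3.19) can be naturally generalized to the multivariate case as

$$T^2 = n(\bar{\mathbf{X}} - \boldsymbol{\mu}_0)^T \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}_0),$$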
where X̄ and S are the sample mean vector and the sample covariance matrix of X₁, X₂, …, Xₙ. This statistic is called Hotelling’s T² in honor of Harold Hotelling, who first obtained its distribution. Assuming H₀ is true, we have the following result about the distribution of the T²-statistic:

$$T^2 \sim \frac{(n-1)p}{n-p}\, F_{p,\,n-p},$$
where F_{p,n−p} denotes the F-distribution with p and n − p degrees of freedom. Based on this distribution of T², we reject H₀ at significance level α if

$$T^2 > \frac{(n-1)p}{n-p}\, F_{p,\,n-p}(\alpha), \tag{3.22}$$
where F_{p,n−p}(α) denotes the upper (100α)th percentile of the F-distribution with p and n − p degrees of freedom. The p-value of the test based on the T²-statistic is

$$p\text{-value} = \Pr\left\{ F(p,\, n-p) > \frac{n-p}{(n-1)p}\, T^2 \right\},$$

where F(p, n − p) denotes a random variable distributed as F_{p,n−p}.
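The test procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name `hotelling_t2_test`, the use of NumPy and SciPy, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from scipy import stats


def hotelling_t2_test(X, mu0, alpha=0.05):
    """One-sample Hotelling's T^2 test of H0: mu = mu0 (hypothetical helper).

    X is an (n, p) array of observations with n > p >= 2; mu0 has length p.
    Returns the T^2 statistic, the critical value, and the p-value.
    """
    X = np.asarray(X, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)                 # sample mean vector
    S = np.cov(X, rowvar=False)           # sample covariance (divisor n - 1)
    diff = xbar - mu0
    # T^2 = n (xbar - mu0)^T S^{-1} (xbar - mu0); solve() avoids forming S^{-1}
    t2 = n * diff @ np.linalg.solve(S, diff)
    scale = (n - 1) * p / (n - p)
    # Critical value: scale times the upper (100 alpha)th percentile of F_{p, n-p}
    critical = scale * stats.f.ppf(1 - alpha, p, n - p)
    # p-value: Pr{ F(p, n-p) > T^2 / scale }
    p_value = stats.f.sf(t2 / scale, p, n - p)
    return t2, critical, p_value


# Synthetic data, for illustration only
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0, 0.0], np.eye(3), size=30)
t2, critical, p_value = hotelling_t2_test(X, mu0=[0.0, 0.0, 0.0])
print(f"T2 = {t2:.3f}, critical value = {critical:.3f}, p-value = {p_value:.4f}")
print("reject H0" if t2 > critical else "do not reject H0")
```

Comparing T² with the critical value and comparing the p-value with α lead to the same decision, since both come from the same F reference distribution.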
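The T² statistic can also be written as

$$T^2 = (\bar{\mathbf{X}} - \boldsymbol{\mu}_0)^T \left(\frac{\mathbf{S}}{n}\right)^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}_0),$$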
which can be interpreted as the standardized distance between the sample mean X̄ and μ₀. The distance is standardized by S/n, which is the sample estimate of the covariance matrix of X̄. When the standardized distance between X̄ and μ₀ exceeds the critical value given on the right-hand side of (3.22), the true mean is unlikely to be equal to μ₀ and we reject H₀.
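The concept of a univariate confidence interval can be extended to a multivariate confidence region. For a p-dimensional normal distribution, the 100(1 − α)% confidence region for μ is defined as

$$\left\{ \boldsymbol{\mu} \;:\; n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \le \frac{(n-1)p}{n-p}\, F_{p,\,n-p}(\alpha) \right\}.$$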
It is clear that the confidence region for μ is an ellipsoid centered at x̄. Similar to the univariate case, the null hypothesis H₀: μ = μ₀ is not rejected at level α if and only if μ₀ is in the 100(1 − α)% confidence region for μ.
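As a rough sketch of how the confidence region can be used in practice, the following Python code checks whether a candidate mean lies inside the ellipsoid and computes the ellipsoid's axes. The helper names `region_bound`, `in_confidence_region`, and `region_axes` are hypothetical, the data are synthetic, and the axis formula is the standard one for an ellipsoid defined by a quadratic form: the axes point along the eigenvectors of S with half-lengths sqrt(λᵢ c / n), where λᵢ are the eigenvalues of S and c is the F-based bound.

```python
import numpy as np
from scipy import stats


def region_bound(n, p, alpha=0.05):
    """Right-hand side of the region inequality: (n-1)p/(n-p) * F_{p,n-p}(alpha)."""
    return (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)


def in_confidence_region(X, mu, alpha=0.05):
    """True if mu lies in the 100(1 - alpha)% confidence region for the mean."""
    X = np.asarray(X, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n, p = X.shape
    diff = X.mean(axis=0) - mu
    S = np.cov(X, rowvar=False)
    dist2 = n * diff @ np.linalg.solve(S, diff)   # standardized squared distance
    return bool(dist2 <= region_bound(n, p, alpha))


def region_axes(X, alpha=0.05):
    """Half-lengths and unit directions (as columns) of the ellipsoid axes."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)          # S is symmetric
    half_lengths = np.sqrt(eigvals * region_bound(n, p, alpha) / n)
    return half_lengths, eigvecs


# Synthetic data, for illustration only
rng = np.random.default_rng(0)
X = rng.multivariate_normal([1.0, 2.0], [[1.0, 0.6], [0.6, 2.0]], size=40)
half_lengths, directions = region_axes(X)
print("axis half-lengths:", half_lengths)
print("sample mean inside region:", in_confidence_region(X, X.mean(axis=0)))
```

A point μ₀ for which `in_confidence_region` returns `False` is exactly one for which the T² test rejects H₀: μ = μ₀ at the same level α.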
The T²-statistic can also be derived from the likelihood ratio test of the hypotheses in (3.20). The likelihood ratio test is a general principle for constructing statistical test procedures, and it has several optimality properties for reasonably large samples. A detailed study of likelihood ratio test theory is beyond the scope of this book.
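The link between the likelihood ratio and T² can be checked numerically. The sketch below relies on a standard result for the multivariate normal model, stated here as an assumption rather than quoted from the text: with Λ the ratio of the likelihood maximized under H₀ to the likelihood maximized without restriction, Λ^(2/n) = |Σ̂| / |Σ̂₀| = 1 / (1 + T²/(n − 1)), where Σ̂ and Σ̂₀ are the divisor-n covariance estimates centered at x̄ and at μ₀. The variable names and the synthetic data are illustrative only.

```python
import numpy as np

# Synthetic data, for illustration only
rng = np.random.default_rng(1)
n, p = 25, 3
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
mu0 = np.array([0.2, -0.1, 0.0])

# Hotelling's T^2 computed from the sample mean and sample covariance
xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
d = xbar - mu0
t2 = n * d @ np.linalg.solve(S, d)

# MLEs of the covariance matrix: unrestricted, and with the mean fixed at mu0
sigma_hat = (X - xbar).T @ (X - xbar) / n
sigma_hat0 = (X - mu0).T @ (X - mu0) / n

# Lambda^(2/n) from the determinant ratio, and the same quantity from T^2
lam_2_over_n = np.linalg.det(sigma_hat) / np.linalg.det(sigma_hat0)
from_t2 = 1.0 / (1.0 + t2 / (n - 1))
print(lam_2_over_n, from_t2)
```

Because Λ^(2/n) is a decreasing function of T², rejecting H₀ for small values of the likelihood ratio is equivalent to rejecting it for large values of T².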
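Substituting the MLEs of μ and Σ in (3.16) and (3.17), respectively, into the likelihood function in (3.13), it is easy to see that

$$\max_{\boldsymbol{\mu},\,\boldsymbol{\Sigma}} L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{np/2}\, |\hat{\boldsymbol{\Sigma}}|^{n/2}}\, e^{-np/2},$$

where

$$\hat{\boldsymbol{\Sigma}} = \frac{1}{n} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T.$$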
It can be seen that