A useful property of the MLE is the invariance property. In general, let $\hat{\boldsymbol{\theta}}$ denote the MLE of the parameter vector θ. Then the MLE of a function of θ, denoted by h(θ), is given by $h(\hat{\boldsymbol{\theta}})$. This result makes it very convenient to find the MLE of any function of a parameter, given the MLE of the parameter. For example, based on (3.17), it is easy to see that the MLE of the variance of Xj, the jth element of X, is given by

$$\hat{\sigma}_{jj} = \frac{1}{n}\sum_{i=1}^{n}\left(X_{ij} - \bar{X}_{j}\right)^{2}.$$

      Then based on the invariance property, the MLE of the standard deviation $\sqrt{\sigma_{jj}}$ is $\sqrt{\hat{\sigma}_{jj}}$.
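      To make the calculation concrete, the following Python sketch (not from the book) computes the MLE of σjj using the divisor n and then, by the invariance property, takes its square root as the MLE of the standard deviation; the simulated data matrix X and the column index j are assumptions used only for illustration.

```python
import numpy as np

# Minimal illustration (simulated data, not from the book): MLE of the
# variance of the j-th variable and, by invariance, of its standard deviation.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))   # n = 100 samples, p = 3 variables
j = 0                                               # which variable to examine

n = X.shape[0]
xbar_j = X[:, j].mean()                             # sample mean of the j-th variable
sigma_jj_mle = np.sum((X[:, j] - xbar_j) ** 2) / n  # MLE uses divisor n, not n - 1
sd_j_mle = np.sqrt(sigma_jj_mle)                    # invariance: MLE of sqrt(sigma_jj)

print(sigma_jj_mle, sd_j_mle)
```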

      The MLE has some good asymptotic properties and usually performs well for data sets with large sample sizes. For example, under mild regularity conditions, the MLE satisfies the property of consistency, which guarantees that the estimator converges to the true value of the parameter as the sample size becomes infinite. In addition, under certain regularity conditions, the MLE is asymptotically normal and efficient. That is, as the sample size becomes infinite, the distribution of the MLE converges to a normal distribution with variance equal to the optimal asymptotic variance. The details of the regularity conditions are beyond the scope of this book, but these conditions are quite general and often satisfied in common circumstances.
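      As a rough illustration of consistency (a simulation sketch under assumed parameters, not an example from the book), one can watch the MLE of the variance approach the true value as the sample size grows:

```python
import numpy as np

# Simulation sketch (assumed true variance of 4.0): the MLE of the variance
# gets closer to the true value as the sample size increases.
rng = np.random.default_rng(1)
true_sigma2 = 4.0

for n in [10, 100, 1_000, 10_000, 100_000]:
    x = rng.normal(loc=0.0, scale=np.sqrt(true_sigma2), size=n)
    sigma2_mle = np.mean((x - x.mean()) ** 2)  # MLE of the variance (divisor n)
    print(n, sigma2_mle)                       # tends toward true_sigma2 as n grows
```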

      3.4 Hypothesis Testing on Mean Vectors

      In this section, we study how to determine if the population mean μ is equal to a specific value μ0 when the observations follow a normal distribution. We start by reviewing the hypothesis testing results for univariate data. Suppose X1, X2,…, Xn are a random sample of independent univariate observations following the normal distribution N(μ, σ²). The test on μ is formulated as

$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu \neq \mu_0,$$

      where H0 is the null hypothesis and H1 is the (two-sided) alternative hypothesis. For this test, we use the following test statistic:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad (3.18)$$

      where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation of the observations. Under H0, the statistic $t$ follows a $t$ distribution with n − 1 degrees of freedom, and H0 is rejected at significance level α if $|t| > t_{\alpha/2,\,n-1}$, the upper α/2 critical value of that distribution.

      The test based on a fixed significance level α, say α = 0.05, has the disadvantage that it gives the decision maker no idea about whether the observed value of the test statistic is just barely in the rejection region or if it is far into the region. Instead, the p-value can be used to indicate how strong the evidence is in rejecting the null hypothesis H0. The p-value is the probability that the test statistic will take on a value that is at least as extreme as the observed value when the null hypothesis is true. The smaller the p-value, the stronger the evidence we have in rejecting H0. If the p-value is smaller than α, H0 will be rejected at the significance level of α. The p-value based on the t statistic in (3.18) can be found as

$$P = 2\,\Pr\!\left(T(n-1) > |t|\right),$$

      where T(n − 1) denotes a random variable following a t distribution with n − 1 degrees of freedom.
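      The calculation of the t statistic and its two-sided p-value can be sketched in Python as follows; the simulated sample and the hypothesized value μ0 = 5.0 are assumptions for illustration, and scipy.stats.ttest_1samp is shown only as a cross-check of the hand computation.

```python
import numpy as np
from scipy import stats

# Two-sided one-sample t test (simulated data and mu0 are assumptions):
# t statistic as in (3.18) and p-value 2 * Pr(T(n - 1) > |t|).
rng = np.random.default_rng(2)
x = rng.normal(loc=5.3, scale=1.0, size=30)  # simulated sample
mu0 = 5.0                                    # hypothesized mean

n = len(x)
t_stat = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
p_value = 2 * stats.t.sf(np.abs(t_stat), df=n - 1)

# Cross-check with scipy's built-in one-sample t test.
res = stats.ttest_1samp(x, popmean=mu0)
print(t_stat, p_value, res.statistic, res.pvalue)
```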

      We can define the 100(1 − α)% confidence interval for μ as

$$\left[\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right].$$

      It is easy to see that the null hypothesis H0 is not rejected at level α if and only if μ0 is in the 100(1 − α)% confidence interval for μ. So the confidence interval consists of all those “plausible” values of μ0 that would not be rejected by the test of H0 at level α.
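      A short Python sketch of the confidence interval and its duality with the test is given below; the simulated sample and α = 0.05 are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# 100(1 - alpha)% confidence interval for mu (simulated data and alpha are
# assumptions). H0: mu = mu0 is not rejected at level alpha exactly when
# mu0 falls inside this interval.
rng = np.random.default_rng(3)
x = rng.normal(loc=5.3, scale=1.0, size=30)
alpha = 0.05

n = len(x)
margin = stats.t.ppf(1 - alpha / 2, df=n - 1) * x.std(ddof=1) / np.sqrt(n)
ci = (x.mean() - margin, x.mean() + margin)
print(ci)
```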

      To see the link to the test statistic used for a multivariate normal distribution, we consider an equivalent rule to reject H0, which is based on the square of the t statistic: H0 is rejected at significance level α if

$$t^{2} = \frac{n\left(\bar{x} - \mu_0\right)^{2}}{s^{2}} = n\left(\bar{x} - \mu_0\right)\left(s^{2}\right)^{-1}\left(\bar{x} - \mu_0\right) > t^{2}_{\alpha/2,\,n-1}.$$

      For a multivariate normal distribution with unknown mean μ and known Σ, we consider testing the following hypotheses:

$$H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0 \quad \text{vs.} \quad H_1: \boldsymbol{\mu} \neq \boldsymbol{\mu}_0.$$
