Industrial Data Analytics for Diagnosis and Prognosis. Yong Chen
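A point estimate of μ can be obtained by maximizing the posterior distribution. The resulting estimate is called the maximum a posteriori (MAP) estimate. The MAP estimate of μ can be written as

$$\hat{\mu}_{\mathrm{MAP}} = \arg\max_{\mu} f(\mu \mid D) = \arg\max_{\mu} f(D \mid \mu)\, g(\mu) \tag{3.27}$$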
From (3.27), it can be seen that the MAP estimate is closely related to the MLE. Without the prior g(μ), the MAP is the same as the MLE. In particular, if the prior follows a uniform distribution, g(μ) is constant in μ and the MAP and MLE are equivalent. By the same argument, if the prior distribution has a nearly flat shape, we expect the MAP and MLE to be similar.
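We first consider a simple case where the data follow a univariate normal distribution with unknown mean μ and known variance σ². The likelihood function based on a random sample of independent observations D = {x1, x2, …, xn} is given by

$$f(D \mid \mu) = \prod_{i=1}^{n} f(x_i \mid \mu) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\right\}$$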
Based on (3.26), we have

$$f(\mu \mid D) \propto f(D \mid \mu)\, g(\mu)$$
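where g(μ) is the probability density function of the prior distribution. We choose a normal distribution N(μ0, σ0²) as the prior for μ. This prior is a conjugate prior because the resulting posterior distribution is also normal. By completing the square in the exponent of the product of the likelihood and the prior, the posterior distribution can be obtained as

$$\mu \mid D \sim N(\mu_n, \sigma_n^2)$$

where

$$\mu_n = \frac{\sigma^2}{n\sigma_0^2+\sigma^2}\,\mu_0 + \frac{n\sigma_0^2}{n\sigma_0^2+\sigma^2}\,\bar{x} \tag{3.28}$$

$$\frac{1}{\sigma_n^2} = \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2} \tag{3.29}$$

and x̄ = (x1 + x2 + … + xn)/n is the sample mean.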
The posterior mean given in (3.28) can be understood as a weighted average of the prior mean μ0 and the sample mean x̄, which is the MLE of μ. When the sample size n is very large, the weight for x̄ is close to 1 and the weight for μ0 is close to 0, so the posterior mean is very close to the MLE, that is, the sample mean. On the other hand, when n is very small (more precisely, when nσ0² is small relative to σ²), the posterior mean is very close to the prior mean μ0. Similarly, if the prior variance σ0² is very large, the prior distribution has a flat shape and the posterior mean is close to the MLE. Note that because the mode of a normal distribution is equal to its mean, the MAP of μ is exactly μn. Consequently, when n is very large, or when the prior is flat, the MAP is close to the MLE.
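The weighted-average reading of the posterior mean can be checked numerically. The following is a minimal Python sketch, assuming the standard conjugate-normal form of the posterior mean, with weight σ²/(nσ0² + σ²) on μ0 and weight nσ0²/(nσ0² + σ²) on x̄. The function name `posterior_mean`, the argument names, and the data values are hypothetical and chosen only for illustration.

```python
from statistics import fmean

def posterior_mean(data, sigma2, mu0, sigma0_sq):
    """Posterior mean of mu for N(mu, sigma2) data under a N(mu0, sigma0_sq) prior."""
    n = len(data)
    xbar = fmean(data)                  # sample mean, i.e. the MLE of mu
    denom = n * sigma0_sq + sigma2
    w_prior = sigma2 / denom            # weight on the prior mean mu0
    w_data = n * sigma0_sq / denom      # weight on the sample mean
    return w_prior * mu0 + w_data * xbar

data = [4.8, 5.3, 5.1, 4.9, 5.4]        # made-up measurements
mle = fmean(data)

# Small sample, informative prior: the prior mean pulls the estimate toward 0.
mu_small_n = posterior_mean(data, sigma2=1.0, mu0=0.0, sigma0_sq=1.0)
# Same prior, 200 copies of the data (n = 1000): the data dominate.
mu_large_n = posterior_mean(data * 200, sigma2=1.0, mu0=0.0, sigma0_sq=1.0)
# Small sample, nearly flat prior: the prior has almost no influence.
mu_flat = posterior_mean(data, sigma2=1.0, mu0=0.0, sigma0_sq=1e6)

print(mle, mu_small_n, mu_large_n, mu_flat)
```

With these made-up numbers, the small-sample estimate lies between the prior mean and the sample mean, while both the large-sample and the flat-prior estimates are nearly indistinguishable from the MLE.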
Equation (3.29) shows the relationship between the posterior variance and the prior variance. The relationship is easier to understand in terms of the inverse of the variance, which is called the precision. A high (low) precision corresponds to a low (high) variance. Equation (3.29) says that the posterior precision is equal to the prior precision plus a contribution from the data that is proportional to n. Each observation adds a contribution of 1/σ² to the precision.
Figure 3.3 Posterior distribution of the mean with various sample sizes
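The additive behaviour of the precision can be traced one observation at a time. The sketch below assumes the standard conjugate-normal update, in which the posterior precision is the prior precision plus 1/σ² per observation and the posterior mean is the precision-weighted combination of the previous mean and the new observation (an equivalent way of writing the weighted average of μ0 and x̄). The helper name `update` and all numerical values are hypothetical.

```python
def update(mu_prev, var_prev, x, sigma2):
    """Update a normal belief N(mu_prev, var_prev) about mu with one observation x."""
    prec = 1.0 / var_prev + 1.0 / sigma2            # one observation adds 1/sigma2
    mu = (mu_prev / var_prev + x / sigma2) / prec   # precision-weighted mean
    return mu, 1.0 / prec

sigma2 = 4.0                  # known data variance (assumed value)
mu0, var0 = 0.0, 10.0         # prior mean and prior variance (assumed values)
data = [2.1, 1.7, 2.6, 2.0]   # made-up observations

# Sequential updating: the posterior after each point becomes the next prior.
mu, var = mu0, var0
for x in data:
    mu, var = update(mu, var, x, sigma2)

# Batch form: posterior precision = prior precision + n / sigma2.
n = len(data)
batch_var = 1.0 / (1.0 / var0 + n / sigma2)
batch_mu = batch_var * (mu0 / var0 + sum(data) / sigma2)

print(mu, var, batch_mu, batch_var)
```

Processing the observations one by one and processing them as a batch give the same posterior, which is one way to see why the posterior narrows steadily as the sample size grows.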
When the data follow a p-dimensional multivariate normal distribution with unknown mean μ and known covariance matrix Σ, the posterior distribution based on a random sample of independent observations D = {x1, x2, …, xn} is given by

$$f(\boldsymbol{\mu} \mid D) \propto f(D \mid \boldsymbol{\mu})\, g(\boldsymbol{\mu})$$
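where g(μ) is the density of the conjugate prior distribution Np(μ0, Σ0). Similar to the univariate case, the posterior distribution of μ can be obtained as

$$\boldsymbol{\mu} \mid D \sim N_p(\boldsymbol{\mu}_n, \boldsymbol{\Sigma}_n)$$

where

$$\boldsymbol{\Sigma}_n^{-1} = \boldsymbol{\Sigma}_0^{-1} + n\boldsymbol{\Sigma}^{-1}$$

$$\boldsymbol{\mu}_n = \boldsymbol{\Sigma}_n\left(\boldsymbol{\Sigma}_0^{-1}\boldsymbol{\mu}_0 + n\boldsymbol{\Sigma}^{-1}\bar{\mathbf{x}}\right)$$

and x̄ is the sample mean vector.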
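The multivariate update can be sketched with NumPy. The code below assumes the standard conjugate result in precision form: the posterior precision matrix is Σ0⁻¹ + nΣ⁻¹, and the posterior mean is the posterior covariance applied to Σ0⁻¹μ0 + nΣ⁻¹x̄. The function name `mvn_posterior`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def mvn_posterior(X, Sigma, mu0, Sigma0):
    """Posterior mean and covariance of mu for N_p(mu, Sigma) data, Sigma known."""
    n = X.shape[0]
    xbar = X.mean(axis=0)                   # sample mean vector
    prior_prec = np.linalg.inv(Sigma0)      # prior precision matrix
    data_prec = n * np.linalg.inv(Sigma)    # precision contributed by n observations
    Sigma_n = np.linalg.inv(prior_prec + data_prec)
    mu_n = Sigma_n @ (prior_prec @ mu0 + data_prec @ xbar)
    return mu_n, Sigma_n

rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])              # known covariance (assumed value)
X = rng.multivariate_normal(mean=[1.0, -1.0], cov=Sigma, size=50)  # simulated sample

mu0 = np.zeros(2)                           # prior mean (assumed value)
Sigma0 = 10.0 * np.eye(2)                   # fairly flat prior covariance (assumed value)
mu_n, Sigma_n = mvn_posterior(X, Sigma, mu0, Sigma0)

print(mu_n, X.mean(axis=0))
```

Because the prior covariance here is large relative to Σ/n, the posterior mean should land close to the sample mean vector, mirroring the univariate behaviour with a flat prior. For larger p, solving linear systems with `np.linalg.solve` would usually be preferable to forming explicit inverses; the inverses are kept here only to stay close to the formulas.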