The likelihood ratio test statistic is the ratio of the maximum of the likelihood over the subset of the parameter space specified by H0 to the maximum of the likelihood over the entire parameter space. Specifically, the likelihood ratio test statistic for H0 : μ = μ0 is

\[
\Lambda = \frac{\max_{\boldsymbol{\Sigma}} L(\boldsymbol{\mu}_0, \boldsymbol{\Sigma})}{\max_{\boldsymbol{\mu},\, \boldsymbol{\Sigma}} L(\boldsymbol{\mu}, \boldsymbol{\Sigma})} = \left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\hat{\boldsymbol{\Sigma}}_0|} \right)^{n/2},
\]

where \(\hat{\boldsymbol{\Sigma}} = \frac{1}{n}\sum_{i=1}^{n}(\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T\) is the maximum likelihood estimate of Σ under the full model and \(\hat{\boldsymbol{\Sigma}}_0 = \frac{1}{n}\sum_{i=1}^{n}(\mathbf{x}_i - \boldsymbol{\mu}_0)(\mathbf{x}_i - \boldsymbol{\mu}_0)^T\) is the maximum likelihood estimate of Σ under H0.
The test based on the T2-statistic in (3.21) and the likelihood ratio test are equivalent because it can be shown that

\[
\Lambda^{2/n} = \left( 1 + \frac{T^2}{n-1} \right)^{-1}.
\]

Thus Λ is a monotone decreasing function of T2, and the two tests reject H0 for exactly the same samples.
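This identity can be checked numerically. The following R sketch simulates a small data set and compares the two sides of the equation; the simulated data and all variable names here are illustrative and not part of the book's example.

set.seed(1)
n <- 50; p <- 3
X <- matrix(rnorm(n * p), n, p)    # simulated data: n observations of p variables
mu0 <- rep(0, p)                   # hypothesized mean vector under H0
x.bar <- colMeans(X)
S <- cov(X)                        # sample covariance matrix (divisor n - 1)
T2 <- n * t(x.bar - mu0) %*% solve(S) %*% (x.bar - mu0)
Sigma.hat <- crossprod(sweep(X, 2, x.bar)) / n    # MLE of Sigma, full model
Sigma.hat0 <- crossprod(sweep(X, 2, mu0)) / n     # MLE of Sigma under H0
Lambda.2.n <- det(Sigma.hat) / det(Sigma.hat0)    # Lambda^(2/n)
c(Lambda.2.n, 1 / (1 + T2 / (n - 1)))             # the two values coincide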
Example 3.2: Hot rolling is among the key steel-making processes that convert cast or semi-finished steel into finished products. A typical hot rolling process includes a melting division and a rolling division. The melting division is a continuous casting process that melts scrap metal and solidifies the molten steel into semi-finished steel billets; the rolling division further squeezes the steel billets through a sequence of stands in the hot rolling process. Each stand is composed of several rolls. The side_temp_defect
data set contains the side temperature measurements on 139 defective steel billets at Stand 5 of a hot rolling process, where the side temperatures are measured at 79 equally spaced locations along the stand. In this example, we focus on the three measurements at locations 2, 40, and 78, which correspond to locations close to the middle and the two ends of the stand. The nominal mean temperature values at the three locations are 1926, 1851, and 1872, obtained from a large sample of billets without defects. We want to check whether the defective billets have a mean side temperature significantly different from the nominal values. We can, therefore, test the hypothesis

\[
H_0 : \boldsymbol{\mu} = \boldsymbol{\mu}_0 = (1926,\ 1851,\ 1872)^T \quad \text{versus} \quad H_1 : \boldsymbol{\mu} \neq \boldsymbol{\mu}_0.
\]
The following R code calculates the sample mean, sample covariance matrix, and the T2-statistic for the three side temperature measurements.
side.temp.defect <- read.csv("side_temp_defect.csv", header = F)
X <- side.temp.defect[, c(2, 40, 78)]
mu0 <- c(1926, 1851, 1872)
x.bar <- apply(X, 2, mean)   # sample mean
S <- cov(X)                  # sample var-cov matrix
n <- nrow(X)
p <- ncol(X)
alpha <- 0.05
T2 <- n * t(x.bar - mu0) %*% solve(S) %*% (x.bar - mu0)   # T2-statistic in (3.21)
F0 <- (n - 1) * p / (n - p) * qf(1 - alpha, p, n - p)     # critical value in (3.22)
p.value <- 1 - pf((n - p) / ((n - 1) * p) * T2, p, n - p)
Using the above R code, the sample mean and sample covariance matrix are obtained as
The T2-statistic is obtained from (3.21) as T2 = 19.71. The right-hand side of (3.22) at α = 0.05 is obtained as F0 = 8.13. Since the observed value of T2 exceeds the critical value F0, we reject the null hypothesis H0 and conclude that the mean vector of the three side temperatures of the defective billets is significantly different from the nominal mean vector. In addition, the p-value is 0.0004 < α = 0.05, which further confirms that H0 should be rejected.
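As a sanity check, the same test is implemented in contributed R packages. For example, assuming the ICSNP package is installed (an assumption, not part of the book's code), its HotellingsT2 function reports the F-scaled statistic (n − p)T2/((n − 1)p) and the corresponding p-value for the same X and mu0:

# Optional cross-check, assuming the ICSNP package is available.
# install.packages("ICSNP")
library(ICSNP)
HotellingsT2(X, mu = mu0)   # F-scaled T2-statistic and p-value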
3.5 Bayesian Inference for Normal Distribution
Let D = {x1, x2,…, xn} denote the observed data set. In maximum likelihood estimation, the distribution parameters are considered fixed. The estimation errors are obtained by considering the random distribution of possible data sets D. By contrast, in Bayesian inference, we treat the observed data set D as the only data set. The uncertainty in the parameters is characterized through a probability distribution over the parameters.
In this subsection, we focus on Bayesian inference for the normal distribution when the mean μ is unknown and the covariance matrix Σ is assumed known. Bayesian inference is based on Bayes' theorem. In general, Bayes' theorem gives the conditional probability of an event A given that an event B occurs:

\[
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.
\]
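As a quick numeric illustration of Bayes' theorem, suppose A is the event that a billet is defective and B is the event that an inspection alarm is raised; all probabilities below are made up for illustration only.

p.A <- 0.05              # prior probability of a defect, P(A) (hypothetical)
p.B.given.A <- 0.90      # alarm rate for defective billets, P(B|A) (hypothetical)
p.B.given.notA <- 0.10   # false alarm rate, P(B|not A) (hypothetical)
p.B <- p.B.given.A * p.A + p.B.given.notA * (1 - p.A)   # total probability of an alarm
p.A.given.B <- p.B.given.A * p.A / p.B                  # Bayes' theorem
p.A.given.B              # about 0.32: most alarms still come from non-defective billets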
Applying Bayes' theorem to the inference of μ, we have

\[
f(\boldsymbol{\mu} \mid D) = \frac{f(D \mid \boldsymbol{\mu})\, g(\boldsymbol{\mu})}{f(D)}, \tag{3.25}
\]
where g(μ) is the prior distribution of μ, the distribution before observing the data, and f(μ|D) is called the posterior distribution, the distribution after we have observed D. The function f(D|μ) on the right-hand side of (3.25) is the density function of the observed data set D. Viewed as a function of the unknown parameter μ, f(D|μ) is exactly the likelihood function of μ. Therefore, Bayes' theorem can be stated in words as

\[
\text{posterior} \propto \text{likelihood} \times \text{prior},
\]
where ∝ stands for "is proportional to". Note that the denominator f(D) on the right-hand side of (3.25) is a constant that does not depend on the parameter μ. It plays a normalization role, ensuring that the left-hand side is a valid probability density function that integrates to one. Taking the integral of the right-hand side of (3.25) over μ and setting it equal to one, we see that

\[
f(D) = \int f(D \mid \boldsymbol{\mu})\, g(\boldsymbol{\mu})\, d\boldsymbol{\mu}.
\]
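To make the posterior update concrete, the following R sketch implements the standard conjugate result for the simplest case: a univariate normal likelihood with known variance and a normal prior on the mean, for which the posterior is again normal with precision equal to the sum of the prior and data precisions. The names and numbers here are illustrative assumptions, not the book's example.

sigma2 <- 4                       # known variance of each observation (assumed)
mu.prior <- 0; tau2 <- 10         # prior: mu ~ N(mu.prior, tau2) (assumed)
set.seed(2)
x <- rnorm(20, mean = 1.5, sd = sqrt(sigma2))   # observed data set D (simulated)
n <- length(x)
post.prec <- 1 / tau2 + n / sigma2              # posterior precision = prior + data precision
post.var <- 1 / post.prec                       # posterior variance
post.mean <- post.var * (mu.prior / tau2 + n * mean(x) / sigma2)   # precision-weighted mean
c(post.mean, post.var)            # posterior concentrates near the sample mean as n grows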