Nonlinear Filters. Simon Haykin


In general, there may not exist simple analytic forms for the corresponding PDFs. Without an analytic form, the PDF of even a single variable is equivalent to an infinite‐dimensional vector that must be stored to perform the required computations. In such cases, obtaining the Bayesian solution is computationally intractable. In other words, except for special cases, the Bayesian solution is a conceptual solution that generally cannot be determined analytically. In many practical situations, we have to resort to some sort of approximation and therefore settle for a suboptimal Bayesian solution [46]. Different approximation methods lead to different filtering algorithms.

      The relevant portion of the data obtained by measurement can be interpreted as information. In this line of thinking, a summary of the amount of information with regard to the variables of interest is provided by the Fisher information matrix [51]. To be more specific, Fisher information plays two basic roles:

      1 It is a measure of the ability to estimate a quantity of interest.

      2 It is a measure of the state of disorder in a system or phenomenon of interest.

      The first role implies that the Fisher information matrix has a close connection to the estimation‐error covariance matrix and can be used to calculate the confidence region of estimates. The second role implies that the Fisher information has a close connection to Shannon's entropy.

      Let us consider the PDF $p_{\boldsymbol{\theta}}(\mathbf{x})$, which is parameterized by the set of parameters $\boldsymbol{\theta}$. The Fisher information matrix is defined as:

      (4.17) $\mathbf{F}_{\boldsymbol{\theta}} = \mathbb{E}_{\mathbf{x}}\left[\left(\nabla_{\boldsymbol{\theta}} \log p_{\boldsymbol{\theta}}(\mathbf{x})\right)\left(\nabla_{\boldsymbol{\theta}} \log p_{\boldsymbol{\theta}}(\mathbf{x})\right)^{T}\right].$

      This definition is based on the outer product of the gradient of $\log p_{\boldsymbol{\theta}}(\mathbf{x})$ with itself, where the gradient is a column vector denoted by $\nabla_{\boldsymbol{\theta}}$. There is an equivalent definition based on the second derivative of $\log p_{\boldsymbol{\theta}}(\mathbf{x})$:

      (4.18) $\mathbf{F}_{\boldsymbol{\theta}} = -\mathbb{E}_{\mathbf{x}}\left[\frac{\partial^{2} \log p_{\boldsymbol{\theta}}(\mathbf{x})}{\partial \boldsymbol{\theta}^{2}}\right].$
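The equivalence of the two definitions can be checked numerically. The sketch below assumes, purely for illustration, a scalar Gaussian $p_\theta(x) = \mathcal{N}(\theta, \sigma^2)$ with known $\sigma$, so that the parameter is the mean alone; the particular values of $\theta$, $\sigma$, and the sample size are arbitrary choices. For this model the score is $(x-\theta)/\sigma^2$, and both (4.17) and (4.18) yield $F_\theta = 1/\sigma^2$.

```python
import numpy as np

# Illustrative sketch: scalar Gaussian N(theta, sigma^2) with known sigma,
# so the only parameter is the mean theta. All constants are arbitrary.
rng = np.random.default_rng(0)
theta, sigma, n = 1.5, 2.0, 200_000
x = rng.normal(theta, sigma, size=n)

# Score function: d/dtheta log p_theta(x) = (x - theta) / sigma^2
score = (x - theta) / sigma**2

# Definition (4.17): expected outer product of the score (a scalar here)
F_outer = np.mean(score**2)

# Definition (4.18): minus the expected second derivative of log p_theta(x),
# which is the constant -1/sigma^2 for this model, so F = 1/sigma^2
F_curvature = 1.0 / sigma**2

print(F_outer, F_curvature)  # both close to 1/sigma^2 = 0.25
```

The Monte Carlo estimate of (4.17) converges to the closed-form value of (4.18) as the sample size grows.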

      From the definition of $\mathbf{F}_{\boldsymbol{\theta}}$, it is obvious that Fisher information is a function of the corresponding PDF. A relatively broad and flat PDF, which is associated with lack of predictability and high entropy, has small gradient content and therefore low Fisher information. On the other hand, if the PDF is relatively narrow with sharp slopes around a specific value of $\mathbf{x}$, which is associated with bias toward that particular value of $\mathbf{x}$ and low entropy, it has large gradient content and therefore high Fisher information. In summary, there is a duality between Shannon's entropy and Fisher information. However, a closer look at their mathematical definitions reveals an important difference [27]:
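This duality can be made concrete with the scalar Gaussian $\mathcal{N}(\mu, \sigma^2)$, for which closed forms exist: the differential entropy is $\frac{1}{2}\log(2\pi e \sigma^2)$ and the Fisher information about the mean is $1/\sigma^2$. The sketch below (the specific $\sigma$ values are arbitrary choices) shows entropy rising and Fisher information falling as the PDF broadens.

```python
import numpy as np

# Illustrative sketch of the entropy/Fisher duality for N(mu, sigma^2):
# differential entropy = 0.5 * log(2*pi*e*sigma^2), Fisher info = 1/sigma^2.
sigmas = np.array([0.5, 1.0, 2.0, 4.0])           # arbitrary spread values
entropy = 0.5 * np.log(2 * np.pi * np.e * sigmas**2)  # grows with sigma
fisher = 1.0 / sigmas**2                              # shrinks with sigma

for s, h, f in zip(sigmas, entropy, fisher):
    print(f"sigma={s:4.1f}  entropy={h:6.3f}  Fisher={f:6.3f}")
# Broad, flat PDFs (large sigma): high entropy, low Fisher information.
# Narrow, sharp PDFs (small sigma): low entropy, high Fisher information.
```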

       A rearrangement of the tuples may change the shape of the PDF curve significantly, but it does not affect the value of the summation in (2.95) or the integration in (2.96), because the summation and integration can be carried out in any order. Since the entropy is not affected by such local changes in the PDF curve, it can be considered a global measure of the behavior of the corresponding PDF.

       On the other hand, such a rearrangement of points changes the slope, and therefore the gradient, of the PDF curve, which, in turn, changes the Fisher information significantly. Hence, the Fisher information is sensitive to local rearrangements of points and can be considered a local measure of the behavior of the corresponding PDF.
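The contrast drawn in the two points above can be illustrated numerically. In the sketch below (an illustrative construction, not from the text), a standard Gaussian PDF is sampled on a grid and its values are randomly rearranged: an entropy-style sum is invariant under the rearrangement, while a Fisher-style functional of the slope, $\int (p')^2/p \, dx$, changes by orders of magnitude.

```python
import numpy as np

# Illustrative sketch: discretize a standard Gaussian PDF on a grid, then
# randomly rearrange its values. The order-independent entropy sum is
# unchanged; the slope-sensitive Fisher-style functional is not.
rng = np.random.default_rng(1)
x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]
p = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)   # standard Gaussian on a grid
p_shuffled = rng.permutation(p)                # same values, rearranged

def entropy(q):
    # Discrete approximation of -integral q*log(q) dx (order-independent)
    return -np.sum(q * np.log(q + 1e-300)) * dx

def fisher_functional(q):
    # Discrete approximation of integral (q')^2 / q dx (slope-sensitive);
    # for the standard Gaussian this equals 1 (the location Fisher info)
    dq = np.gradient(q, dx)
    return np.sum(dq**2 / (q + 1e-300)) * dx

print(entropy(p), entropy(p_shuffled))                       # essentially equal
print(fisher_functional(p), fisher_functional(p_shuffled))   # wildly different
```

The rearranged curve keeps the same entropy but acquires huge local gradients, so the Fisher-style measure explodes; this is exactly the global-versus-local distinction above.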

      Both entropy (as a global measure of smoothness of the PDF) and Fisher information (as a local measure of smoothness of the PDF) can be used in a variational principle to make inferences about the PDF that describes the phenomenon under consideration. However, the local measure may be preferred in general [27]. This leads to another performance metric, which is discussed in Section 4.5.

      To assess the performance of an estimator, a lower bound is always desirable. Such a bound is a measure of performance limitation that determines whether or not the design criterion is realistic and implementable. The Cramér–Rao lower bound (CRLB) represents the lowest possible mean‐square error achievable by any unbiased estimator of a deterministic parameter, and it is computed as the inverse of the Fisher information matrix. For random variables, a similar version of the CRLB, namely the posterior Cramér–Rao lower bound (PCRLB), was derived in [52] as:

      (4.19) $\mathbf{P}_{k|k} = \mathbb{E}\left[\left(\mathbf{x}_{k} - \hat{\mathbf{x}}_{k}\right)\left(\mathbf{x}_{k} - \hat{\mathbf{x}}_{k}\right)^{T}\right] \geq \mathbf{F}_{k}^{-1},$

      where $\mathbf{F}_{k}^{-1}$ denotes the inverse of the Fisher information matrix at time instant $k$. This bound is also referred to as the Bayesian CRLB [53, 54]. To compute it in an online manner, an iterative version of the PCRLB for nonlinear filtering with state‐space models was proposed in [55], where the posterior information matrix of the hidden state vector is decomposed for each discrete‐time instant by virtue of the factorization of the joint PDF of the state variables. In this way, an iterative structure is obtained for the evolution of the information matrix. For a nonlinear system with the following state‐space model with zero‐mean additive Gaussian noise:

      (4.20) $\mathbf{x}_{k+1} = \mathbf{f}_{k}(\mathbf{x}_{k}) + \mathbf{v}_{k},$

      (4.21)
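As a numerical illustration of the CRLB discussed above (the model and all constants are illustrative choices, not from the text): for $n$ i.i.d. samples from $\mathcal{N}(\theta, \sigma^2)$ with known $\sigma$, the total Fisher information about $\theta$ is $n/\sigma^2$, so the bound on the variance of any unbiased estimator is $\sigma^2/n$, and the sample mean attains it.

```python
import numpy as np

# Illustrative sketch: Monte Carlo check that the sample mean attains the
# CRLB sigma^2/n for estimating the mean of N(theta, sigma^2).
rng = np.random.default_rng(2)
theta, sigma, n, trials = 3.0, 1.5, 50, 20_000   # arbitrary constants

samples = rng.normal(theta, sigma, size=(trials, n))
theta_hat = samples.mean(axis=1)             # unbiased estimator of theta
mse = np.mean((theta_hat - theta)**2)        # Monte Carlo mean-square error

crlb = sigma**2 / n                          # inverse of total Fisher info
print(mse, crlb)  # mse matches the bound up to Monte Carlo noise
```

An estimator whose mean-square error meets the bound in this way is called efficient; the PCRLB in (4.19) plays the same benchmarking role for random state vectors in filtering.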
