EEG Signal Processing and Machine Learning. Saeid Sanei
(4.97)
Assuming Δw is very small, it follows that:
where ∇w(·) represents the gradient with respect to w. This means that the above equation (Eq. 4.98) is satisfied by setting Δw = −μ∇w(·), where μ is the learning rate or convergence parameter. Hence, the general update equation takes the form:
(4.99)
Using the least mean square (LMS) approach, ∇w(η(w)) is replaced by the instantaneous gradient of the squared error signal, i.e.:
(4.100)
Therefore, the LMS‐based update equation is
(4.101)
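The LMS recursion can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the update is taken in the common form w(n+1) = w(n) + 2μ e(n) x(n), where the factor 2 comes from differentiating the squared error, and the function name `lms_filter`, the synthetic system `h_true`, and the tap ordering are hypothetical choices that do not come from the text.

```python
import numpy as np

def lms_filter(x, d, order, mu):
    """Adapt an FIR filter with `order` taps so that its output tracks d."""
    w = np.zeros(order)
    e = np.zeros(len(x))
    for n in range(order - 1, len(x)):
        # Tap-input vector [x(n), x(n-1), ..., x(n-order+1)]
        x_vec = x[n - order + 1:n + 1][::-1]
        e[n] = d[n] - w @ x_vec            # instantaneous error
        w = w + 2.0 * mu * e[n] * x_vec    # assumed LMS update form
    return w, e

# Synthetic system identification: recover a known 3-tap filter (assumption)
rng = np.random.default_rng(0)
h_true = np.array([0.5, -0.3, 0.2])
x = rng.standard_normal(5000)
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))

w_hat, err = lms_filter(x, d, order=3, mu=0.01)
```

With a unit-variance white input the autocorrelation matrix is close to the identity, so the step size μ = 0.01 is small relative to the largest eigenvalue and the weights should settle near `h_true`. A correlated input would spread the eigenvalues and slow convergence.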
Also, the convergence parameter, μ, must be positive and should satisfy the following:
(4.102)
where λmax represents the maximum eigenvalue of the autocorrelation matrix R. The LMS algorithm is among the simplest and most computationally efficient adaptive algorithms. However, its convergence can be slow, especially for correlated signals.
The recursive least‐squares (RLS) algorithm attempts to provide fast, stable convergence, but it can be numerically unstable in real‐time applications [40, 41]. Define the performance index as:
Then, taking the derivative with respect to w, we obtain:
where 0 < γ ≤ 1 is the forgetting factor [40, 41]. Substituting for e(n) in the above equation (Eq. 4.104) and writing it in vector form gives:
(4.105)
where
(4.106)
and
(4.107)
From this equation:
(4.108)
The RLS algorithm performs the above operation recursively such that P and R are estimated at the current time n as:
(4.109)
(4.110)
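The recursive estimation of P and R can be illustrated with a direct, non-optimised sketch. It assumes the standard exponentially weighted recursions R(n) = γR(n−1) + x(n)xᵀ(n) and P(n) = γP(n−1) + d(n)x(n), and it solves R(n)w = P(n) at every step. The function name `ewls_direct`, the small initial regulariser `delta`, and the test signal are assumptions for illustration only.

```python
import numpy as np

def ewls_direct(x, d, order, gamma=0.99, delta=1e-3):
    """Exponentially weighted least squares via explicit R and P recursions."""
    R = delta * np.eye(order)   # small regulariser so R is invertible (assumption)
    P = np.zeros(order)
    w = np.zeros(order)
    for n in range(order - 1, len(x)):
        x_vec = x[n - order + 1:n + 1][::-1]     # [x(n), ..., x(n-order+1)]
        R = gamma * R + np.outer(x_vec, x_vec)   # autocorrelation estimate
        P = gamma * P + d[n] * x_vec             # cross-correlation estimate
        w = np.linalg.solve(R, P)                # solve R w = P at time n
    return w

rng = np.random.default_rng(0)
h_ref = np.array([0.5, -0.3, 0.2])               # hypothetical unknown system
x_in = rng.standard_normal(2000)
d_ref = np.convolve(x_in, h_ref)[:len(x_in)] + 0.01 * rng.standard_normal(len(x_in))

w_rls = ewls_direct(x_in, d_ref, order=3)
```

Solving a linear system at every sample costs on the order of M³ operations for M taps, which is the motivation for updating the inverse of R directly instead.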
In this case
(4.111)
where M represents the finite impulse response (FIR) filter order. Conversely:
(4.112)
which can be simplified using the matrix inversion lemma [42]:
(4.113)
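As a numerical sanity check, the rank-one form of the matrix inversion lemma can be compared against a direct inverse. The sketch assumes the recursion R(n) = γR(n−1) + x(n)xᵀ(n), for which the lemma gives R⁻¹(n) = (1/γ)[R⁻¹(n−1) − R⁻¹(n−1)x xᵀR⁻¹(n−1) / (γ + xᵀR⁻¹(n−1)x)]. The matrix size, the value of γ, and the random test data are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 4
gamma = 0.98

# Build a symmetric positive-definite matrix to stand in for R(n-1)
A = rng.standard_normal((M, M))
R_prev = A @ A.T + M * np.eye(M)
x = rng.standard_normal(M)

# Direct route: form R(n) and invert it
R_new = gamma * R_prev + np.outer(x, x)
direct = np.linalg.inv(R_new)

# Lemma route: update the previous inverse with a rank-one correction
R_inv_prev = np.linalg.inv(R_prev)
Rx = R_inv_prev @ x
lemma = (R_inv_prev - np.outer(Rx, Rx) / (gamma + x @ Rx)) / gamma

max_abs_diff = np.max(np.abs(direct - lemma))
```

The lemma route needs only matrix-vector products and one scalar division per update, which is why it avoids a full inversion at every time step.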
Finally, the update equation can be written as:
(4.114)
where
and the error e(n) after each iteration is recalculated as:
(4.116)
The second term in the right‐hand side of the above equation is
4.9 Principal Component Analysis
All suboptimal transforms, such as the DFT and DCT, decompose the signals into a set of coefficients that do not necessarily represent the constituent components of the signals. Moreover, because the transform kernel is independent of the data, these transforms are not efficient in terms of either decorrelation of the samples or energy compaction. Therefore, separation of the signal and noise components is generally not achievable using these suboptimal transforms.
Expansion of the data into a set of orthogonal components certainly achieves maximum decorrelation of the signals. This enables separation of the data into the signal and noise subspaces.
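The decorrelation property can be illustrated with a short NumPy sketch that projects multichannel data onto the eigenvectors of its sample covariance matrix. The three-channel synthetic mixture, the mixing matrix, and the noise level are hypothetical and chosen only for demonstration. They are not EEG data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples = 2000

# Two strong hypothetical sources mixed into three channels, plus weak noise
sources = rng.standard_normal((2, n_samples)) * np.array([[3.0], [0.5]])
mixing = np.array([[1.0, 0.4],
                   [0.6, 1.0],
                   [0.3, 0.8]])
X = mixing @ sources + 0.05 * rng.standard_normal((3, n_samples))

# Centre the data and estimate the channel covariance matrix
Xc = X - X.mean(axis=1, keepdims=True)
C = Xc @ Xc.T / (n_samples - 1)

# Eigendecomposition of the symmetric covariance, sorted by decreasing variance
eigvals, eigvecs = np.linalg.eigh(C)
idx = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]

# Principal components: projections onto the orthogonal eigenvectors
Y = eigvecs.T @ Xc
C_y = Y @ Y.T / (n_samples - 1)   # diagonal up to rounding error
```

In this construction the covariance of the projected data is diagonal, so the components are mutually uncorrelated. The smallest eigenvalue is close to the noise variance, so the corresponding direction can be read as the noise subspace, while the two largest span the signal subspace.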
Figure 4.12 The general application of PCA.
For a single‐channel EEG, the Karhunen–Loève transform is used to decompose the ith channel signal into a set of weighted orthogonal basis functions:
(4.117)