EEG Signal Processing and Machine Learning. Saeid Sanei
where Φ = {ϕ_k} is the set of orthogonal basis functions. The weights w_{i,k} are then calculated as:
$$\mathbf{w}_i = \boldsymbol{\Phi}^{-1}\mathbf{x}_i \qquad (4.118)$$
Often noise is added to the signal, i.e. x_i(n) = s_i(n) + v_i(n), where v_i(n) is additive noise. This degrades the decorrelation process. The weights are then estimated by minimizing a function of the error between the signal and its expansion in the orthogonal basis, i.e. e_i = x_i − Φw_i. This minimization is generally carried out by solving a least‐squares problem. In a typical application of PCA, as depicted in Figure 4.12, the signal and noise subspaces are separated by means of some classification procedure.
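The expansion and the least‐squares weight estimate can be illustrated with a minimal NumPy sketch. The basis, the signal length N, the number of retained basis vectors K, and the noise level are all hypothetical choices made for illustration, and the basis is assumed orthonormal so that its inverse equals its transpose.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8  # signal length (hypothetical)

# Build an orthonormal basis Phi (columns phi_k) from the QR factorization
# of a random matrix. This is an illustrative basis, not one from the text.
Phi, _ = np.linalg.qr(rng.standard_normal((N, N)))

w_true = rng.standard_normal(N)
s = Phi @ w_true                        # clean signal s_i
x = s + 0.05 * rng.standard_normal(N)   # noisy observation x_i = s_i + v_i

# With a complete orthonormal basis the weights are a plain projection,
# because Phi^{-1} = Phi^T in that case.
w_proj = Phi.T @ x

# Least-squares weights when only the first K basis vectors are used.
K = 4
Phi_K = Phi[:, :K]
w_ls, *_ = np.linalg.lstsq(Phi_K, x, rcond=None)

# Residual e_i = x_i - Phi_K w_i, orthogonal to the retained basis vectors.
e = x - Phi_K @ w_ls
```

Because the columns of `Phi_K` are orthonormal, the least‐squares weights coincide with the projections `Phi_K.T @ x`, which is why an orthogonal basis makes the estimation step inexpensive.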
4.9.1 Singular Value Decomposition
Singular value decomposition (SVD) is often used for solving the least‐squares (LS) problem. This is performed by decomposing the M × M square autocorrelation matrix R into its eigenvalue matrix Λ = diag(λ_1, λ_2, …, λ_M) and an M × M orthogonal matrix of eigenvectors V, i.e. R = VΛV^H, where (·)^H denotes the Hermitian (conjugate transpose) operation. Moreover, if A is an M × M data matrix such that R = A^H A, then there exist an M × M orthogonal matrix U, an M × M orthogonal matrix V, and an M × M diagonal matrix Σ with diagonal elements equal to λ_i^{1/2}, such that
$$\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{H} \qquad (4.119)$$
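Hence Σ² = Λ. The columns of U are called left singular vectors and the rows of V^H are called right singular vectors. If A is a rectangular N × M matrix of rank k, then U will be N × N and Σ will be the N × M matrix:
$$\boldsymbol{\Sigma} = \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \qquad (4.120)$$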
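where S = diag(σ_1, σ_2, …, σ_k) and σ_i = λ_i^{1/2}. For such a matrix, the Moore-Penrose pseudo‐inverse is an M × N matrix A† defined as:
$$\mathbf{A}^{\dagger} = \mathbf{V}\boldsymbol{\Sigma}^{\dagger}\mathbf{U}^{H} \qquad (4.121)$$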
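where Σ† is an M × N matrix defined as:
$$\boldsymbol{\Sigma}^{\dagger} = \begin{bmatrix} \mathbf{S}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \qquad (4.122)$$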
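A† plays a major role in the solution of least‐squares problems, and S^{-1} is a k × k diagonal matrix whose elements are the reciprocals of the nonzero singular values of A, i.e.
$$\mathbf{S}^{-1} = \mathrm{diag}\left(\frac{1}{\sigma_1}, \frac{1}{\sigma_2}, \ldots, \frac{1}{\sigma_k}\right) \qquad (4.123)$$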
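The relations in this subsection can be checked numerically. The following is a minimal NumPy sketch, assuming a hypothetical rank‐deficient N × M data matrix and a relative tolerance of 1e-10 for deciding which singular values count as nonzero. It verifies that the squared singular values equal the eigenvalues of A^H A and assembles the pseudo‐inverse from the SVD factors as V Σ† U^H.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, k = 6, 4, 2
# Hypothetical rank-k data matrix A (N x M), built as a product of thin factors.
A = rng.standard_normal((N, k)) @ rng.standard_normal((k, M))

# Full SVD: U is N x N, Vh = V^H is M x M, s holds min(N, M) singular values
# in descending order.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Eigenvalues of R = A^H A, sorted in descending order to line up with s**2.
R = A.conj().T @ A
lam = np.linalg.eigvalsh(R)[::-1]

# Numerical rank: singular values above a relative tolerance (an assumption).
rank = int(np.sum(s > 1e-10 * s[0]))

# Sigma^dagger is M x N with S^{-1} in its top-left rank x rank block.
Sigma_pinv = np.zeros((M, N))
Sigma_pinv[:rank, :rank] = np.diag(1.0 / s[:rank])

# Pseudo-inverse assembled from the SVD factors: A^dagger = V Sigma^dagger U^H.
A_pinv = Vh.conj().T @ Sigma_pinv @ U.conj().T
```

Only the reciprocals of the singular values judged nonzero are used. Inverting the numerically zero ones would amplify rounding noise, which is the practical reason the rank k appears in the definition.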
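To see how the SVD is applied in solving the LS problem, consider the error vector e defined as:
$$\mathbf{e} = \mathbf{d} - \mathbf{A}\mathbf{h} \qquad (4.124)$$
where d is the desired signal vector and Ah is its estimate. Replacing A by its SVD gives
$$\mathbf{e} = \mathbf{d} - \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{H}\mathbf{h} \qquad (4.125)$$
or equivalently
$$\mathbf{U}^{H}\mathbf{e} = \mathbf{U}^{H}\mathbf{d} - \boldsymbol{\Sigma}\mathbf{V}^{H}\mathbf{h} \qquad (4.126)$$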
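Since U is a unitary matrix, ‖e‖² = ‖U^H e‖². Hence, the vector h that minimizes ‖e‖² also minimizes ‖U^H e‖². Finally, with u_i and v_i denoting the ith columns of U and V, the unique solution for the optimum coefficient vector h may be expressed as [43]:
$$\mathbf{h} = \sum_{i=1}^{k} \frac{\mathbf{u}_i^{H}\mathbf{d}}{\sigma_i}\,\mathbf{v}_i \qquad (4.127)$$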
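where k is the rank of A. Alternatively, the optimum least‐squares coefficient vector may be written in terms of the pseudo‐inverse as:
$$\mathbf{h} = \mathbf{A}^{\dagger}\mathbf{d} \qquad (4.128)$$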
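As a rough check of the SVD route to the LS solution, the sketch below solves a small hypothetical problem three ways: by summing over the singular triplets, by applying the pseudo‐inverse, and with NumPy's own least‐squares solver. The matrix sizes, the coefficient vector `h_true`, and the noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 20, 3
A = rng.standard_normal((N, M))           # hypothetical data matrix
h_true = np.array([1.0, -2.0, 0.5])       # hypothetical coefficients
d = A @ h_true + 0.01 * rng.standard_normal(N)   # noisy desired vector

# Thin SVD is enough here: U is N x M, Vh = V^H is M x M.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
k = int(np.sum(s > 1e-10 * s[0]))         # numerical rank (assumed tolerance)

# Sum over singular triplets: h = sum_i (u_i^H d / sigma_i) v_i.
# Row i of Vh is v_i^H, so v_i is its conjugate.
h_svd = sum((U[:, i].conj() @ d) / s[i] * Vh[i].conj() for i in range(k))

# Same solution through the pseudo-inverse and through lstsq.
h_pinv = np.linalg.pinv(A) @ d
h_lstsq, *_ = np.linalg.lstsq(A, d, rcond=None)
```

All three estimates agree to numerical precision for this well‐conditioned example. For large problems `np.linalg.lstsq` is usually preferable to forming the pseudo‐inverse explicitly.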
Performing PCA is equivalent to performing an SVD on the covariance matrix. PCA uses the same concept as SVD and orthogonalization to decompose the data into its constituent uncorrelated orthogonal components such that the autocorrelation matrix is diagonalized. Each eigenvector represents a principal component, and the individual eigenvalues are numerically related to the variance they capture in the direction of the principal components. When only K of the N components are retained, the mean squared error (MSE) is simply the sum of the N − K discarded eigenvalues, i.e. with the eigenvalues sorted in descending order:
$$\mathrm{MSE} = \sum_{k=K+1}^{N} \lambda_k \qquad (4.129)$$
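The link between PCA, the SVD, and the truncation error can be verified on synthetic data. This is a minimal sketch, assuming zero‐mean data arranged as T observations of N channels, a sample covariance normalized by T, and K retained components. None of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(4)
T, N, K = 2000, 5, 2                      # samples, channels, kept components
mixing = rng.standard_normal((N, N))      # hypothetical mixing matrix
X = rng.standard_normal((T, N)) @ mixing.T
X -= X.mean(axis=0)                       # remove the mean of each channel

# Sample covariance (normalized by T) and its eigendecomposition.
C = (X.T @ X) / T
lam, V = np.linalg.eigh(C)                # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]          # reorder to descending

# The same eigenvalues follow from the singular values of the data matrix.
sing = np.linalg.svd(X, compute_uv=False)

# Project onto the K leading principal components and reconstruct.
V_K = V[:, :K]
X_hat = X @ V_K @ V_K.T

# Mean squared reconstruction error per observation.
mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
```

With this normalization, `mse` equals `lam[K:].sum()`, the sum of the discarded eigenvalues, and `sing**2 / T` reproduces `lam`.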
PCA is widely used in data decomposition, classification, filtering, and whitening. In filtering applications, the signal and noise subspaces are separated and the data are reconstructed from only the eigenvalues and eigenvectors of the actual signals. PCA is also used for BSS of correlated mixtures if the original sources can be considered statistically uncorrelated.
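Of the applications listed, whitening is the simplest to sketch. The example below assumes zero‐mean synthetic data with a full‐rank covariance matrix. The variable names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
T, N = 5000, 3
# Hypothetical correlated data: T observations of N channels.
X = rng.standard_normal((T, N)) @ rng.standard_normal((N, N)).T
X -= X.mean(axis=0)

# Eigendecomposition of the sample covariance matrix.
C = (X.T @ X) / T
lam, V = np.linalg.eigh(C)

# Whitening matrix W = V diag(lam^{-1/2}); broadcasting scales column k
# of V by 1 / sqrt(lam[k]).
W = V / np.sqrt(lam)

# Whitened data: uncorrelated channels with unit variance.
Z = X @ W
C_z = (Z.T @ Z) / T
```

After the transform, the sample covariance `C_z` is the identity matrix to within rounding error. In practice, eigenvalues close to zero are usually discarded before the scaling step so that noise is not amplified.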
Figure 4.13 Adaptive estimation of the weight vector w(n).
The PCA problem is then summarized as how to find the weights w in order to minimize the error given the observations only. The LMS algorithm is used here to iteratively minimize the MSE as:
(4.130)
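A generic LMS weight adaptation can be sketched as follows. This is not necessarily the exact formulation used in the text: the regressor `phi`, the step size `mu`, the noise level, and the reference weights `w_true` are all assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
M, n_iter, mu = 4, 5000, 0.01             # weights, iterations, step size
w_true = np.array([0.5, -1.0, 0.25, 2.0]) # hypothetical weights to recover

w = np.zeros(M)                           # initial weight vector w(0)
for n in range(n_iter):
    phi = rng.standard_normal(M)          # regressor (basis sample) at time n
    x = phi @ w_true + 0.01 * rng.standard_normal()  # noisy observation x(n)
    e = x - phi @ w                       # instantaneous error e(n)
    w = w + mu * e * phi                  # LMS update of w(n)
```

The update moves the weights along the negative instantaneous gradient of the squared error, so no covariance matrix has to be estimated or inverted. The step size trades convergence speed against steady‐state fluctuation of the weights.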
The