Matrix and Tensor Decompositions in Signal Processing. Gérard Favier


[Table: cost functions of the CPD and TD models for the imputation problem and for the imputation problem with low-rank constraint, together with the CMTucker and CSTF models]

       – for the imputation problem with the low-rank constraint, the term ||χ||∗ in the cost function replaces the low-rank constraint with the nuclear norm of χ, since the function rank(χ) is not convex and the nuclear norm is the closest convex approximation of the rank. In Liu et al. (2013), this term is replaced by a weighted sum of the nuclear norms ||Xn||∗, where Xn represents the mode-n unfolding of χ (see footnote 7);

       – in the case of the CMTucker model, the coupling considered here relates to the first modes of the tensor χ and the matrix Y of data via the common matrix factor A.

      Coupled matrix and tensor factorization (CMTF) models were introduced in Acar et al. (2011b) by coupling a CPD model with a matrix factorization and using the gradient descent algorithm to estimate the parameters. This type of model was used by Acar et al. (2017) to merge EEG and fMRI data with the goal of analyzing brain activity. The EEG signals are modeled with a normalized CPD model (see Chapter 5), whereas the fMRI data are modeled with a matrix factorization. The data are coupled through the subjects mode (see Table I.1). The cost function to be minimized is therefore given by:

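      $$\min_{g,\,\sigma,\,A,\,B,\,C,\,V} \; \|\chi - [\![ g; A, B, C ]\!]\|_F^2 + \|Y - A \Sigma V^T\|_F^2 + \alpha \|g\|_1 + \alpha \|\sigma\|_1 \qquad \text{[I.3]}$$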

      where the column vectors of the matrix factors (A, B, C) have unit norm, Σ is a diagonal matrix whose diagonal elements are the coefficients of the vector σ, and α > 0 is a penalty parameter that weights the sparseness constraints, expressed by means of the l1 norm, on the weight vectors (g, σ). The advantage of merging EEG and fMRI data with the criterion [I.3] is that the two acquisition methods are complementary in terms of resolution: EEG signals have a high temporal resolution but a low spatial resolution, while fMRI provides a high spatial resolution;

       – in the case of the CSTF model (Li et al. 2018), the tensor of high-resolution hyperspectral images (HR-HSI) is represented using a third-order Tucker model that has a sparse core with the following modes: space (width) × space (height) × spectral bands. The matrices W, H and S denote the dictionaries for the width, height and spectral modes, composed of nw, nh and ns atoms, respectively, and the core tensor contains the coefficients relative to the three dictionaries. The matrices W∗, H∗ and S∗ are spatially and spectrally subsampled versions of these dictionaries with respect to each mode. The term λ is a regularization parameter for the sparseness constraint on the core tensor, expressed in terms of the l1 norm of the core tensor.
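      As a hedged illustration of the coupled criterion described above for merging EEG and fMRI data, the following Python/NumPy sketch evaluates a cost of an assumed form: the squared Frobenius error of a weighted CPD of χ, plus the squared Frobenius error of a factorization of Y that shares the factor A, plus α times the l1 norms of g and σ. The function name `cmtf_cost`, the helper `unit_columns`, the name `v` for the second matrix factor and the toy sizes are assumptions made for this example, not notation taken from the text.

```python
import numpy as np

def cmtf_cost(x, y, g, sigma, a, b, c, v, alpha):
    """Assumed coupled cost: CPD fit + matrix fit + l1 penalties on the weights."""
    # Weighted CPD model of the tensor: sum_r g[r] * a[:, r] o b[:, r] o c[:, r]
    x_model = np.einsum("r,ir,jr,kr->ijk", g, a, b, c)
    # Matrix model sharing the factor a, with diagonal weights sigma
    y_model = a @ np.diag(sigma) @ v.T
    fit = np.sum((x - x_model) ** 2) + np.sum((y - y_model) ** 2)
    penalty = alpha * (np.abs(g).sum() + np.abs(sigma).sum())
    return fit + penalty

def unit_columns(m):
    """Scale every column to unit Euclidean norm (hypothetical helper)."""
    return m / np.linalg.norm(m, axis=0, keepdims=True)

rng = np.random.default_rng(0)
a, b, c = (unit_columns(rng.standard_normal((n, 2))) for n in (5, 4, 3))
v = rng.standard_normal((6, 2))
g = np.array([2.0, 1.0])        # weights of the CPD components
sigma = np.array([3.0, 0.0])    # second component is absent from the matrix

# Noise-free synthetic data generated by the model itself
x = np.einsum("r,ir,jr,kr->ijk", g, a, b, c)
y = a @ np.diag(sigma) @ v.T
cost_at_truth = cmtf_cost(x, y, g, sigma, a, b, c, v, alpha=0.5)
```

      With noise-free data, both fit terms vanish at the generating parameters and only the l1 penalty remains. A zero entry in `sigma` marks a component that is present in the tensor but not in the matrix, which is the kind of structure the sparseness penalty is meant to favor.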

      7 See definition [3.41] of the unfolding Xn, and definitions [1.65] and [1.67] of the Frobenius norm (||.||F) and the nuclear norm (||.||∗) of a matrix; for a tensor, see section 3.16.
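      The nuclear-norm surrogate used for imputation with a low-rank constraint can be illustrated with a short Python/NumPy sketch that sums the nuclear norms of the mode-n unfoldings of a tensor. The function names `unfold` and `sum_of_nuclear_norms` and the default equal weights are assumptions of this example. The column ordering of the unfolding follows NumPy's row-major layout, which may differ from definition [3.41]; the nuclear norm is unaffected by a permutation of columns.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: mode `mode` indexes the rows, the other modes the columns."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def sum_of_nuclear_norms(tensor, weights=None):
    """Weighted sum over all modes of the nuclear norm of each unfolding."""
    if weights is None:
        # Assumption: equal weights that sum to one
        weights = np.full(tensor.ndim, 1.0 / tensor.ndim)
    return sum(
        w * np.linalg.norm(unfold(tensor, n), ord="nuc")
        for n, w in enumerate(weights)
    )

# Rank-one tensor built as an outer product of three vectors
u = np.array([1.0, 2.0])
v = np.array([1.0, 0.0, -1.0])
w = np.array([2.0, 1.0, 0.0, 1.0])
x = np.einsum("i,j,k->ijk", u, v, w)

penalty = sum_of_nuclear_norms(x)
frobenius = np.linalg.norm(x)
```

      For a rank-one tensor, every unfolding has a single nonzero singular value equal to the Frobenius norm of the tensor, so with weights summing to one `penalty` and `frobenius` coincide up to rounding.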

      The drawbacks of these optimization methods include slow convergence for gradient-type algorithms and high numerical complexity for the Gauss–Newton and Levenberg–Marquardt algorithms due to the need to compute the Jacobian matrix of the criterion w.r.t. the parameters being estimated, as well as the inverse of a large matrix.

      Alternating optimization methods are therefore often used instead of a global optimization w.r.t. all matrix and tensor factors to be estimated. These iterative methods perform a sequence of separate optimizations of criteria linear in each unknown factor while fixing the other factors with the values estimated at previous iterations. An example is the standard ALS (alternating least squares) algorithm, presented in Chapter 5 for estimating PARAFAC models. For constrained optimization, the alternating direction method of multipliers (ADMM) is often used (Boyd et al. 2011).
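      The alternating principle can be sketched in Python/NumPy for a third-order CPD: each factor is updated in turn by solving a linear least squares problem while the other two are held fixed. This is a minimal sketch under stated assumptions (random initialization, a fixed number of sweeps, no normalization or stopping test), not the algorithm as presented in Chapter 5; the names `khatri_rao`, `unfold`, `cpd_reconstruct` and `cpd_als` are chosen for this example.

```python
import numpy as np

def khatri_rao(u, v):
    """Column-wise Kronecker product of u (I x R) and v (J x R): (I*J) x R."""
    return np.einsum("ir,jr->ijr", u, v).reshape(u.shape[0] * v.shape[0], -1)

def unfold(tensor, mode):
    """Mode-n unfolding with row-major ordering of the remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def cpd_reconstruct(factors):
    """Rebuild the third-order tensor from its three factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)

def cpd_als(x, rank, n_iter=200, seed=0):
    """Alternating least squares for a third-order CPD (fixed number of sweeps)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Fix the two other factors; the model is then linear in factors[mode]:
            # unfold(x, mode) = factors[mode] @ khatri_rao(others).T
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            factors[mode] = unfold(x, mode) @ np.linalg.pinv(kr).T
    return factors

# Synthetic rank-2 tensor built from random factors
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((dim, 2)) for dim in (4, 5, 6)]
x = cpd_reconstruct(true_factors)

estimated = cpd_als(x, rank=2)
relative_error = np.linalg.norm(x - cpd_reconstruct(estimated)) / np.linalg.norm(x)
```

      Because every update solves its least squares subproblem exactly, the fitting error cannot increase from one update to the next. Convergence to the global minimum is not guaranteed, and the number of sweeps needed depends on the data and on the initialization.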

      To complete this introductory chapter, let us outline the key knowledge needed to employ tensor tools, whose presentation constitutes the main objective of this second volume:

       – arrangement (also called reshaping) operations that express the data tensor as a vector (vectorization), a matrix (matricization), or a lower order tensor by combining modes; conversely, the tensorization and Hankelization operations allow us to construct tensors from data contained in large vectors or matrices;

       – tensor operations such as transposition, symmetrization, Hadamard and Kronecker products, inversion and pseudo-inversion;

       – the notions of eigenvalue and singular value of a tensor;

       – tensor decompositions/models, and their uniqueness properties;

       – algorithms used to solve dimensionality reduction problems and, hence, best low-rank approximation, parameter estimation and missing data imputation. This algorithmic aspect linked to tensors will be explored in more depth in Volume 3.
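      The arrangement operations listed first above can be sketched with NumPy reshaping. The ordering of the combined modes follows NumPy's row-major layout, which is an assumption of this example and may differ from the conventions adopted in the book; the helper `hankelize` and the variable names are likewise chosen for illustration.

```python
import numpy as np

x = np.arange(24.0).reshape(2, 3, 4)      # third-order tensor of size 2 x 3 x 4

# Vectorization: all entries stacked in a single vector
vec_x = x.reshape(-1)

# Matricization: one mode indexes the rows, the two others are combined
unfolding_1 = x.reshape(2, 12)                      # first mode in rows
unfolding_2 = np.moveaxis(x, 1, 0).reshape(3, 8)    # second mode in rows

# Combining modes of a fourth-order tensor to obtain a third-order tensor
w = np.arange(120.0).reshape(2, 3, 4, 5)
w_third_order = w.reshape(2, 12, 5)       # second and third modes merged

# Tensorization: rebuild a tensor from a long vector
x_back = vec_x.reshape(2, 3, 4)

def hankelize(signal, n_rows):
    """Hankel matrix with entries h[i, j] = signal[i + j]."""
    signal = np.asarray(signal)
    n_cols = signal.size - n_rows + 1
    index = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    return signal[index]

h = hankelize(np.arange(5), 3)            # 3 x 3 Hankel matrix from 5 samples
```

      Each anti-diagonal of `h` holds a single sample of the input vector, which is the defining property of a Hankel matrix.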

      Tensor operations and decompositions often use matrix tools, so we will begin by reviewing some matrix decompositions in
