Introduction to Graph Neural Networks. Zhiyuan Liu. Synthesis Lectures on Artificial Intelligence and Machine Learning.

or manifolds, while in this book we focus only on problems defined on graphs; we also investigate other mechanisms used in GNNs, such as gating mechanisms, attention mechanisms, and skip connections. Gilmer et al. [2017] propose the message passing neural network (MPNN), which generalizes several GNN and graph convolutional network approaches. Wang et al. [2018b] propose the non-local neural network (NLNN), which unifies several “self-attention”-style methods; in the original paper, however, the model is not explicitly defined on graphs. Focusing on specific application domains, Gilmer et al. [2017] and Wang et al. [2018b] only give examples of how to generalize other models within their frameworks and do not provide a review of other GNN models. Lee et al. [2018b] provide a review of graph attention models. Battaglia et al. [2018] propose the graph network (GN) framework, which has a strong capability to generalize other models; however, the GN model is highly abstract, and Battaglia et al. [2018] give only a rough classification of its applications.

      Zhang et al. [2018f] and Wu et al. [2019c] are two comprehensive survey papers on GNNs, and they mainly focus on GNN models. Wu et al. [2019c] categorize GNNs into four groups: recurrent graph neural networks (RecGNNs), convolutional graph neural networks (ConvGNNs), graph auto-encoders (GAEs), and spatial-temporal graph neural networks (STGNNs). Our book adopts a different taxonomy from Wu et al. [2019c]. We present graph recurrent networks in Chapter 6. Graph convolutional networks are introduced in Chapter 5, and a special variant of ConvGNNs, graph attention networks, is introduced in Chapter 7. We present graph spatial-temporal networks in Section 9.4, as these models are usually used on dynamic graphs. We introduce graph auto-encoders in Section 10.4, as they are trained in an unsupervised fashion.

      In this book, we provide a thorough introduction to different GNN models as well as a systematic taxonomy of the applications. To summarize, the major contents of this book are as follows.

      • We provide a detailed review of existing GNN models. We introduce the original model, its variants, and several general frameworks. We examine various models in this area and provide a unified notation for the propagation steps of different models. Using this notation, one can easily distinguish between different models by recognizing their corresponding aggregators and updaters.

      • We systematically categorize the applications and divide them into structural scenarios, non-structural scenarios, and other scenarios. For each scenario, we present several major applications and their corresponding methods.

      CHAPTER 2

      Basics of Math and Graph

       2.1 LINEAR ALGEBRA

      The language and concepts of linear algebra are in widespread use in many fields of computer science, and machine learning is no exception. A good comprehension of machine learning relies on a thorough understanding of linear algebra. In this section, we briefly review some basic concepts and calculations in linear algebra that are necessary for understanding the rest of the book.

      • Scalar: A number.

      • Vector: A column of ordered numbers, which can be expressed as the following:

$$\mathbf{x} = [x_1, x_2, \ldots, x_n]^\top.$$

      The norm of a vector measures its length. The Lp norm is defined as follows:

$$\|\mathbf{x}\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.$$

      The L1 norm, L2 norm, and L∞ norm are often used in machine learning.

      The L1 norm can be simplified as

$$\|\mathbf{x}\|_1 = \sum_{i=1}^{n} |x_i|.$$

      In Euclidean space ℝn, the L2 norm is used to measure the length of vectors, where

$$\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}.$$

      The L∞ norm is also called the max norm, as

$$\|\mathbf{x}\|_\infty = \max_{i} |x_i|.$$

      With the Lp norm, the distance between two vectors x1 and x2 (where x1 and x2 are in the same linear space) can be defined as

$$D_p(\mathbf{x}_1, \mathbf{x}_2) = \|\mathbf{x}_1 - \mathbf{x}_2\|_p.$$
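As an illustration (not from the book), the norms and the Lp distance above can be sketched in plain Python; the helper names `lp_norm`, `linf_norm`, and `lp_distance` are ours:

```python
def lp_norm(x, p):
    """L_p norm of a vector x: (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def linf_norm(x):
    """L_infinity (max) norm: max_i |x_i|."""
    return max(abs(xi) for xi in x)

def lp_distance(x1, x2, p):
    """L_p distance between two vectors of equal length."""
    return lp_norm([a - b for a, b in zip(x1, x2)], p)

x = [3.0, -4.0]
print(lp_norm(x, 1))   # L1 norm: |3| + |-4| = 7.0
print(lp_norm(x, 2))   # L2 norm: sqrt(9 + 16) = 5.0
print(linf_norm(x))    # max(3, 4) = 4.0
print(lp_distance([1.0, 2.0], [4.0, 6.0], 2))  # sqrt(9 + 16) = 5.0
```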

      A set of vectors x1, x2, …, xm is linearly independent if and only if there does not exist a set of scalars λ1, λ2, …, λm, not all 0, such that

$$\lambda_1 \mathbf{x}_1 + \lambda_2 \mathbf{x}_2 + \cdots + \lambda_m \mathbf{x}_m = \mathbf{0}.$$
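One standard way to test this condition in practice (our illustration, not from the book) is to check whether the rank of the matrix whose rows are the vectors equals the number of vectors; the sketch below uses Gaussian elimination, and the names `rank` and `linearly_independent` are ours:

```python
def rank(rows, eps=1e-9):
    """Rank of a matrix given as a list of equal-length rows, via Gaussian elimination."""
    a = [row[:] for row in rows]
    r = 0
    for col in range(len(a[0]) if a else 0):
        # find a row at or below r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(a)) if abs(a[i][col]) > eps), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        # eliminate this column from the rows below the pivot
        for i in range(r + 1, len(a)):
            f = a[i][col] / a[r][col]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """Vectors are linearly independent iff rank equals the number of vectors."""
    return rank(vectors) == len(vectors)

print(linearly_independent([[1, 0], [0, 1]]))  # True
print(linearly_independent([[1, 2], [2, 4]]))  # False: second vector = 2 * first
```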

      • Matrix: A two-dimensional array, which can be expressed as the following:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$

      where A ∈ ℝm × n.

      Given two matrices A ∈ ℝm × n and B ∈ ℝn × p, their matrix product C = AB ∈ ℝm × p is defined elementwise by

$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}.$$

      It can be proved that matrix multiplication is associative but not necessarily commutative. In mathematical language,

$$(\mathbf{A}\mathbf{B})\mathbf{C} = \mathbf{A}(\mathbf{B}\mathbf{C})$$

      holds for arbitrary matrices A, B, and C (provided the products are well defined). Yet

$$\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}$$

      is not always true.
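These two properties are easy to observe numerically. In the sketch below (ours, not from the book), `matmul` implements the elementwise product definition above; associativity holds exactly for the integer example, while swapping the factors changes the result:

```python
def matmul(A, B):
    """C[i][j] = sum_k A[i][k] * B[k][j]; A is m x n, B is n x p (lists of rows)."""
    n = len(B)
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 2]]

# Associativity always holds: (AB)C == A(BC)
print(matmul(matmul(A, B), C) == matmul(A, matmul(B, C)))  # True

# Commutativity can fail: here AB = [[2, 1], [4, 3]] but BA = [[3, 4], [1, 2]]
print(matmul(A, B) == matmul(B, A))  # False
```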

      For each n × n square matrix A, its determinant (also denoted as |A|) is defined as

$$\det(\mathbf{A}) = \sum_{k_1 k_2 \cdots k_n} (-1)^{\tau(k_1 k_2 \cdots k_n)} a_{1k_1} a_{2k_2} \cdots a_{nk_n},$$

      where k1k2 ⋯ kn is a permutation of 1, 2, …, n and τ(k1k2 ⋯ kn) is the inversion number of the permutation, i.e., the number of inverted pairs (pairs that appear in decreasing order) in it.
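This permutation-based definition can be transcribed directly into Python (an illustrative sketch, not from the book; it enumerates all n! permutations, so it is only practical for small n — the helper names are ours):

```python
from itertools import permutations

def inversion_number(perm):
    """Number of inverted pairs: (i, j) with i < j but perm[i] > perm[j]."""
    return sum(1 for i in range(len(perm))
                 for j in range(i + 1, len(perm)) if perm[i] > perm[j])

def det(A):
    """Determinant by the definition: sum over permutations of
    (-1)^tau(perm) * a[1, k1] * a[2, k2] * ... * a[n, kn]."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        prod = 1
        for i, k in enumerate(perm):
            prod *= A[i][k]
        total += (-1) ** inversion_number(perm) * prod
    return total

print(det([[1, 2], [3, 4]]))                    # 1*4 - 2*3 = -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # product of the diagonal: 24
```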

      If A is a square matrix, which means that m = n, the inverse matrix of A (denoted as A−1), if it exists, satisfies

$$\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}_n,$$

where In is the n × n identity matrix.
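For the 2 × 2 case this defining property is easy to verify by hand with the adjugate formula (our illustration, not from the book; `inverse_2x2` and `matmul` are names we chose):

```python
def inverse_2x2(A):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    """Elementwise matrix product: C[i][j] = sum_k A[i][k] * B[k][j]."""
    n, p = len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[4.0, 7.0], [2.0, 6.0]]
A_inv = inverse_2x2(A)
print(matmul(A, A_inv))  # the identity matrix, up to floating-point error
```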
