Introduction to Graph Neural Networks. Zhiyuan Liu
Чтение книги онлайн.
Читать онлайн книгу Introduction to Graph Neural Networks - Zhiyuan Liu страница 8
The standard deviation is the square root of variance. In some level, covariance expresses the degree to which two variables vary together:
Greater covariance indicates higher relevance between f(x) and g(y).
2.2.2 PROBABILITY DISTRIBUTIONS
Probability distributions describe the probability of a random variable or several random variables on every state. Several distributions that are useful in the area of machine learning are listed as follows.
• Gaussian distribution: it is also known as normal distribution and can be expressed as:
where μ is the mean of variable x and σ2 is the variance.
• Bernoulli distribution: random variable X can either be 0 or 1, with a probability P(X = 1) = p. Then the distribution function is
It is quite obvious that E(X = p and Var(X) = p(1 − p).
• Binomial distribution: repeat the Bernoulli experiment for N times and the times that X equals to 1 is denote by Y, then
is the Binomial distribution satisfying that E(Y = np) and Var(Y) = np(1 − p).
• Laplace distribution: the Laplace distribution is described as
2.3 GRAPH THEORY
Graphs are the basic subjects in the study of GNNs. Therefore, to get a comprehensive understanding of GNN, the fundamental graph theory is required.
2.3.1 BASIC CONCEPTS
A graph is often denoted by G = (V, E), where V is the set of vertices and E is the set of edges. An edge e = u, v has two endpoints u and v, which are said to be joined by e. In this case, u is called a neighbor of v, or in other words, these two vertices are adjacent. Note that an edge can either be directed or undirected. A graph is called a directed graph if all edges are directed or undirected graph if all edges are undirected. The degree of vertice v, denoted by d(v), is the number of edges connected with v.
2.3.2 ALGEBRA REPRESENTATIONS OF GRAPHS
There are a few useful algebra representations for graphs, which are listed as follows.
• Adjacency matrix: for a simple graph G = (V, E) with n-vertices, it can be described by an adjacency matrix A ∈ ℝn × n, where
It is obvious that such matrix is a symmetric matrix when G is an undirected graph.
• Degree matrix: for a graph G = (V, E) with n-vertices, its degree matrix D ∈ ℝn × n is a diagonal matrix, where
• Laplacian matrix: for a simple graph G = (V, E) with n-vertices, if we consider all edges in G to be undirected, then its Laplacian matrix L = ℝn × n can be defined as
Thus, we have the elements:
Note that the graph is considered to be an undirected graph for the adjacency matrix.
• Symmetric normalized Laplacian: the symmetric normalized Laplacian is defined as:
The elements are given by:
• Random walk normalized Laplacian: it is defined as:
The elements can be computed by:
• Incidence matrix: another common used matrix in representing a graph is incidence matrix. For a directed graph G = (V, E) with n-vertices and m-edges, the corresponding incidence matrix is M ∈ ℝn × m, where
For a undirected graph, the corresponding incidence matrix satisfies that
CHAPTER 3
Basics of Neural Networks
Neural networks are one of the most important models in machine learning. The structure of artificial neural networks, which consists of numerous neurons with connections to each other, bears great resemblance to that of biological neural networks. A neural network learns in the following way: initiated with random weights or values, the connections between neurons updates its weights or values by the back propagation algorithm repeatedly till the model performs rather precisely. In the end, the knowledge that a neural network learned is stored in the connections in a digital manner. Most of the researches on neural network try to change the way it learns (with different algorithms or different structures), aiming to improve the generalization ability of the model.
3.1 NEURON
The basic units of neural networks are neurons, which can receive a series of inputs and return the corresponding output. A classic neuron is as shown in