Artificial Intelligence and Quantum Computing for Advanced Wireless Networks. Savo G. Glisic



      (5.27) $$h_i' = \sigma\left(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in N_i} \alpha_{ij}^{k} W^{k} h_j\right)$$

      where $\alpha_{ij}^{k}$ is the normalized attention coefficient computed by the k‐th attention mechanism. The attention architecture in [35] has several properties: (i) the computation over node‐neighbor pairs is parallelizable, which makes the operation efficient; (ii) it can be applied to graph nodes with different degrees by assigning arbitrary weights to neighbors; and (iii) it can be easily applied to inductive learning problems.
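      As a minimal sketch of how normalized attention coefficients weight a node's neighbors, the following example combines K attention heads for a single node. It assumes that the heads are combined by averaging and that the nonlinearity is a logistic sigmoid; the raw scores are hypothetical inputs, since the scoring function of [35] is not reproduced here, and all function names are illustrative.

```python
import math


def softmax(scores):
    """Normalize raw scores into coefficients that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def multi_head_average(neighbor_feats, raw_scores, weights):
    """Combine K attention heads for one node by averaging (assumed rule).

    neighbor_feats: feature vectors h_j of the neighbors j of node i
    raw_scores:     raw_scores[k][j], unnormalized score of head k (hypothetical input)
    weights:        weights[k], the weight matrix W^k of head k
    """
    K = len(weights)
    out_dim = len(weights[0])
    acc = [0.0] * out_dim
    for k in range(K):
        alpha = softmax(raw_scores[k])  # normalized coefficients of head k
        for a, h_j in zip(alpha, neighbor_feats):
            Wh = matvec(weights[k], h_j)
            acc = [s + a * v for s, v in zip(acc, Wh)]
    # Average over the heads, then apply a logistic sigmoid (assumed nonlinearity).
    return [1.0 / (1.0 + math.exp(-s / K)) for s in acc]


identity = [[1.0, 0.0], [0.0, 1.0]]
h_new = multi_head_average(
    neighbor_feats=[[1.0, 0.0], [0.0, 1.0]],
    raw_scores=[[0.0, 0.0], [0.0, 0.0]],  # equal scores give equal coefficients
    weights=[identity, identity],
)
```

      Because the coefficients are normalized per node, the same routine works for any number of neighbors, which is property (ii) above.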

      Apart from different variants of GNNs, several general frameworks have been proposed that aim to integrate different models into a single framework.

      Message passing neural networks (MPNNs) [36]: This framework abstracts the commonalities between several of the most popular models for graph‐structured data, such as spectral approaches and non‐spectral approaches in graph convolution, gated GNNs, interaction networks, molecular graph convolutions, and deep tensor neural networks. The model contains two phases: a message passing phase and a readout phase. The message passing phase (namely, the propagation step) runs for T time steps and is defined in terms of the message function $M_t$ and the vertex update function $U_t$. Using messages $m_v^{t}$, the updating functions of the hidden states $h_v^{t}$ are

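      (5.28) $$m_v^{t+1} = \sum_{w \in N_v} M_t\left(h_v^{t}, h_w^{t}, e_{vw}\right), \qquad h_v^{t+1} = U_t\left(h_v^{t}, m_v^{t+1}\right)$$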

      where $e_{vw}$ represents the features of the edge from node v to node w. The readout phase computes a feature vector for the whole graph using the readout function R according to

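      (5.29) $$\hat{y} = R\left(\left\{h_v^{T} \mid v \in G\right\}\right)$$

      where T denotes the total number of time steps. Different choices of $M_t$, $U_t$, and R yield different models; for example, the gated GNN is obtained with

      (5.30) $$M_t\left(h_v^{t}, h_w^{t}, e_{vw}\right) = A_{e_{vw}} h_w^{t}, \qquad U_t = \mathrm{GRU}\left(h_v^{t}, m_v^{t+1}\right), \qquad R = \sum_{v \in V} \sigma\left(i\left(h_v^{T}, h_v^{0}\right)\right) \odot j\left(h_v^{T}\right)$$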

      where $A_{e_{vw}}$ is the adjacency matrix, one for each edge label e. The GRU is the gated recurrent unit introduced in [25]. i and j are neural networks in function R.
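      The two phases of the MPNN framework can be sketched as follows. This is an illustration under stated assumptions, not the model of [36]: the message function simply scales the neighbor state by a scalar edge feature, the update function averages the old state with the aggregated message (a stand‐in for the GRU), and the readout is a plain sum. All function names are hypothetical.

```python
def message_passing(h, edges, message_fn, update_fn, T):
    """Run T synchronous rounds of message passing.

    h:     dict mapping node v to its hidden state (list of floats)
    edges: dict mapping (v, w) to the edge feature e_vw, for each neighbor w of v
    """
    dim = len(next(iter(h.values())))
    for _ in range(T):
        # Message phase: sum M_t(h_v, h_w, e_vw) over the neighbors w of v.
        messages = {v: [0.0] * dim for v in h}
        for (v, w), e_vw in edges.items():
            m = message_fn(h[v], h[w], e_vw)
            messages[v] = [a + b for a, b in zip(messages[v], m)]
        # Update phase: every node applies U_t to its old state and its message.
        h = {v: update_fn(h[v], messages[v]) for v in h}
    return h


def readout(h):
    """Sum the final hidden states into one graph-level vector (assumed R)."""
    dim = len(next(iter(h.values())))
    return [sum(state[d] for state in h.values()) for d in range(dim)]


def scale_neighbor(h_v, h_w, e_vw):
    """Illustrative message function: neighbor state scaled by the edge feature."""
    return [e_vw * x for x in h_w]


def average_update(h_v, m_v):
    """Illustrative update function: average of old state and message."""
    return [0.5 * (a + b) for a, b in zip(h_v, m_v)]


# Path graph a - b - c with unit edge features in both directions.
h0 = {"a": [1.0], "b": [0.0], "c": [0.0]}
edges = {("a", "b"): 1.0, ("b", "a"): 1.0, ("b", "c"): 1.0, ("c", "b"): 1.0}
h1 = message_passing(h0, edges, scale_neighbor, average_update, T=1)
y = readout(h1)
```

      Swapping in other message, update, and readout functions changes the model without touching the propagation loop, which is the sense in which the framework unifies different approaches.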

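      The generic non‐local operation of non‐local neural networks (NLNNs) is defined as

      (5.31) $$h_i' = \frac{1}{C(h)} \sum_{\forall j} f(h_i, h_j)\, g(h_j)$$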

      where i is the index of an output position, and j is the index that enumerates all possible positions. $f(h_i, h_j)$ computes a scalar between i and j representing the relation between them, $g(h_j)$ denotes a transformation of the input $h_j$, and the factor $1/C(h)$ is utilized to normalize the results.

      There are several instantiations with different f and g settings. For simplicity, the linear transformation can be used as the function g. That means $g(h_j) = W_g h_j$, where $W_g$ is a learned weight matrix. The Gaussian function is a natural choice for function f, giving $f(h_i, h_j) = e^{h_i^{\top} h_j}$, where $h_i^{\top} h_j$ is dot‐product similarity and $C(h) = \sum_{\forall j} f(h_i, h_j)$. It is straightforward to extend the Gaussian function by computing similarity in the embedding space, giving $f(h_i, h_j) = e^{\theta(h_i)^{\top} \varphi(h_j)}$ with $\theta(h_i) = W_{\theta} h_i$, $\varphi(h_j) = W_{\varphi} h_j$, and $C(h) = \sum_{\forall j} f(h_i, h_j)$. The function f can also be implemented as a dot‐product similarity $f(h_i, h_j) = \theta(h_i)^{\top} \varphi(h_j)$. Here, the factor $C(h) = N$, where N is the number of positions in h. Concatenation can also be used, defined as $f(h_i, h_j) = \mathrm{ReLU}\left(w_f^{\top} \left[\theta(h_i) \,\|\, \varphi(h_j)\right]\right)$, where $w_f$ is a weight vector projecting the vector to a scalar and $C(h) = N$.
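      A minimal sketch of the non‐local operation with the Gaussian choice of f and a linear g is given below. With this choice, dividing by C(h) amounts to a softmax over the positions j. The function names and the toy inputs are illustrative assumptions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def non_local(h, W_g):
    """Apply the non-local operation to every position of h.

    h:   list of position vectors h_j
    W_g: weight matrix (list of rows) of the linear map g(h_j) = W_g h_j
    """
    g = [[dot(row, h_j) for row in W_g] for h_j in h]
    out = []
    for h_i in h:
        # Gaussian pairwise function f(h_i, h_j) = exp(h_i . h_j).
        f = [math.exp(dot(h_i, h_j)) for h_j in h]
        C = sum(f)  # normalization factor C(h) for this output position
        out.append([
            sum(f_j * g_j[d] for f_j, g_j in zip(f, g)) / C
            for d in range(len(W_g))
        ])
    return out


identity = [[1.0, 0.0], [0.0, 1.0]]
out = non_local([[1.0, 0.0], [0.0, 1.0]], identity)
```

      Each output position is a weighted average of all transformed positions, so every row of `out` depends on the whole input rather than on a local neighborhood.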

      5.1.3 Graph Networks

      The Graph Network (GN) framework [37] generalizes and extends various GNN, MPNN, and NLNN approaches. A graph is defined as a 3‐tuple G = (u, H, E) (H is used instead of V for notational consistency). Here u is a global attribute; $H = \{h_i\}_{i=1:N^v}$ is the set of nodes (of cardinality $N^v$), where each $h_i$ is a node’s attribute; and $E = \{(e_k, r_k, s_k)\}_{k=1:N^e}$ is the set of edges (of cardinality $N^e$), where each $e_k$ is the edge’s attribute, $r_k$ is the index of the receiver node, and $s_k$ is the index of the sender node.

      A GN block contains three “update” functions, φ, and three “aggregation” functions, ρ,

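      (5.32) $$\begin{aligned} e_k' &= \varphi^{e}\left(e_k, h_{r_k}, h_{s_k}, u\right), & \bar{e}_i' &= \rho^{e \to h}\left(E_i'\right), \\ h_i' &= \varphi^{h}\left(\bar{e}_i', h_i, u\right), & \bar{e}' &= \rho^{e \to u}\left(E'\right), \\ u' &= \varphi^{u}\left(\bar{e}', \bar{h}', u\right), & \bar{h}' &= \rho^{h \to u}\left(H'\right) \end{aligned}$$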

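      where $E_i' = \{(e_k', r_k, s_k)\}_{r_k = i,\ k = 1:N^e}$, $H' = \{h_i'\}_{i=1:N^v}$, and $E' = \bigcup_i E_i' = \{(e_k', r_k, s_k)\}_{k=1:N^e}$. The ρ functions must be invariant to permutations of their inputs and should take variable numbers of arguments.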
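      A minimal sketch of one pass through a GN block, following the order of the update and aggregation functions in (5.32), is given below. It assumes scalar attributes, additive update functions, and summation for every aggregation; these choices and all names are illustrative, not those of [37]. Summation is used because it is permutation invariant and accepts any number of arguments, including none.

```python
def gn_block(u, H, E, phi_e, phi_h, phi_u, rho_eh, rho_eu, rho_hu):
    """Apply one GN block and return the updated (u, H, E).

    u: global attribute
    H: list of node attributes h_i
    E: list of edges (e_k, r_k, s_k) with receiver index r_k and sender index s_k
    """
    # Edge update: phi_e sees the edge, its receiver, its sender, and u.
    E_new = [(phi_e(e, H[r], H[s], u), r, s) for (e, r, s) in E]

    # Node update: aggregate the updated edges received by node i, then apply phi_h.
    H_new = []
    for i, h_i in enumerate(H):
        incoming = [e for (e, r, _) in E_new if r == i]
        H_new.append(phi_h(rho_eh(incoming), h_i, u))

    # Global update: aggregate all updated edges and nodes, then apply phi_u.
    e_bar = rho_eu([e for (e, _, _) in E_new])
    h_bar = rho_hu(H_new)
    u_new = phi_u(e_bar, h_bar, u)
    return u_new, H_new, E_new


def add_all(*args):
    """Illustrative update function: the sum of its arguments."""
    return sum(args)


# Two nodes and one edge sent from node 0 to node 1.
u_new, H_new, E_new = gn_block(
    u=1.0,
    H=[1.0, 2.0],
    E=[(0.5, 1, 0)],
    phi_e=add_all, phi_h=add_all, phi_u=add_all,
    rho_eh=sum, rho_eu=sum, rho_hu=sum,
)
```

      Node 0 receives no edges, so its aggregated edge term is the empty sum, which illustrates why the ρ functions need to handle a variable number of inputs.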

      The computation steps of a GN block:

      1 $\varphi^{e}$ is applied per edge, with arguments $(e_k, h_{r_k}, h_{s_k}, u)$, and returns $e_k'$. The set of resulting per‐edge outputs for each node i is $E_i' = \{(e_k', r_k, s_k)\}_{r_k = i,\ k = 1:N^e}$, and $E' = \bigcup_i E_i' = \{(e_k', r_k, s_k)\}_{k=1:N^e}$ is the set of all per‐edge outputs.

      2 $\rho^{e \to h}$ is applied to $E_i'$ and aggregates the edge updates for edges that project to vertex i into $\bar{e}_i'$, which will be used in the next step’s node update.

      3 $\varphi^{h}$ is applied to each node i to compute an updated node attribute, $h_i'$. The set of resulting per‐node outputs is $H' = \{h_i'\}_{i=1:N^v}$.

      4 $\rho^{e \to u}$ is applied
