Computational Statistics in Data Science. Various authors

an MLP can have any number of hidden layers. The more hidden layers there are, the more complex the model, and therefore the more difficult it is to train/optimize the weights. The model remains almost exactly the same, except for the insertion of multiple hidden layers between the first hidden layer and the output layer. Values for each node in a given layer are determined in the same way as before, that is, as a nonlinear transformation of the values of the nodes in the previous layer and the associated weights. Training the network via backpropagation is almost exactly the same.
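The layer-by-layer computation described above can be sketched in a few lines. This is a minimal illustration only: the function name `forward`, the tanh activation, the linear output layer, and the weight values are all assumptions made for the example, not choices taken from the text.

```python
import math


def forward(x, layers, activation=math.tanh):
    """Propagate the input vector x through a list of (weights, biases) layers.

    Every hidden layer applies a nonlinear transformation to the weighted
    values of the previous layer. Leaving the final layer linear is an
    illustrative assumption.
    """
    h = list(x)
    for i, (W, b) in enumerate(layers):
        # Weighted sum of the previous layer's node values plus a bias.
        z = [sum(w * v for w, v in zip(row, h)) + b_j for row, b_j in zip(W, b)]
        is_output = i == len(layers) - 1
        h = z if is_output else [activation(v) for v in z]
    return h


# Two hidden layers (3 and 2 nodes) followed by one output node.
# The weights are arbitrary numbers chosen for illustration.
layers = [
    ([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [0.0, 0.1, -0.1]),  # 2 -> 3
    ([[0.3, -0.1, 0.2], [0.2, 0.5, -0.4]], [0.05, -0.05]),       # 3 -> 2
    ([[0.7, -0.6]], [0.0]),                                      # 2 -> 1
]
output = forward([1.0, 2.0], layers)
```

Adding another hidden layer only means adding another `(weights, biases)` pair to the list; the loop itself does not change, which mirrors the remark that the model remains almost exactly the same.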

      4.1 Introduction

      A CNN is a modified DNN that is particularly well equipped to handle image data. A CNN usually contains not only fully connected layers but also convolutional layers and pooling layers, and these are what set it apart. An image is a matrix of pixel values, which must be flattened into a vector before being fed into a DNN, because a DNN takes a vector as input. However, spatial information might be lost in this process. A convolutional layer can take a matrix or tensor as input and is able to capture the spatial and temporal dependencies in an image.
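      One way to see how flattening can obscure spatial structure is to track two vertically adjacent pixels. The sketch below assumes row-major flattening and a made-up 3 × 3 image; the variable names are hypothetical.

```python
# A made-up 3 x 3 "image" of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Row-major flattening: rows are concatenated one after another.
flat = [pixel for row in image for pixel in row]

n_cols = len(image[0])
# Position in the flat vector of the pixel at (row 0, col 1)
# and of its vertical neighbour at (row 1, col 1).
pos_a = 0 * n_cols + 1
pos_b = 1 * n_cols + 1

# Neighbours in the image end up a full image width apart in the vector.
gap = pos_b - pos_a
```

      The two pixels touch in the image, yet in the vector they are separated by the image width, so a model that sees only the vector has no built-in notion that they are adjacent.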

      In the convolutional layer, the weight matrix (kernel) scans over the input image to produce a feature matrix. This process is called the convolution operation. The pooling layer operates similarly to the convolutional layer and has two types: Max Pooling and Average Pooling. The Max Pooling layer returns the maximum value from the portion of the image covered by the kernel matrix. The Average Pooling layer returns the average of all values covered by the kernel matrix. The convolution and pooling process can be repeated by adding further convolutional and pooling layers. Deep convolutional networks have been successfully trained and used in image classification problems.
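      The scanning and pooling operations described above can be sketched as follows. This is an illustration under stated assumptions: no padding, a single channel, and the kernel applied without flipping (as is common in CNN implementations). The function names `conv2d` and `pool2d` and the example values are hypothetical.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and return the feature matrix."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    return [
        [
            # Sum of elementwise products over the window covered by the kernel.
            sum(kernel[a][b] * image[i * stride + a][j * stride + b]
                for a in range(k_h) for b in range(k_w))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def pool2d(image, size, stride, mode="max"):
    """Return the maximum or the average of each size x size window."""
    reduce = max if mode == "max" else (lambda values: sum(values) / len(values))
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    return [
        [
            reduce([image[i * stride + a][j * stride + b]
                    for a in range(size) for b in range(size)])
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [1, 2, 0, 1],
    [3, 1, 1, 0],
    [0, 2, 4, 1],
    [1, 0, 2, 3],
]
kernel = [[1, 0], [0, -1]]

features = conv2d(image, kernel)               # 3 x 3 feature matrix
max_pooled = pool2d(image, 2, 2, "max")        # 2 x 2 matrix of window maxima
avg_pooled = pool2d(image, 2, 2, "average")    # 2 x 2 matrix of window means
```

      Both functions walk the same grid of windows; they differ only in what is computed per window, which is the sense in which pooling operates similarly to convolution.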

      4.2 Convolutional Layer

      4.3 LeNet‐5

      Figure: LeNet‐5 of LeCun et al. [8]. Source: Modified from LeCun et al. [8].

      The first layer (C1) is a convolutional layer, which consists of six kernel matrices of size 5 × 5 and stride 1. Each of the kernel matrices will scan over the input image and produce a feature matrix of size 28 × 28. Therefore, six different kernel matrices will produce six different feature matrices. The second layer (S2) is a Max Pooling layer, which takes the 28 × 28 matrices as input. The kernel size of this pooling layer is 2 × 2, and the stride size is 2. Therefore, the outputs of this layer are six 14 × 14 feature matrices.
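      The sizes quoted above follow from the usual output-size rule for a sliding window. The sketch below assumes no padding and a 32 × 32 input image; the input size is not stated in this passage but is the value implied by a 5 × 5 kernel with stride 1 producing a 28 × 28 output. The helper name `output_size` is hypothetical.

```python
def output_size(input_size, kernel_size, stride=1, padding=0):
    """Side length of the output when a square kernel scans a square input."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Assumed 32 x 32 input (implied by the 28 x 28 output of C1).
c1 = output_size(32, kernel_size=5, stride=1)   # convolutional layer C1
s2 = output_size(c1, kernel_size=2, stride=2)   # pooling layer S2

# Each of the six kernels yields one feature matrix per layer.
shapes = {"C1": (6, c1, c1), "S2": (6, s2, s2)}
```

      With these assumptions the rule gives 28 for C1 and 14 for S2, matching the six 28 × 28 and six 14 × 14 feature matrices described in the text.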

      Source: LeCun et al. [8].

Indices of output matrices
1 1 5
