\section{INTRODUCTION}

A perceptron with more than one hidden (learning) layer is called a deep neural network. Such a network is usually trained with the backpropagation method, an iterative gradient algorithm that minimizes the learning error of the network.

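For reference, backpropagation adjusts each weight in the direction of the negative gradient of the error; with learning rate $\eta$ and network error $E$, the update of a weight $w_{ij}$ takes the standard form
\[
    w_{ij} \leftarrow w_{ij} - \eta \frac{\partial E}{\partial w_{ij}}.
\]
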
An algorithm iteration consists of three main step functions: \textit{dnnForward} propagates a training sample through the network, yielding a certain result; \textit{dnnBackward} determines the error and then, in each layer of the network starting with the penultimate one, calculates the corrections to the weight coefficients of each node; \textit{dnnUpdate} updates the neuron weights according to the previously calculated corrections. Network learning ends when the error reaches a specified minimum acceptable level. Such networks demonstrate remarkable results in many areas, including voice and image recognition; their main deficiency, however, is a very lengthy learning process. It was therefore decided to investigate how effectively networks of this type run on parallel computational architectures. For testing purposes an 8-layer neural network was taken (1 input, 6 hidden, 1 output layer). For the analysis of the results the following parameters were chosen: neural network learning speed and object recognition accuracy. One iteration built from these step functions is sketched below.

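The fragment below is a minimal C++ sketch of such a training loop; the \texttt{Network} and \texttt{Sample} types and the exact signatures of the step functions are assumptions made for illustration and do not reproduce the actual interface of the implementation.
\begin{verbatim}
// Minimal sketch of the training loop built from the three step
// functions; types and signatures are hypothetical.
struct Network;  // network state: layers, weights, activations
struct Sample;   // one training example with its expected output

void dnnForward(Network &, const Sample &);
void dnnBackward(Network &, const Sample &);
void dnnUpdate(Network &);
float dnnError(const Network &, const Sample *, int);

void train(Network &net, const Sample *samples, int nsamples,
           float min_error) {
    float error;
    do {
        for (int s = 0; s < nsamples; ++s) {
            dnnForward(net, samples[s]);  // propagate the sample
            dnnBackward(net, samples[s]); // error and corrections
            dnnUpdate(net);               // apply the corrections
        }
        error = dnnError(net, samples, nsamples);
    } while (error > min_error); // stop at the accepted error level
}
\end{verbatim}
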
The task was carried out on an Intel Xeon processor (see Table~\ref{tab:platform} for specifications). First, the task was carried out using only one core. The code was then optimized in preparation for the launch on a parallel architecture. It was decided to test the effectiveness of the Many Integrated Core (MIC) architecture~\cite{duran2012intel} for solving this task. This architecture packs a large number of x86 cores into a single co-processor coupled with the main processor (a sketch of the offload programming model follows the table). Intel Xeon Phi co-processor specifications are also shown in Table~\ref{tab:platform}.

\begin{table}[h]
    \centering
    \caption{Computational platform specifications.}
    \begin{tabular}{lp{0.7\columnwidth}}
        \toprule
        Processor   & 2$\times$ Intel Xeon CPU E5-2695 v2 (12 cores, 2 threads per core, 2.40 GHz)  \\
        Coprocessor & Intel Xeon Phi 5110P (60 cores, 4 threads per core, 1.052 GHz)  \\
        \bottomrule
    \end{tabular}
    \label{tab:platform}
\end{table}
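
For illustration, the fragment below shows how a parallel loop can be offloaded to the co-processor with the Intel compiler's explicit offload model; the weight-update step and the array names are hypothetical and serve only to show the programming model, not the code evaluated in this paper.
\begin{verbatim}
// Hypothetical illustration of the MIC offload model: the loop
// body runs on the co-processor, the arrays are copied between
// host and device as specified by the in/inout clauses.
void updateWeights(float *weights, const float *grad,
                   int n, float rate) {
    #pragma offload target(mic:0) inout(weights:length(n)) \
                                  in(grad:length(n))
    {
        #pragma omp parallel for
        for (int i = 0; i < n; ++i) {
            weights[i] -= rate * grad[i]; // gradient descent step
        }
    }
}
\end{verbatim}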