arma-thesis

git clone https://git.igankevich.com/arma-thesis.git

commit b639a6667e30af8a985c713ff2fa5af4494d2f24
parent 764f1033cd0eb065102d7b04e3665a1ff0fa3e71
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Fri, 20 Oct 2017 13:36:38 +0300

Describe distributed AR algorithm.

Diffstat:
arma-thesis.org | 36++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+), 0 deletions(-)

diff --git a/arma-thesis.org b/arma-thesis.org
@@ -3571,6 +3571,42 @@
 without interruption.
 ** MPP implementation
 **** Distributed AR model algorithm.
+This algorithm, unlike its parallel counterpart, copies data to execute
+computations on different cluster nodes, and since network bandwidth is much
+lower than memory bandwidth, the amount of data sent over the network has to
+be optimised to achieve better performance than on an SMP system. One way to
+accomplish this is to distribute wavy surface parts between cluster nodes,
+copying in the coefficients and all the boundary points and copying out the
+generated part. Autoregressive dependencies prevent creating all the parts at
+once and distributing them between cluster nodes statically, so the parts are
+created dynamically on the first node as the points they depend on become
+available. Thus, the distributed AR model algorithm is a "master-slave"
+algorithm: the master dynamically creates a task for each wavy surface part,
+taking into account autoregressive dependencies between points, and sends the
+tasks to the slaves; each slave computes its wavy surface part and sends it
+back to the master.
+
+In the MPP implementation each task is modelled by a kernel: a master kernel
+creates slave kernels on demand, and each slave kernel computes one wavy
+surface part. In the ~act~ method of the master kernel a slave kernel is
+created for the first wavy surface part\nbsp{}--- the part that does not
+depend on any points. When this kernel returns, the master kernel in its
+~react~ method determines which parts can be computed next, creates a slave
+kernel for each of them and sends the kernels to the pipeline. In the ~act~
+method of a slave kernel the wavy surface part is generated, after which the
+kernel sends itself back to the master. The ~react~ method of a slave kernel
+is empty.
+
+The distributed AR algorithm implementation has several advantages over the
+parallel one.
+- Bscheduler pipelines automatically distribute slave kernels between
+  available cluster nodes, and the main programme does not have to deal with
+  these implementation details.
+- There is no need to implement a minimalistic job scheduler that determines
+  the execution order of jobs (kernels) taking into account autoregressive
+  dependencies: the order is fully defined in the ~react~ method of the
+  master kernel.
+- There is no need for a separate version of the algorithm for a single
+  cluster node: the implementation works transparently on any number of
+  nodes.
+
 **** Performance of distributed AR model implementation.
 #+begin_src R :file build/bscheduler-performance.pdf
 source(file.path("R", "benchmarks.R"))