hpcs-17-subord

git clone https://git.igankevich.com/hpcs-17-subord.git

commit b5c438a22224a80e8121776800a3a40dad36ec3e
parent 2e8fd912892f483d44757ea1263f71c8872c757a
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Sat, 18 Feb 2017 19:31:15 +0300

Minor changes.

Diffstat:
src/body.tex | 72++++++++++++++++++++++++++++++------------------------------------------
1 file changed, 30 insertions(+), 42 deletions(-)

diff --git a/src/body.tex b/src/body.tex
@@ -1,34 +1,35 @@
 \section{Computational kernel hierarchy}
 
-The core provides classes and methods to simplify development of distributed
-applications and middleware. The focus is to make distributed application
-resilient to failures, i.e.~make it fault tolerant and highly available, and do
-it transparently to a programmer. All classes are divided into two layers: the
-lower layer consists of classes for single node applications, and the upper
-layer consists of classes for applications that run on an arbitrary number of
-nodes. There are two kinds of tightly coupled entities in the package~---
-kernels and pipelines~--- which are used together to compose a programme.
+The framework provides classes and methods to simplify development of
+distributed applications and middleware. The focus is to make distributed
+application resilient to failures, i.e.~make it fault tolerant and highly
+available, and do it transparently to a programmer. All classes are divided
+into two layers: the lower layer consists of classes for single node
+applications, and the upper layer consists of classes for applications that run
+on an arbitrary number of nodes. There are two kinds of tightly coupled
+entities in the framework~--- \emph{kernels} and \emph{pipelines}~--- which are
+used together to compose a~programme.
 
 Kernels implement control flow logic in theirs \Method{act} and \Method{react}
-methods and store the state of the current control flow branch. Both logic and
-state are implemented by a programmer. In \Method{act} method some function is
-either sequentially computed or decomposed into subtasks (represented by
-another set of kernels) which are subsequently sent to a pipeline. In
-\Method{react} method subordinate kernels that returned from the pipeline are
-processed by their parent. Calls to \Method{act} and \Method{react} methods are
-asynchronous and are made within threads spawned by a pipeline. For each kernel
-\Method{act} is called only once, and for multiple kernels the calls are done
-in parallel to each other, whereas \Method{react} method is called once for
-each subordinate kernel, and all the calls are made in the same thread to
-prevent race conditions (for different parent kernels different threads may be
-used).
+methods and store the state of the current control flow branch. Domain-specific
+logic and state are implemented by a programmer. In~\Method{act} method some
+function is either sequentially computed or decomposed into subtasks
+(represented by another set of kernels) which are subsequently sent to a
+pipeline. In~\Method{react} method subordinate kernels that returned from the
+pipeline are processed by their parent. Calls to \Method{act} and
+\Method{react} methods are asynchronous and are made within threads spawned by
+a pipeline. For each kernel \Method{act} is called only once, and for multiple
+kernels the calls are done in parallel to each other, whereas \Method{react}
+method is called once for each subordinate kernel, and all the calls are made
+in the same thread to prevent race conditions (for different parent kernels
+different threads may be used).
 
 Pipelines implement asynchronous calls to \Method{act} and \Method{react}, and
 try to make as many parallel calls as possible considering concurrency of the
-platform (no.~of cores per node and no.~of nodes in a cluster). A pipeline
+platform (no.~of cores per node and no.~of nodes in a cluster). A~pipeline
 consists of a kernel pool, which contains all the subordinate kernels sent by
 their parents, and a thread pool that processes kernels in accordance with
-rules outlined in the previous paragraph. A separate pipeline exists for each
+rules outlined in the previous paragraph. A~separate pipeline exists for each
 compute device: There are pipelines for parallel processing, schedule-based
 processing (periodic and delayed tasks), and a proxy pipeline for processing
 of kernels on other cluster nodes.
@@ -37,13 +38,13 @@
 In principle, kernels and pipelines machinery reflect the one of procedures and
 call stacks, with the advantage that kernel methods are called asynchronously
 and in parallel to each other. The stack, which ordinarily stores local
 variables, is modelled by fields of a kernel. The sequence of processor
-instructions before nested procedure calls is modelled by \Method{act} method,
-and sequence of processor instructions after the calls is modelled by
-\Method{react} method. The procedure calls themselves are modelled by
-constructing and sending subordinate kernels to the pipeline. Two methods are
-necessary because calls are asynchronous and one must wait before subordinate
-kernels complete their work. Pipelines allow circumventing active wait, and
-call correct kernel methods by analysing their internal state.
+instructions before nested procedure calls is modelled by~\Method{act} method,
+and sequence of processor instructions after the calls is modelled
+by~\Method{react} method. The procedure calls themselves are modelled
+by~constructing and sending subordinate kernels to the pipeline. Two methods
+are necessary because calls are asynchronous and one must wait before
+subordinate kernels complete their work. Pipelines allow circumventing active
+wait, and call correct kernel methods by analysing their internal state.
 
 \section{Cluster scheduler architecture}
@@ -78,19 +79,6 @@
 provided by replicating master kernel to a subordinate node. When any of the
 replicas fails, another one is used in place. Detailed explanation of the fail
 over algorithm is provided in Section~\ref{sec:failure-scenoarios}.
 
-\subsection{Security}
-
-Scheduler driver is able to communicate with scheduler daemons in local area
-network. Inter-daemon messaging is not encrypted or signed in any way, assuming
-that local area network is secure. There is also no protection from Internet
-``noise''. Submission of the task to a remote cluster can be done via SSH
-(Secure Shell) connection/tunnel which is de facto standard way of
-communication between Linux/UNIX servers. So, scheduler security is based on
-the assumption that it is deployed in secure local area network. Every job is
-run from the same user, as there is no portable way to switch process owner in
-Java.
-
-
 \section{Failure scenarios}
 \label{sec:failure-scenoarios}