arma-thesis

git clone https://git.igankevich.com/arma-thesis.git

commit 330f0659ebd50513558bde664a4275d348553ffa
parent d49441d8a9cdadddd0483ae68a1e1faa69ba5514
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Fri, 20 Jan 2017 14:52:25 +0300

Translate governing principles. Copy overview.

Diffstat:
phd-diss-ru.org | 49++++++++++++++++++++++++++++++++++++++++++++++++-
phd-diss.org | 142+++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------
2 files changed, 140 insertions(+), 51 deletions(-)

diff --git a/phd-diss-ru.org b/phd-diss-ru.org
@@ -1780,7 +1780,6 @@ eqref:eq:old-sol-2d, are comparable for waves of small amplitude
 * High-performance software implementation of ocean wave simulation
 ** Computational model
-*** Governing principles
 *** Mapping wavy surface generation algorithm on computational model
 The ARMA model is implemented in a software system that works on the principle of a computational pipeline, in which each link applies a filter to the output
@@ -1935,6 +1934,54 @@ cite:malewicz2010pregel,seo2010hama. The advantage of the pipeline
 This minimises idle time of the processor and other devices of the computer and increases the overall throughput of the cluster.
+*** Governing principles
+The data processing pipeline model is built on the following principles, which, when followed, ensure maximum efficiency of the programme.
+- There is no notion of a message in the model; the role of a message is played by the kernel itself: it can be sent over the network to another node and access the fields of any other kernel on that node. Only the programme logic can guarantee that such an object exists.
+- A kernel is a /cooperative routine/ which, upon a call, is submitted to the task pool and is then executed asynchronously by the scheduler. The body of a routine may contain any number of calls to other routines. Each call submits the corresponding routine to the task pool and returns immediately. Kernels in the pool may be processed in any order; the scheduler uses this to extract maximum parallelism from the computing system by distributing objects from the pool across available cluster nodes and processor cores.
+- Asynchronous execution of kernels makes explicit synchronisation after a routine call (submission of an object to the queue) unnecessary; the scheduler returns control flow to the parent kernel each time one of its child objects finishes execution. Such /cooperation/ turns a routine into a kind of event handler, in which the event is a child object and the handler is its parent.
+- The event handling mechanism can be used to transfer control flow not only to the parent but to any other object whose address is known; communication with objects that goes against the hierarchy greatly complicates control flow, and the call stack of routines loses its tree shape. Only the programme logic can guarantee that both communicating objects are present in the memory of the machine. One way to provide such a guarantee is communication between children of the same parent, which is typical of parallel batch data processing. Since this kind of communication can also be carried out within the hierarchy through the parent object, it can be regarded as an optimisation that removes the overhead of passing data through an intermediate node. For programmes whose logic is entirely event-based (e.g. servers and programmes with a graphical interface) the situation is different, and such communications are the primary ones.
+- Communications that go against the hierarchy also complicate the fault tolerance algorithms used on clusters of machines. It is impossible to guarantee that a particular kernel resides in the memory of a neighbouring node, because the node may fail right in the middle of executing the corresponding routine. As a result, when a kernel terminates abnormally, all of its child objects have to be executed anew. This encourages the programmer to construct
+  - deep hierarchies of tightly coupled kernels (which communicate with each other on the same level of the hierarchy), reducing the overhead of re-executing routines;
+  - fat hierarchies of loosely coupled kernels, providing the maximum degree of parallelism.
+
 ** SMP implementation
 *** Load balancing algorithm
 The simplest and most widely used approach to load balancing on
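The governing principles above describe kernels as cooperative routines: a call submits a subordinate kernel to a task pool and returns immediately, and the scheduler later hands each finished subordinate back to its parent. The following single-node sketch illustrates this calling convention; the names Kernel, Scheduler, Partial_sum and Total_sum are hypothetical, and a sequential run loop stands in for the asynchronous scheduler of the actual programme.

#+begin_src cpp
#include <iostream>
#include <memory>
#include <queue>
#include <vector>

class Scheduler;

// A kernel is a cooperative routine: act() is its body, react() is the
// event handler that receives each finished subordinate kernel.
struct Kernel {
    Kernel* parent = nullptr;
    virtual ~Kernel() = default;
    virtual void act(Scheduler& sched) = 0;
    virtual void react(Scheduler&, Kernel*) {}
};

// Sequential stand-in for the scheduler: a call submits a kernel to the
// pool and returns immediately; completion is delivered to the parent.
class Scheduler {
    std::queue<Kernel*> pool;
    std::vector<std::unique_ptr<Kernel>> owned; // keeps kernels alive
public:
    Kernel* submit(std::unique_ptr<Kernel> k, Kernel* parent = nullptr) {
        k->parent = parent;
        Kernel* raw = k.get();
        owned.push_back(std::move(k));
        pool.push(raw);
        return raw;
    }
    void run() {
        while (!pool.empty()) {
            Kernel* k = pool.front();
            pool.pop();
            k->act(*this);
            if (k->parent) k->parent->react(*this, k); // no explicit wait
        }
    }
};

// Subordinate kernel: computes one partial sum.
struct Partial_sum : Kernel {
    int first, last, result = 0;
    Partial_sum(int a, int b): first(a), last(b) {}
    void act(Scheduler&) override {
        for (int i = first; i < last; ++i) result += i;
    }
};

// Parent kernel: act() decomposes the task, react() combines the results.
struct Total_sum : Kernel {
    int total = 0, pending = 0;
    void act(Scheduler& sched) override {
        for (int i = 0; i < 4; ++i) { // four subordinates
            sched.submit(std::make_unique<Partial_sum>(i * 25, (i + 1) * 25), this);
            ++pending;
        }
    }
    void react(Scheduler&, Kernel* child) override {
        total += static_cast<Partial_sum*>(child)->result;
        if (--pending == 0)
            std::cout << "sum of 0..99 = " << total << '\n'; // prints 4950
    }
};

int main() {
    Scheduler sched;
    sched.submit(std::make_unique<Total_sum>());
    sched.run();
    return 0;
}
#+end_src

Note that Total_sum::act never waits for its subordinates; the combining step lives entirely in react, which is the "no explicit synchronisation" principle in practice.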
diff --git a/phd-diss.org b/phd-diss.org
@@ -1713,55 +1713,6 @@ work.
 * High-performance software implementation of ocean wave simulation
 ** Computational model
-*** Governing principles
-- There are no messages in the model, a kernel is itself a message that can be sent over the network to another node and directly access any kernel on the local node. It is the responsibility of a programmer to ensure that such a kernel exists.
-- A kernel is best viewed as a cooperative routine, which is submitted to the execution queue upon the call and is executed asynchronously by the system scheduler. There can be any number of calls to other subroutines inside the routine body. Every call submits the corresponding subroutine to the execution queue and returns immediately. Kernels in the queue can be executed in any order; this fact is used by the system scheduler to exploit parallelism offered by the computer by distributing kernels from the queue across available cluster nodes and processor cores.
-- Asynchronous execution prevents the use of explicit synchronisation after the call to a subroutine is made; the system scheduler returns control flow to the routine each time one of its subroutines returns. Such *cooperation* transforms each routine which calls subroutines into an event handler, where each event is a subroutine and the handler is the routine that called them. In many batch processing programmes control flow enters each routine which calls subroutines at least two times: the first time upon the call to the routine, and the second time when control flow returns to the caller after completion of a subroutine.
-- The routine may communicate with any number of local kernels whose addresses it knows; communication with routines which are not adjacent in the call stack complicates control flow, and the call stack loses its tree shape. It is the responsibility of a programmer to ensure that communicating kernels are present in memory. One way to ensure this is to perform communication between subroutines which are called from the same routine. Incidentally, this is the usual way of writing batch parallel programmes: each such communication creates a cycle in the call stack graph, and a cycle between different (possibly non-adjacent) layers of the kernel hierarchy is redundant, because there are other edges that can be used instead. The situation may be different when the programme is interactive or event-based.
-- The other disadvantage of communication which does not occur along hierarchical links is that it complicates resiliency algorithms when executed across the cluster network: it is difficult to ensure that a kernel resides in the memory of a neighbour node, because the node may fail in the middle of its execution. Thus, upon failure of a node all of the subroutines which are called from the same routine must be restarted. This encourages a programmer to construct
-  - deep hierarchies of tightly-coupled kernels (which require communication on the same level of hierarchy) to reduce the overhead of recomputation,
-  - fat hierarchies of loosely-coupled kernels.
-  Deep hierarchy is not only a requirement of the technology, it helps optimise communication of cluster nodes by limiting it to adjacent nodes.
-- No explicit synchronisation.
-- Local communications between adjacent nodes in the hierarchy.
-
 *** Mapping wavy surface generation algorithm on computational model
 #+name: fig:pipeline
 #+begin_src dot :exports results :file build/pipeline.pdf
@@ -1869,6 +1820,98 @@ digraph {
 #+RESULTS: fig:pipeline
 [[file:build/pipeline.pdf]]
+
+*** Computational model overview
+The core provides classes and methods that simplify the development of distributed applications and middleware. The main focus of this package is to make a distributed application resilient to failures, i.e. to make it fault tolerant and highly available, and to do so transparently to a programmer. All classes are divided into two layers: the lower layer consists of classes for single-node applications, and the upper layer consists of classes for applications that run on an arbitrary number of nodes. There are two kinds of tightly coupled entities in the package --- kernels and pipelines --- which are used together to compose a programme.
+
+Kernels implement control flow logic in their act and react methods and store the state of the current control flow branch. Both logic and state are implemented by a programmer. In the act method some function is either computed sequentially or decomposed into subtasks (represented by another set of kernels) which are subsequently sent to a pipeline. In the react method subordinate kernels that return from the pipeline are processed by their parent. Calls to act and react methods are asynchronous and are made within threads spawned by a pipeline. For each kernel act is called only once, and for multiple kernels the calls are made in parallel to each other, whereas the react method is called once for each subordinate kernel, and all these calls are made in the same thread to prevent race conditions (for different parent kernels different threads may be used).
+
+Pipelines implement asynchronous calls to act and react, and try to make as many parallel calls as possible considering the concurrency of the platform (number of cores per node and number of nodes in a cluster). A pipeline consists of a kernel pool, which contains all the subordinate kernels sent by their parents, and a thread pool that processes kernels in accordance with the rules outlined in the previous paragraph. A separate pipeline exists for each compute device: there are pipelines for parallel processing, schedule-based processing (periodic and delayed tasks), and a proxy pipeline for processing kernels on other cluster nodes.
+
+In principle, the kernel and pipeline machinery reflects that of procedures and call stacks, with the advantage that kernel methods are called asynchronously and in parallel to each other. The stack, which ordinarily stores local variables, is modelled by the fields of a kernel. The sequence of processor instructions before nested procedure calls is modelled by the act method, and the sequence of processor instructions after the calls is modelled by the react method. The procedure calls themselves are modelled by constructing and sending subordinate kernels to the pipeline. Two methods are necessary because the calls are asynchronous and one must wait until subordinate kernels complete their work. Pipelines allow circumventing active waiting and call the correct kernel methods by analysing their internal state.
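The overview above describes a pipeline as a kernel pool plus a thread pool that calls act asynchronously and routes every finished subordinate back to its parent's react. The sketch below illustrates that structure; all names are hypothetical, and a per-kernel mutex stands in for the "all react calls of one parent happen in the same thread" rule mentioned in the text.

#+begin_src cpp
#include <condition_variable>
#include <deque>
#include <iostream>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

class Pipeline;

// act() computes or decomposes the task, react() handles returned subordinates.
struct Kernel {
    Kernel* parent = nullptr;
    std::mutex state_mutex; // serialises act()/react() calls on one kernel
    virtual ~Kernel() = default;
    virtual void act(Pipeline&) = 0;
    virtual void react(Pipeline&, Kernel*) {}
};

// A pipeline couples a kernel pool with a thread pool: worker threads pop
// kernels, call act() in parallel, and deliver each finished kernel to its
// parent's react().
class Pipeline {
public:
    Kernel* submit(std::unique_ptr<Kernel> k, Kernel* parent = nullptr) {
        std::lock_guard<std::mutex> lock(_mutex);
        k->parent = parent;
        Kernel* raw = k.get();
        _owned.push_back(std::move(k)); // keep kernels alive until shutdown
        _pool.push_back(raw);
        _cv.notify_all();
        return raw;
    }
    void start(unsigned nthreads) {
        for (unsigned i = 0; i < nthreads; ++i)
            _threads.emplace_back([this] { loop(); });
    }
    // Block until the pool is drained and no kernel is running, then stop.
    void wait() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [this] { return _pool.empty() && _running == 0; });
        _stopped = true;
        _cv.notify_all();
        lock.unlock();
        for (auto& t : _threads) t.join();
    }
private:
    void loop() {
        std::unique_lock<std::mutex> lock(_mutex);
        while (true) {
            _cv.wait(lock, [this] { return _stopped || !_pool.empty(); });
            if (_pool.empty()) break; // stopped and nothing left to do
            Kernel* k = _pool.front();
            _pool.pop_front();
            ++_running;
            lock.unlock();
            { // per-kernel mutex stands in for the "same thread" guarantee
                std::lock_guard<std::mutex> g(k->state_mutex);
                k->act(*this);
            }
            if (k->parent) { // return the finished kernel to its parent
                std::lock_guard<std::mutex> g(k->parent->state_mutex);
                k->parent->react(*this, k);
            }
            lock.lock();
            --_running;
            if (_pool.empty() && _running == 0) _cv.notify_all();
        }
    }
    std::deque<Kernel*> _pool;
    std::vector<std::unique_ptr<Kernel>> _owned;
    std::vector<std::thread> _threads;
    std::mutex _mutex;
    std::condition_variable _cv;
    int _running = 0;
    bool _stopped = false;
};

// Demo kernel: report which thread executed it.
struct Hello : Kernel {
    void act(Pipeline&) override {
        std::cout << "kernel executed by thread "
                  << std::this_thread::get_id() << '\n';
    }
};

int main() {
    Pipeline ppl;
    ppl.start(2);
    for (int i = 0; i < 4; ++i) ppl.submit(std::make_unique<Hello>());
    ppl.wait();
    return 0;
}
#+end_src

This sketch covers only the parallel-processing case; as the text notes, the package also provides pipelines for schedule-based processing and a proxy pipeline for kernels destined to other cluster nodes.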
+*** Governing principles
+- There are no messages in the model, a kernel is itself a message that can be sent over the network to another node and directly access any kernel on the local node. It is the responsibility of a programmer to ensure that such a kernel exists.
+- A kernel is a /cooperative routine/, which is submitted to the task pool upon the call and is executed asynchronously by a scheduler. There can be any number of calls to other subroutines inside the routine body. Every call submits the corresponding subroutine to the task pool and returns immediately. Kernels in the pool can be executed in any order; this fact is used by the scheduler to exploit parallelism offered by the computer by distributing kernels from the pool across available cluster nodes and processor cores.
+- Asynchronous execution prevents the use of explicit synchronisation after the call to a subroutine is made; the scheduler returns control flow to the routine each time one of its subroutines returns. Such *cooperation* transforms each routine which calls subroutines into an event handler, where each event is a subroutine and the handler is the routine that called them. In many batch processing programmes control flow enters each routine which calls subroutines at least twice: the first time upon the call to the routine, and the second time when control flow returns to the caller after completion of a subroutine.
+- The routine may communicate with any number of local kernels whose addresses it knows; communication with routines which are not adjacent in the call stack complicates control flow, and the call stack loses its tree shape. It is the responsibility of a programmer to ensure that communicating kernels are present in memory. One way to ensure this is to perform communication between subroutines which are called from the same routine. Incidentally, this is the usual way of writing batch parallel programmes: each such communication creates a cycle in the call stack graph, and a cycle between different (possibly non-adjacent) layers of the kernel hierarchy is redundant, because there are other edges that can be used instead. The situation may be different when the programme is interactive or event-based.
+- The other disadvantage of communication which does not occur along hierarchical links is that it complicates resiliency algorithms when executed across the cluster network: it is difficult to ensure that a kernel resides in the memory of a neighbour node, because the node may fail in the middle of its execution. Thus, upon failure of a node all of the subroutines which are called from the same routine must be restarted. This encourages a programmer to construct
+  - deep hierarchies of tightly-coupled kernels (which require communication on the same level of hierarchy) to reduce the overhead of recomputation,
+  - fat hierarchies of loosely-coupled kernels.
+  Deep hierarchy is not only a requirement of the technology, it helps optimise communication of cluster nodes by limiting it to adjacent nodes.
+- No explicit synchronisation.
+- Local communications between adjacent nodes in the hierarchy.
+
 ** SMP implementation
 *** Load balancing algorithm
 *** Evaluation
@@ -1905,7 +1948,6 @@ arma.plot_factory_vs_openmp_overlap(
 #+RESULTS: fig:factory-overlap
 [[file:build/factory-vs-openmp-overlap.pdf]]
-
 ** MPP implementation
 *** Cluster node discovery algorithm
 *** Fail over algorithm
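The first governing principle states that there are no messages: the kernel itself is the message, travelling over the network and carrying the state of its control flow branch in its fields. A minimal sketch of what this means in code is given below; Spectrum_kernel and its fields are hypothetical, and a stringstream stands in for the network connection between two cluster nodes.

#+begin_src cpp
#include <iostream>
#include <sstream>

// Hypothetical kernel that carries its own state across the network:
// the same object is both the unit of control flow and the message.
struct Spectrum_kernel {
    int part = 0;           // index of the wavy surface part (example field)
    double variance = 0.0;  // accumulated variance (example field)

    // Serialise fields into a stream before sending the kernel to another node.
    void write(std::ostream& out) const {
        out << part << ' ' << variance;
    }
    // Restore fields after the kernel has been received on the other node.
    void read(std::istream& in) {
        in >> part >> variance;
    }
};

int main() {
    Spectrum_kernel k{3, 1.5};
    std::stringstream wire;   // stands in for a network connection
    k.write(wire);            // "send" the kernel

    Spectrum_kernel received;
    received.read(wire);      // "receive" it on the destination node
    std::cout << "part=" << received.part
              << " variance=" << received.variance << '\n';
    return 0;
}
#+end_src

Because the object that is sent is the same object that later resumes execution on the destination node, no separate message format is needed; serialising the kernel's fields is enough.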