hpcs-17-subord

git clone https://git.igankevich.com/hpcs-17-subord.git

commit 48ca5657878d0a48fa8835e9077aa079d128ffa3
parent e0ffa648d2629d7fc16f02042af1ff9e0dfe1783
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Fri, 12 May 2017 16:12:32 +0300

Add a diagram of the system architecture.

Diffstat:
.gitignore | 1+
Makefile | 4++++
dot/ppl.dot | 103+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
src/body.tex | 13+++++++++++--
4 files changed, 119 insertions(+), 2 deletions(-)

diff --git a/.gitignore b/.gitignore
@@ -2,3 +2,4 @@ build
 *~
 Rplots.pdf
 *-converted-to.pdf
+/dot/*.svg
diff --git a/Makefile b/Makefile
@@ -3,6 +3,7 @@ build/sc12.pdf \
 build/sc1.pdf \
 build/sc2.pdf \
 build/sc3.pdf \
+build/ppl.pdf \
 build/test-1-phys.pdf \
 build/test-1-virt.pdf \
 build/test-2-phys.pdf \
@@ -42,6 +43,9 @@ build/sc%.pdf: img/sc%.svg
 		--file=$< \
 		--export-pdf=$@

+build/ppl.pdf: dot/ppl.dot
+	dot -Tpdf -o $@ $<
+
 build:
 	mkdir -p build

diff --git a/dot/ppl.dot b/dot/ppl.dot
@@ -0,0 +1,103 @@
+graph Pipeline {
+
+	node [fontname="Times",fontsize=8,margin="0.01,0.01",shape=box,height="0.1",width="0.1"]
+	graph [fontname="Times",fontsize=8,nodesep="0.07",ranksep="0.05",rankdir="LR",margin="-0.1,-0.1"]
+	edge [arrowsize=0.66]
+
+	subgraph cluster_daemon {
+		label="Daemon process"
+		style=filled
+		color=lightgrey
+
+		factory [label="Factory"]
+		parallel_ppl [label="Parallel\npipeline"]
+		io_ppl [label="I/O\npipeline"]
+		sched_ppl [label="Schedule-based\npipeline"]
+		net_ppl [label="Network\npipeline"]
+		proc_ppl [label="Process\npipeline"]
+
+		upstream [label="Upstream\nthread pool"]
+		downstream [label="Downstream\nthread pool"]
+	}
+
+	factory--parallel_ppl
+	factory--io_ppl
+	factory--sched_ppl
+	factory--net_ppl
+	factory--proc_ppl
+
+	subgraph cluster_hardware {
+		label="Compute devices"
+		style=filled
+		color=lightgrey
+
+		cpu [label="CPU"]
+		core0 [label="Core 0"]
+		core1 [label="Core 1"]
+		core2 [label="Core 2"]
+		core3 [label="Core 3"]
+
+		storage [label="Storage"]
+		disk0 [label="Disk 0"]
+
+		network [label="Network"]
+		nic0 [label="NIC 0"]
+
+		timer [label="Timer"]
+
+	}
+
+	core0--cpu
+	core1--cpu
+	core2--cpu
+	core3--cpu
+
+	disk0--storage
+	nic0--network
+
+	parallel_ppl--upstream
+	parallel_ppl--downstream
+
+	upstream--{core0,core1,core2,core3} [style="dashed"]
+	downstream--core0 [style="dashed"]
+
+	io_ppl--core0 [style="dashed"]
+	io_ppl--disk0 [style="dashed"]
+	sched_ppl--core0 [style="dashed"]
+	sched_ppl--timer [style="dashed"]
+	net_ppl--core0 [style="dashed"]
+	net_ppl--nic0 [style="dashed"]
+	proc_ppl--core0 [style="dashed"]
+
+	subgraph cluster_children {
+		style=filled
+		color=white
+
+		subgraph cluster_child0 {
+			label="Child process 0"
+			style=filled
+			color=lightgrey
+			labeljust=right
+
+			app0_factory [label="Factory"]
+			app0 [label="Child process\rpipeline"]
+		}
+
+#		subgraph cluster_child1 {
+#			label="Child process 1"
+#			style=filled
+#			color=lightgrey
+#			labeljust=right
+#
+#			app1_factory [label="Factory"]
+#			app1 [label="Child process\rpipeline"]
+#		}
+	}
+
+	proc_ppl--app0
+#	proc_ppl--app1
+
+	app0_factory--app0 [constraint=false]
+#	app1_factory--app1 [constraint=false]
+
+}
diff --git a/src/body.tex b/src/body.tex
@@ -1,6 +1,7 @@
 \section{System architecture}

-Our model of computer system has layered architecture:
+Our model of computer system has layered architecture
+(fig.~\ref{fig:pipeline}):

 \paragraph{Physical layer} Consists of nodes and direct/routed physical network
 links. On this layer full network connectivity, i.e. an ability to send
@@ -12,7 +13,7 @@ roles are dynamically assigned to daemon processes, any physical cluster node
 may become a master or a slave. Dynamic reassignment uses leader election
 algorithm that does not require periodic broadcasting of messages, and the role
 is derived from node's IP address. Detailed explanation of the algorithm is
-provided in~\cite{gankevich2015subordination}. Its strengths is scalability to
+provided in~\cite{gankevich2015subordination}. Its strengths are scalability to
 a large number of nodes and low overhead, which are essential for large-scale
 high-performance computations, and its weakness is in artificial dependence of
 node's position in the hierarchy on its IP address, which is not desirable in
@@ -79,6 +80,14 @@ are necessary because calls are asynchronous and one must wait before
 subordinate kernels complete their work. Pipelines allow circumventing active
 wait, and call correct kernel methods by analysing their internal state.

+\begin{figure}%
+	\centering%
+	\includegraphics{ppl}%
+	\caption{Mapping of parent and child process pipelines to compute devices.
+	Solid lines denote aggregation, dashed lines denote mapping between
+	logical and physical entities.\label{fig:pipeline}}
+\end{figure}%
+
 \section{Resilience to multiple node failures}

 In our system a node is considered failed if the corresponding network
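
The dashed edges in dot/ppl.dot (the logical-to-physical mapping named in the figure caption) can also be read as a plain lookup table. The Python sketch below is only an illustration and not part of the repository: it copies the dashed edges by hand and inverts them to show which logical entities land on each device. The names MAPPING, users_by_device and USERS are hypothetical; only the node identifiers come from the dot file.

```python
from collections import defaultdict

# Dashed edges copied by hand from dot/ppl.dot:
# logical entity -> physical devices it is mapped onto.
MAPPING = {
    "upstream": ["core0", "core1", "core2", "core3"],
    "downstream": ["core0"],
    "io_ppl": ["core0", "disk0"],
    "sched_ppl": ["core0", "timer"],
    "net_ppl": ["core0", "nic0"],
    "proc_ppl": ["core0"],
}


def users_by_device(mapping):
    """Invert the mapping: device -> sorted list of entities mapped onto it."""
    users = defaultdict(list)
    for entity, devices in mapping.items():
        for device in devices:
            users[device].append(entity)
    return {device: sorted(names) for device, names in users.items()}


# Hypothetical helper result used for inspection.
USERS = users_by_device(MAPPING)

if __name__ == "__main__":
    for device in sorted(USERS):
        print(device, "<-", ", ".join(USERS[device]))
```

Read this way, the diagram maps every pipeline except the upstream thread pool onto core 0 alone, while the upstream pool spans all four cores.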