hpcs-17-subord

git clone https://git.igankevich.com/hpcs-17-subord.git

commit eb2c73c6f6ef54e4bc257fe8de765e84515e15e6
parent 4039eab5220586126f8e4878b1b423092e7764b1
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Mon, 15 May 2017 18:02:32 +0300

Add reference to the source code. Add algorithm (comment).

Diffstat:
bib/refs.bib | 8+++++++-
main.tex | 1+
src/body.tex | 21+++++++++++++++------
src/tail.tex | 2+-
4 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/bib/refs.bib b/bib/refs.bib
@@ -69,7 +69,7 @@
   booktitle={Journal of Physics: Conference Series},
   volume={78},
   number={1},
-  pages={012022},
+  pages={12--22},
   year={2007},
   organization={IOP Publishing}
 }
@@ -109,3 +109,9 @@
   year={2005},
   publisher={Elsevier}
 }
+
+@misc{factoryGithub,
+  title={Factory: A framework for distributed computing},
+  author={Ivan Gankevich and Yuri Tipikin},
+  howpublished={\url{https://igankevich.github.io/factory/index.html}}
+}
diff --git a/main.tex b/main.tex
@@ -6,6 +6,7 @@
 %\usepackage{fixltx2e}
 \usepackage{url}
 \usepackage{booktabs}
+\usepackage[ruled]{algorithm2e}
 
 %\hyphenation{op-tical net-works semi-conduc-tor}
diff --git a/src/body.tex b/src/body.tex
@@ -233,6 +233,16 @@
 computational step, modelled by the principal, is re-executed from the initial
 state, and there is no simple and reliable way of taking into account partial
 results which were produced so far by the subordinates.
+%\begin{algorithm}
+%  \KwData{$s$ --- subordinate kernel, $result$ --- \texttt{send} status.}
+%  \While{neighbours list is not empty \textnormal{\textbf{and}} $result\neq0$}{
+%    $n \leftarrow \text{neighbours.front()}$\\
+%    $result \leftarrow$ Send $s$ with the copy of its principal to $n$.
+%  }\\
+%  Delete $s$.
+%  \caption{An algorithm for fail over.\label{alg:failover}}
+%\end{algorithm}
+
 %\begin{figure}
 %  \centering
 %  \includegraphics{sc12}
@@ -260,9 +270,8 @@
 acts the same as in the first scenario, when we move to daemon hierarchy one
 more possible variant is added. In deep kernel hierarchy a kernel may act as a
 subordinate and as a principal at the same time. Thus, we need to copy not
 only direct principal of each subordinate kernel, but also all principals
-higher in the hierarchy recursively (fig.~\ref{fig:sc3}). So, the additional
-variant is a generalisation of the two previous ones for deep kernel
-hierarchies.
+higher in the hierarchy recursively. So, the additional variant is a
+generalisation of the two previous ones for deep kernel hierarchies.
 
 Handling principal failure in a deep kernel hierarchy may involve a lot of
 overhead, because its failure is detected only when a subordinate finishes its
@@ -328,9 +337,9 @@
 of recovery process, the whole process is restarted from the beginning.
 
 \section{Evaluation}
 
 Proposed node failure handling approach was evaluated on the example of
-real-world application. The application generates ocean wavy surface in
-parallel with specified frequency-directional spectrum. There are two
-sequential steps in the programme. The first step is to compute model
+real-world application~\cite{factoryGithub}. The application generates ocean
+wavy surface in parallel with specified frequency-directional spectrum. There
+are two sequential steps in the programme. The first step is to compute model
 coefficients by solving system of linear algebraic equations. The system is
 solved in parallel on all cores of the principal node. The second step is to
 generate wavy surface, parts of which are generated in parallel on all cluster
diff --git a/src/tail.tex b/src/tail.tex
@@ -51,7 +51,7 @@
 kernels upon a failure.
 
 With respect to various high-availability cluster
 projects~\cite{robertson2000linux,haddad2003ha,leangsuksun2005achieving} our
-approach has the following advantages. First, is scales with the large number
+approach has the following advantages. First, it scales with the large number
 of nodes, as only point-to-point communication between slave and master node
 is used instead of broadcast messages (which has been shown in the previous
 work~\cite{gankevich2015subordination}), hence, the use of several switches and