commit e7932b0622313f64c15e623ae065e732b921f46d
parent 5f27624c5dcf93e3854ef227ef247beff062cb12
Author: Ivan Gankevich <igankevich@ya.ru>
Date: Fri, 24 Mar 2017 19:49:55 +0300
Revise the first failure scenario.
Diffstat:
src/body.tex | 135 +++++++++++++++++++++++++++++++++++++++++++----------------------------------
1 file changed, 77 insertions(+), 58 deletions(-)
diff --git a/src/body.tex b/src/body.tex
@@ -103,77 +103,96 @@ one node to another (local kernels are not considered).
\subsection{Failure scenarios}
\label{sec:failure-scenarios}
-Now we discuss failure scenarios and how scheduler can handle it. First, define
-clearly relations between sets of daemons and kernels. We named such relations
-in different manner to avoid misunderstanding, because order rule itself is the
-same. There are two intersection hierarchies, the horizontal one~--- daemons
-hierarchy, and vertical one~--- kernels hierarchy. In horizontal
-daemon-to-daemon hierarchy relations defined as master-slave. Thus, node (and,
-accordingly, its daemon) with the nearest IP address to gateway will be a
-master, and every other node will be a slave. This master-slave hierarchy
-introduced to scheduler for better kernels distribution. Vertical hierarchy of
-kernels organized in principal-to-subordinate order. Principal kernel produce
-subordinates and so provides task atomization to archive fault tolerance.
-
-The main purpose of scheduler is to continue or restore execution while failures
-occur in daemons hierarchy. There are three types of such failures.
+To disambiguate hierarchical links between daemon processes and kernels, we
+will use the following naming conventions throughout the text. If the link is
+between two daemon processes, the relationship is \emph{master-slave}. If the
+link is between two kernels, then the relationship is
+\emph{principal-subordinate} (or \emph{parent-child}). The two hierarchies are
+orthogonal to each other in the sense that no kernel may have a link to a
+daemon, and vice versa. Since the daemon hierarchy is used to distribute the
+load on the cluster, the kernel hierarchy is mapped onto it, and this mapping
+can be arbitrary. It is common to have a principal kernel on a slave node with
+its subordinate kernels distributed evenly between all cluster nodes
+(including the node where the principal is located). Both hierarchies can be
+arbitrarily deep, but ``shallow'' ones are preferred for highly parallel
+programmes, as they require fewer hops when kernels are distributed between
+cluster nodes. Since there is a one-to-one correspondence between daemons and
+cluster nodes, the two terms are used interchangeably in the paper.
+
+The main purpose of the system is to provide continuous execution of kernels in
+the presence of daemon (and consequently node) failures. There are three types
+of such failures.
\begin{enumerate}
- \item Failure of at most one node.
- \item Failure of more than one node but less than total number of nodes.
- \item Failure of all nodes (electricity outage).
+ \item Simultaneous failure of at most one node.
+	\item Simultaneous failure of more than one node but less than the total
+	number of nodes.
+ \item Simultaneous failure of all nodes (electricity outage).
\end{enumerate}
+For the sake of simplicity, it is assumed that the parallel programme runs on
+all cluster nodes.
-By dividing kernels on principals and subordinate we create restore points. Each
-principal is, mainly, a control unit, with a goal. To archive it, principal make
-portion of task and delegates parts to subordinates. With such delegation
-principal copies itself to each subordinate in order of appearance. To ensure
-correct restoration, when the new partition is ready to deploy as new
-subordinate, principal include in that kernel information about all previously
-generated subordinates, expressed as ordered list of daemons address where
-subordinates transferred. So, then we discuss about failures, we mean that
-daemon is gone, and all kernels of all types at this node break their tasks
-execution process. To resolve failed states scheduler restore kernels using
-existing or newly appeared daemons accordingly to each mentioned scenarios.
-
-Consider the first scenario. In accordance to principal-to-subordinate
-hierarchy, there are two variants of this failure: then principal is gone and
-then any subordinate is gone. Subordinate itself is not a valuable part of
-execution, it is a simple worker. Our scheduler not stored any subordinate
-states, but only principle state. Thus, to restore execution, scheduler finds
-last valid principle state and simply recreate~\ref{fig:sc12} failed
-subordinate on most appropriate daemon. When principle is gone~\ref{fig:sc1} we
-need to restore it only once and only on one node. To archive this limitation,
-each subordinate will try to find any available daemon from its addresses list
-in reverse order. If such daemon exists and available, finding process will
-stop, as current subordinate kernel will assume that the kernel found will take
-principal restoration process.
+By dividing kernels into principals and subordinates we create recovery
+points. Each principal is, first of all, a control unit with a goal. To
+achieve this goal, the principal divides the task into parts and creates a
+subordinate to compute each of them. The principal copies itself to each
+subordinate in the order of their creation, and includes in each subordinate a
+list of all node IP addresses to which previously created subordinates were
+sent (a list of \emph{neighbours}). When a daemon fails or goes offline, all
+kernels which reside on the corresponding cluster node are considered failed,
+and the recovery process is triggered in accordance with the following
+scenarios.
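+
+A minimal sketch of this neighbour bookkeeping is given below (in C++; all
+names are hypothetical and do not refer to the actual implementation): each
+newly created subordinate receives the addresses of the nodes to which the
+previously created subordinates were sent, together with a copy of the
+principal.
+\begin{verbatim}
+// Illustrative sketch only; names and data structures are hypothetical.
+#include <iostream>
+#include <string>
+#include <vector>
+
+// A subordinate as created by its principal: the node it is sent to
+// and the nodes of all previously created subordinates (the copy of
+// the principal state is omitted for brevity).
+struct Subordinate {
+    std::string node;
+    std::vector<std::string> neighbours;
+};
+
+int main() {
+    std::vector<std::string> nodes{"10.0.0.1", "10.0.0.2", "10.0.0.3"};
+    std::vector<Subordinate> sent;
+    std::vector<std::string> neighbours;
+    for (const auto& n : nodes) {
+        // every new subordinate knows the nodes that already hold
+        // a copy of the principal
+        sent.push_back(Subordinate{n, neighbours});
+        neighbours.push_back(n);
+    }
+    for (const auto& s : sent) {
+        std::cout << s.node << " knows " << s.neighbours.size()
+                  << " earlier neighbour(s)\n";
+    }
+    return 0;
+}
+\end{verbatim}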
+
+\paragraph*{Scenario~1} With respect to the kernel hierarchy, there are two
+possible variants of this failure: a failure of a principal and a failure of a
+subordinate; in both variants the principal and the subordinate may or may not
+reside on the same cluster node.
+
+Since a subordinate is a simple worker rather than a valuable part of the
+execution, a copy of it (in its initial state) is simply restored from the
+node where its parent is located (Figure~\ref{fig:subordinate-fails}). Each
+daemon maintains a list of kernels that were sent to a particular subordinate
+node. When the corresponding network connection closes, all kernels from the
+list are automatically re-sent to an available node, or executed locally if
+there are no network connections left.
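+
+The retransmission logic can be sketched as follows (C++; the names are
+hypothetical and the sketch only illustrates the idea, not the actual
+implementation): kernels queued for a failed node are moved to another
+connection, or executed locally when no connections remain.
+\begin{verbatim}
+// Illustrative sketch only; names and data structures are hypothetical.
+#include <deque>
+#include <iostream>
+#include <map>
+#include <string>
+
+struct Kernel { int id; };
+
+// Called when the connection to a subordinate node closes: re-send
+// its queued kernels to a surviving node, or run them locally.
+void on_connection_closed(const std::string& failed_node,
+    std::map<std::string, std::deque<Kernel>>& outbox)
+{
+    auto it = outbox.find(failed_node);
+    if (it == outbox.end()) return;
+    std::deque<Kernel> orphans = std::move(it->second);
+    outbox.erase(it);
+    if (!outbox.empty()) {
+        // re-route to any surviving node (here simply the first one)
+        auto& target = outbox.begin()->second;
+        for (auto& k : orphans) target.push_back(k);
+    } else {
+        // no network connections left: execute the kernels locally
+        for (auto& k : orphans)
+            std::cout << "running kernel " << k.id << " locally\n";
+    }
+}
+
+int main() {
+    std::map<std::string, std::deque<Kernel>> outbox;
+    outbox["10.0.0.2"] = {Kernel{1}, Kernel{2}};
+    outbox["10.0.0.3"] = {Kernel{3}};
+    on_connection_closed("10.0.0.2", outbox);
+    std::cout << "node 10.0.0.3 now holds "
+              << outbox["10.0.0.3"].size() << " kernel(s)\n";
+    return 0;
+}
+\end{verbatim}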
+
+When a principal fails, every subordinate holds a copy of it, but the copy
+must be restored only once and on only one node to correctly continue
+programme execution. To ensure that the principal is restored only once, each
+subordinate tries to find the first surviving node from the list of IP
+addresses of its neighbours (Figure~\ref{fig:principal-fails}). If such a node
+is online, the search stops and the subordinate kernel is deleted. If no such
+node is found, the subordinate restores the principal from the copy and
+deletes itself. Kernel deletion is necessary, because the whole computational
+step, modelled by the principal, is re-executed from the principal's initial
+state, and there is no simple and reliable way of taking into account the
+results which were produced so far by the subordinates.
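+
+The decision of which surviving subordinate restores the principal can be
+sketched as follows (C++; hypothetical names, assuming each subordinate can
+query whether a neighbour node is online): only the subordinate whose list
+contains no surviving earlier neighbour performs the restoration.
+\begin{verbatim}
+// Illustrative sketch only; names and data structures are hypothetical.
+#include <functional>
+#include <iostream>
+#include <string>
+#include <vector>
+
+// A subordinate scans its neighbour list; if any earlier copy of the
+// principal survives, this subordinate deletes itself, otherwise it
+// restores the principal from its own copy.
+bool should_restore_principal(
+    const std::vector<std::string>& neighbours,
+    const std::function<bool(const std::string&)>& is_online)
+{
+    for (const auto& node : neighbours) {
+        if (is_online(node))
+            return false; // an earlier neighbour handles restoration
+    }
+    return true; // no earlier copy survived
+}
+
+int main() {
+    std::vector<std::string> neighbours{"10.0.0.1", "10.0.0.2"};
+    auto is_online = [](const std::string& node) {
+        return node == "10.0.0.2"; // pretend only this node survived
+    };
+    std::cout << (should_restore_principal(neighbours, is_online)
+                      ? "restore principal\n"
+                      : "delete self\n");
+    return 0;
+}
+\end{verbatim}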
\begin{figure}
\centering
- \includegraphics{sc1}
- \caption{First scenario of restoration after principle fails.}
- \label{fig:sc1}
+ \includegraphics{sc12}
+ \caption{Restoration after only one subordinate fails.}
+ \label{fig:subordinate-fails}
\end{figure}
\begin{figure}
\centering
- \includegraphics{sc12}
- \caption{Restoration after only one subordinate fails.}
- \label{fig:sc12}
+ \includegraphics{sc1}
+	\caption{First scenario of restoration after the principal fails.}
+ \label{fig:principal-fails}
\end{figure}
-In comparison with first scenario, the second one is more complicate yet
-frequent. While on principal-to-subordinate layer scheduler act
-same~\ref{fig:sc12}, then we move to daemons layer one more variant added. In
-kernel hierarchy principal kernels mostly a dual kernel. For a higher level
-kernels it seems like a subordinate, for rest lower kernels it is a principal.
-Thus, we need to add to our restoration scope only a state of principals
-principle. As a result, we add to variants from first scenario situation,a one
-where principals principal also is gone. Since scheduler through daemons knew
-all kernels state before it begin a restoration process, first it will check
-state of principals principle. If it's gone~\ref{fig:sc12}, all subordinates
-will be started accordingly to hierarchy once again, despite their states.
+\paragraph*{Scenario~2} In comparison to the first scenario, the second one is
+more complicated, yet more frequent. On the principal-subordinate level the
+system acts in the same way as before (Figure~\ref{fig:subordinate-fails}),
+but on the daemon level one more variant is added. In the kernel hierarchy
+most principal kernels play a dual role: to a higher-level kernel they appear
+as a subordinate, while to lower-level kernels they act as a principal. Thus,
+the state of the principal's principal also has to be included in the
+restoration scope. As a result, in addition to the variants of the first
+scenario, there is a situation where the principal's principal is gone as
+well. Since the system (through its daemons) knows the state of every kernel
+before the restoration process begins, it first checks the state of the
+principal's principal. If it is gone (Figure~\ref{fig:subordinate-fails}), all
+subordinates are started again in accordance with the hierarchy, regardless of
+their states.
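+
+A minimal decision sketch for this case is given below (C++; hypothetical
+names, not the actual implementation): if the principal's principal is still
+alive, only the failed principal is restored as in the first scenario;
+otherwise every subordinate is re-created regardless of its state.
+\begin{verbatim}
+// Illustrative sketch only; names and data structures are hypothetical.
+#include <iostream>
+#include <string>
+#include <vector>
+
+struct KernelState { std::string node; bool alive; };
+
+// Recover a failed principal: restore it alone if its own principal
+// survives, otherwise restart all of its subordinates as well.
+void recover(const KernelState& principals_principal,
+    const std::vector<KernelState>& subordinates)
+{
+    if (principals_principal.alive) {
+        std::cout << "restore the failed principal only\n";
+        return;
+    }
+    for (const auto& s : subordinates)
+        std::cout << "re-create subordinate on " << s.node << "\n";
+}
+
+int main() {
+    KernelState principals_principal{"10.0.0.1", false};
+    std::vector<KernelState> subordinates{
+        {"10.0.0.3", true}, {"10.0.0.4", false}};
+    recover(principals_principal, subordinates);
+    return 0;
+}
+\end{verbatim}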
\begin{figure}
\centering