hpcs-17-subord

git clone https://git.igankevich.com/hpcs-17-subord.git

commit 53ba4181703f28fa05f0b59a9b135e25994d8872
parent dca290b508fbf2d47453733a8d52c1dc2b260db5
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Thu, 16 Feb 2017 19:43:49 +0300

Run spell-check.

Diffstat:
src/body.tex | 53 +++++++++++++++++++++++++++--------------------------
1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/src/body.tex b/src/body.tex
@@ -95,19 +95,19 @@ Java.
 \label{sec:failure-scenoarios}
 
 Now we discuss failure scenarios and how scheduler can handle it. First, define
-clearly relations between sets of deamons and kernels. We named such relations
-in diffrent manner to avoid misunderstanding, becouse order rule itself is the
-same. There are two intersection hierarcies, the horizontal one --- daemons
-hierarchy, and vertical one --- kernels hierarchy. In horizontal
+clearly relations between sets of daemons and kernels. We named such relations
+in different manner to avoid misunderstanding, because order rule itself is the
+same. There are two intersection hierarchies, the horizontal one~--- daemons
+hierarchy, and vertical one~--- kernels hierarchy. In horizontal
 daemon-to-daemon hierarchy relations defined as master-slave. Thus, node (and,
 accordingly, its daemon) with the nearest IP address to gateway will be a
 master, and every other node will be a slave. This master-slave hierarchy
-introduced to scheduler for better kernels distribtion. Vertical hierarchy of
+introduced to scheduler for better kernels distribution. Vertical hierarchy of
 kernels organized in principal-to-subordinate order. Principal kernel produce
-subordinates and so provides task atomization to acrhive fault tolerance.
+subordinates and so provides task atomization to archive fault tolerance.
 
 The main purpose of scheduler is to continue or restore execution while failures
-occure in daemons hierarchy. There are three types of such failures.
+occur in daemons hierarchy. There are three types of such failures.
 
 \begin{itemize}
     \item Failure of at most one node.
@@ -115,24 +115,24 @@ occure in daemons hierarchy. There are three types of such failures.
     \item Failure of all nodes (electricity outage).
 \end{itemize}
 
-By diveding kernels on principals and subordinate we create restore points. Each
+By dividing kernels on principals and subordinate we create restore points. Each
 principal is, mainly, a control unit, with a goal. To archive it, principal make
-portion of task and deligates parts to subordinates. With such deligation
-principal copys itself to each subordinate in order of appearence. To ensure
+portion of task and delegates parts to subordinates. With such delegation
+principal copies itself to each subordinate in order of appearance. To ensure
 correct restoration, when the new partition is ready to deploy as new
-subordinate, principal include in that kernel information about all previosly
-generated subordinates, expressed as ordered list of daemons address where subordinates
-transfered. So, then we discuss about failures, we mean that daemon is gone, and
-all kernels of all types at this node break their tasks execution process. To
-resolve failed states scheduler restore kernels using exisitng or newly appeared
-daemons accordingly to each mentioned scenarios.
+subordinate, principal include in that kernel information about all previously
+generated subordinates, expressed as ordered list of daemons address where
+subordinates transferred. So, then we discuss about failures, we mean that
+daemon is gone, and all kernels of all types at this node break their tasks
+execution process. To resolve failed states scheduler restore kernels using
+existing or newly appeared daemons accordingly to each mentioned scenarios.
 
 Consider first scenario. In accordance to principal-to-subordinate hierarchy,
 there are two variants of this failure: then principal was gone and then any
-subordinate was gone. Subordinate itself is not a valueble part of execution, it
+subordinate was gone. Subordinate itself is not a valuable part of execution, it
 is a simple worker. Our scheduler not stored any of subordinate, but only
 principle state. Thus, to restore execution, scheduler use principle to simply
-recrate failed subordinate on most appropriate daemon. When principle is gone we
+recreate failed subordinate on most appropriate daemon. When principle is gone we
 need to restore it only once and only on one node. To archive this limitation,
 each subordinate will try to find any available daemon from addresses list in
 reverse order. If such daemon exists and available, finding process will stop,
@@ -140,7 +140,8 @@ as current subordinate kernel will assume the found kernel will take principal
 restoration process.
 
 \begin{itemize}
-    \item
+    \item
+\end{itemize}
 
 \section{Evaluation}
 
@@ -150,26 +151,26 @@ restoration process.
 
 There are two scenarios of failures. Failure of more than one node at a time
 and electricity outage. In the first scenario failure is handled by sending a
-list previous ip addresses to the subsequent kernels in the batch. Then if
+list previous IP addresses to the subsequent kernels in the batch. Then if
 subordinate node and its master fail simultaneously, the surviving subordinate
-nodes scan all of the ip addresses they received until they find alive node and
+nodes scan all of the IP addresses they received until they find alive node and
 parent is revived on this node.
 
-We believe that kernel coordinates and inter dependecies is enough to mitigate
+We believe that kernel coordinates and inter dependencies is enough to mitigate
 any type of failure: given that at least one node survives, all applications
-continue their ececution in possibly degraded state. However it requires
+continue their execution in possibly degraded state. However it requires
 duplicating all parents in the hierarchy on the subordinate node. Only
 electricity outage requires writing data to disk other failures can be mitigated
 by duplicating kernels in memory.
 
 The only purpose of kernel hierarchy is to provide fail over for kernels. The
-only purpose of daemon heirarchy is to provide load balancing and automatically
-reconfigurable topology and to reduce the number of sinultaneous connections.
+only purpose of daemon hierarchy is to provide load balancing and automatically
+reconfigurable topology and to reduce the number of simultaneous connections.
 This topology reduces the number of simultaneous connections, thus preventing
 network overload. This topology is used to distribute the load from the current
 node to its neighbours by simply iterating over all directly connected daemons.
 
-Transmitting ip addresses of previous nodes is an optimisation over mapping to
+Transmitting IP addresses of previous nodes is an optimisation over mapping to
 only linear hierarchies, that is hierarchies where only one subordinate is
 allowed at any given point of time.
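
The recovery rule described in the changed text (each subordinate scans its list of daemon addresses in reverse order and stops at the first available daemon) can be sketched roughly as follows. This is only an illustration and not the paper's implementation: the function name `find_restore_daemon`, the `is_alive` predicate and the sample addresses are assumptions introduced here, and the text does not say how availability is actually checked.

```python
from typing import Callable, Optional, Sequence


def find_restore_daemon(
    addresses: Sequence[str],
    is_alive: Callable[[str], bool],
) -> Optional[str]:
    """Return the first available daemon, scanning the list in reverse order.

    `addresses` stands for the ordered list of daemon addresses that a
    subordinate received from its principal. `is_alive` is a hypothetical
    availability probe supplied by the caller.
    """
    for address in reversed(addresses):
        if is_alive(address):
            # Stop at the first hit, as the text describes: the daemon
            # found here is assumed to take over principal restoration.
            return address
    return None  # no daemon from the list is reachable


# Hypothetical example: the daemon at 10.0.0.3 has failed.
alive = {"10.0.0.1", "10.0.0.2"}
chosen = find_restore_daemon(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    lambda address: address in alive,
)
print(chosen)
```

Passing the probe in as a parameter keeps the sketch independent of any particular networking code; a real scheduler would presumably replace it with an actual connection attempt.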