Factory: Master Node High-Availability for Big Data Applications and Beyond

\begin{abstract}
Master node fault-tolerance is a topic that is often overlooked in discussions
of big data processing technologies. Although failure of a master node can take
down the whole data processing pipeline, such a failure is considered either
too improbable or too difficult to handle. The aim of the study reported here
is to propose a rather simple technique to deal with master node failures. This
technique is based on temporarily delegating the master role to one of the
slave nodes and transferring the updated state back to the master when one step
of the computation is complete. That way, the state is duplicated and the
computation can proceed to the next step regardless of a failure of the
delegate or the master (but not both). We run benchmarks to show that a failure
of the master is almost ``invisible'' to other nodes, and that a failure of the
delegate results in recomputation of only one step of the data processing
pipeline. We believe that the technique can be used not only in Big Data
processing but also in other types of applications.

\keywords{parallel computing $\cdot$ Big Data processing $\cdot$ distributed
  computing $\cdot$ backup node $\cdot$ state transfer $\cdot$ delegation
  $\cdot$ cluster computing $\cdot$ fault-tolerance}
\end{abstract}