iccsa-17-ascheduler

Distributed Data Processing on Microcomputers with Ascheduler and Apache Spark
git clone https://git.igankevich.com/iccsa-17-ascheduler.git

asched-iccsa17.tex (29213B)


      1 
      2 %%%%%%%%%%%%%%%%%%%%%%% file typeinst.tex %%%%%%%%%%%%%%%%%%%%%%%%%
      3 %
      4 % This is the LaTeX source for the instructions to authors using
      5 % the LaTeX document class 'llncs.cls' for contributions to
      6 % the Lecture Notes in Computer Sciences series.
      7 % http://www.springer.com/lncs       Springer Heidelberg 2006/05/04
      8 %
      9 % It may be used as a template for your own input - copy it
     10 % to a new file with a new name and use it as the basis
     11 % for your article.
     12 %
     13 % NB: the document class 'llncs' has its own and detailed documentation, see
     14 % ftp://ftp.springer.de/data/pubftp/pub/tex/latex/llncs/latex2e/llncsdoc.pdf
     15 %
     16 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     17 
     18 
     19 \documentclass[runningheads,a4paper]{llncs}
     20 
     21 \usepackage{amssymb}
     22 \setcounter{tocdepth}{3}
     23 \usepackage{graphicx}
     24 \usepackage{booktabs}
     25 \usepackage{numprint}
     26 \usepackage{url}
     27 \usepackage{cite}
     28 %\urldef{\mailsa}\path|{alfred.hofmann, ursula.barth, ingrid.haas, frank.holzwarth,|
     29 %\urldef{\mailsb}\path|anna.kramer, leonie.kunz, christine.reiss, nicole.sator,|
     30 %\urldef{\mailsc}\path|erika.siebert-cole, peter.strasser, lncs}@springer.com|    
     31 \newcommand{\keywords}[1]{\par\addvspace\baselineskip
     32 \noindent\keywordname\enspace\ignorespaces#1}
     33 
     34 \begin{document}
     35 
     36 \mainmatter  % start of an individual contribution
     37 
     38 % first the title is needed
     39 \title{Distributed data processing on microcomputers with Ascheduler and Apache Spark}
     40 
     41 % a short form should be given in case it is too long for the running head
     42 \titlerunning{Distributed data processing on microcomputers}
     43 
     44 % the name(s) of the author(s) follow(s) next
     45 %
     46 % NB: Chinese authors should write their first names(s) in front of
     47 % their surnames. This ensures that the names appear correctly in
     48 % the running heads and the author index.
     49 %
     50 \author{Vladimir Korkhov\inst{1} \and Ivan Gankevich\inst{1} \and Oleg Iakushkin\inst{1} \and Dmitry Gushchanskiy\inst{1} \and Dmitry Khmel\inst{1} \and Andrey Ivashchenko\inst{1}
     51  \and Alexander Pyayt\inst{2} \and Sergey Zobnin\inst{2} \and Alexander Loginov\inst{2}}
     52 
\authorrunning{V. Korkhov et al.}
     54 % (feature abused for this document to repeat the title also on left hand pages)
     55 
     56 % the affiliations are given next; don't give your e-mail address
     57 % unless you accept that it will be published
     58 \institute{Saint Petersburg State University,\\
     59 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia\\
     60 \email{v.korkhov@spbu.ru}
     61 \and
     62 Siemens LLC, St. Petersburg, Russia
     63 %\mailsa\\
     64 %\mailsb\\
     65 %\mailsc\\
     66 %\url{http://www.springer.com/lncs}
     67 }
     68 
     69 %
     70 % NB: a more complex sample for affiliations and the mapping to the
     71 % corresponding authors can be found in the file "llncs.dem"
     72 % (search for the string "\mainmatter" where a contribution starts).
     73 % "llncs.dem" accompanies the document class "llncs.cls".
     74 %
     75 
     76 %\toctitle{Lecture Notes in Computer Science}
     77 %\tocauthor{Authors' Instructions}
     78 \maketitle
     79 
     80 
     81 \begin{abstract}
     82 
Modern architectures for data acquisition and processing often rely on low-cost and low-power devices that can be bound together to form a distributed infrastructure. In this paper we examine ways to organize a distributed computing testbed based on microcomputers similar to the Raspberry Pi and Intel Edison. The goal of the research is to investigate and develop a scheduler for orchestrating distributed data processing and general-purpose computations on such unreliable and resource-constrained hardware. We also consider integration of the scheduler with the well-known distributed data processing framework Apache Spark. We outline the project carried out in collaboration with Siemens LLC to compare different configurations of the hardware and software deployment and to evaluate the performance and applicability of the tools on the testbed.
     84 
     85 \keywords{microcomputers, scheduling, Apache Spark, Raspberry Pi, fault tolerance, high availability}
     86 \end{abstract}
     87 
     88 
     89 \section{Introduction}
     90 
The problem of building distributed computing infrastructures for data collection and processing has been around for many years. One of the well-known technologies for building large-scale computing infrastructures is grid computing, which provides means to connect heterogeneous, dynamic resources into a single metacomputer. However, being focused on high-performance computing systems, grid technologies do not suit other classes of hardware well. One such class is low-performance, low-cost, unreliable microcomputers similar to the Raspberry Pi or Intel Edison, sometimes also called System-on-Chip (SoC) devices. Executing distributed applications over a set of such devices requires extensive fault-tolerance support along with a low resource usage profile of the middleware.
     92 
In this paper we discuss an approach to orchestrating distributed computing and data processing on microcomputers with the help of a custom scheduler focused on fault tolerance and dynamic rescheduling of the computational kernels that represent the application. This scheduler, named Ascheduler, provides its own low-level API to create and manage computational kernels. Currently the Ascheduler is a closed-source project built on the ideas and approaches presented in~\cite{gankevich2015subordination,gankevich2016factory,gankevich2016nonstop}.
     94 
In addition, the scheduler has been integrated into the Apache Spark~\cite{spark} data processing framework as a replacement for Spark's default scheduler. This makes it possible to run a wide range of existing Spark-based programs on the underlying microcomputer infrastructure controlled by the Ascheduler.
     96 
     97 The project aimed to solve the following main tasks:
     98 \begin{itemize}
    \item Develop automatic failover and high-availability mechanisms for the computer system.
    \item Develop an automatic elasticity mechanism for the computer system.
    \item Enable adjusting the precision of the application algorithm taking into account the current number of healthy cluster nodes.
    \item Adjust load distribution taking into account up-to-date and heterogeneous monitoring data from cluster nodes.
    \item Adjust micro-kernel execution order to minimise the peak memory footprint of cluster nodes.
    104 \end{itemize}
    105 
The task of data processing on resource-constrained and unreliable hardware arises in the context of real-time processing of sensor data close to its source. A system that carries out the processing in the field makes it possible to react quickly to sudden changes in sensor readings and to reduce decision-making time. Support for general-purpose computations in such a system allows the same hardware and software to be used for diverse high-tech equipment.
    107 
    108 The paper is organised as follows: Section 2 presents an overview of related work on using microcomputers for building distributed data processing systems with Hadoop and Spark; Section 3 presents the architecture of our solution; Section 4 explains how Ascheduler is integrated with Apache Spark; Section 5 presents experimental evaluation; Section 6 discusses the results and Section 7 concludes the paper.
    109 
    110 \section{Related work}
    111 
    112 There are a number of publications which report on successful deployments of Hadoop and Spark on various resource-constrained platforms:
    113 \begin{itemize}
    114     \item Hadoop on Raspberry Pi~\cite{cox2014iridis};
    115     \item Hadoop on Raspberry Pi~\cite{fox2015raspberry};
    116     \item Spark on Raspberry Pi~\cite{hajji2016understanding};
    117     \item Spark on Cubieboard~\cite{kaewkasi2014study}.
    118 \end{itemize}
    119 These papers outline common problems and solutions when running Hadoop/Spark on resource-constrained systems. These are:
    120 \begin{itemize}
    \item a large memory footprint,
    \item a slow and resource-hungry Java VM,
    \item overheating.
    124 \end{itemize}
    125 
In practice, these works do not report any particular problems with Java on resource-constrained platforms, and all of them use the standard JRE. Nor do they report overheating or large memory footprint problems (although the Raspberry Pi, for example, has no cooler). However, all of these papers deal with system boards in laboratory or similar environments, where such problems rarely occur. Additionally, the authors run only simple tests to demonstrate that the system works; no production-grade application is studied and no large-scale performance tests are performed. Using Java and the standard JRE for scheduler development seems rational for simple workloads; large workloads, however, may require additional boards to cope with the memory footprint or to boost processing power.
    127 
    128 \section{Architecture}
    129 \subsection{Architecture overview}
    130 The core concepts and architecture used for the implementation of Ascheduler are described in detail in~\cite{gankevich2015subordination,gankevich2016factory,gankevich2016nonstop}. Here we summarise the most important aspects relevant to the current testbed implementation.
    131 
To solve the problem of fault tolerance of slave cluster nodes we use a simple restart: the task from the failed node is re-executed on a healthy one. To solve the problem of high availability of the master node we use replication: the minimum necessary amount of state is copied so that the task can be restarted on the backup node. When the master node fails, its role is delegated to the backup node, and task execution continues. When the backup node fails, the master node restarts the current stage of the task. The most important feature of the approach used in Ascheduler is that it ensures master node fault tolerance without any external controller (e.g.~ZooKeeper in the Hadoop/Spark ecosystem).
    133 
Cluster nodes are combined into a tree hierarchy that is used to uniquely determine the master, backup and slave node roles without conflicts~\cite{gankevich2016factory}.
    135 
Each node may perform any combination of roles at the same time, but cannot be both master and backup. The initial construction of the hierarchy is carried out automatically, and a node's position in the hierarchy is determined solely by the position of its IP address in the subnet.
    137 
    138 When any cluster node fails or a new one joins the cluster, the hierarchy is rebuilt automatically.
    139 
The elasticity of the computer system is provided by dividing each task into a large number of subtasks (called micro-kernels), between which hierarchical links are established. All micro-kernels are processed asynchronously, which makes it possible to distribute them over the cluster nodes and processor cores, balancing the load. Typically, the number of micro-kernels in a problem exceeds the total number of nodes/cores in the cluster, so the order of their processing can be optimised: to minimise memory footprint, to minimise power consumption by grouping all of the micro-kernels on a small number of nodes, or to maximise the speed of task execution by distributing micro-kernels across all nodes in the cluster. If the cluster capacity is not enough to handle the current data flow or volume, the micro-kernel pools on the cluster nodes overflow, and excess kernels may be transferred to a more powerful remote server or cluster. The amount of data that must be replicated to the backup node to ensure high availability equals the amount of RAM occupied by a kernel and can be controlled by the programmer.
    141 
    142 Figure~\ref{fig:overview} shows the schematic view of the system.
    143 
    144 %Project structure
    145 %WP1: Development of scheduler core and API
    146 %WP2: Study of communication mechanisms for loT devices and their networking behavior under computational load
    147 %WP3: Analysis of scheduler integration options with Apache Spark
    148 %WP4: Study of GPGPU computational capabilities
    149 %WP5: Development of system monitoring and visualisation tools
    150 
    151 \begin{figure}
    152 \centering
    153 \includegraphics[width=12cm]{fig/overview2.jpg}
\caption{Schematic view of the system.}
    155 \label{fig:overview}
    156 \end{figure}
    157 
    158 \subsection{Hardware}
    159 
    160 Microcomputers used in the testbed:
    161 \begin{itemize}
    162 \item Raspberry Pi 3 Model B (2 pieces)
    163 \item Raspberry Pi 1 
    164 \item Intel Edison
    165 \item Orange Pi (2 pieces)
    166 \end{itemize}
    167 
    168 \subsection{Scheduler core and API}
    169 
The Ascheduler has a layered architecture, as discussed in~\cite{gankevich2015subordination,gankevich2016factory,gankevich2016nonstop}:
    171 \begin{itemize}
    172     \item Physical layer. Consists of nodes and direct/routed network links.
    173     \item Daemon layer. Consists of daemon processes residing on cluster nodes and hierarchical (master/slave) links between them.
    174     \item Kernel layer. Consists of kernels and hierarchical (parent/child) links between them.
    175 \end{itemize}
    176 
Master and slave roles are dynamically assigned to daemon processes; any physical cluster node may become master or slave. Dynamic reassignment uses a leader election algorithm that does not require periodic broadcasting of messages: the role is derived from the node's IP address. A detailed explanation of the algorithm is provided in~\cite{gankevich2015subordination}.
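
As a minimal illustration of this idea (and only an illustration~--- the class and method names below are ours, not part of the Ascheduler code base), nodes can be ranked by the numeric values of their IPv4 addresses, with the lowest address acting as master and the next one as backup:

\begin{verbatim}
// Hypothetical illustration: rank cluster nodes by the numeric value of
// their IPv4 addresses; the lowest address acts as master, the next one
// as backup.  Class and method names are ours, not the Ascheduler API.
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.Comparator;

public class NodeRanking {

    // Convert an IPv4 address to a comparable unsigned 32-bit value.
    static long ipValue(InetAddress addr) {
        byte[] b = addr.getAddress();
        return ((b[0] & 0xFFL) << 24) | ((b[1] & 0xFFL) << 16)
             | ((b[2] & 0xFFL) << 8)  |  (b[3] & 0xFFL);
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress[] nodes = {
            InetAddress.getByName("192.168.1.12"),
            InetAddress.getByName("192.168.1.5"),
            InetAddress.getByName("192.168.1.7"),
        };
        // Sorting by address yields the role assignment; when a node
        // fails or joins, re-sorting yields the new assignment.
        Arrays.sort(nodes, Comparator.comparingLong(NodeRanking::ipValue));
        System.out.println("master: " + nodes[0].getHostAddress());
        System.out.println("backup: " + nodes[1].getHostAddress());
        // The remaining nodes act as slaves.
    }
}
\end{verbatim}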
    178 
    179 Software implementation of Ascheduler consists of three main components (Fig.~\ref{fig:components}):
    180 \begin{itemize}
    181     \item Task scheduler core (which is used to compose distributed applications).
    182     \item Scheduler daemon based on the core.
    183     \item A driver which integrates scheduler into Apache Spark.
    184 \end{itemize}
    185 
    186 \begin{figure}
    187 \centering
    188 \includegraphics[width=8cm]{fig/ascheduler-components.png}
    189 \caption{Scheduler components.}
    190 \label{fig:components}
    191 \end{figure}
    192 
    193 
    194 \subsubsection{Task scheduler core.}
The core provides classes and methods that simplify development of distributed applications and middleware. The main focus of this package is to make a distributed application resilient to failures, i.e.~fault tolerant and highly available, and to do so transparently to the programmer.
    196 
    197 All classes are divided into two layers: the lower layer consists of classes for single node applications, and the upper layer consists of classes for applications that run on an arbitrary number of nodes. There are two kinds of tightly coupled entities in the package~--- \emph{kernels} and \emph{pipelines}~--- which are used together to compose a programme.
Kernels implement control flow logic in their \texttt{act} and \texttt{react} methods and store the state of the current control flow branch. Both the logic and the state are implemented by the programmer. In the \texttt{act} method some function is either computed sequentially or decomposed into subtasks (represented by another set of kernels) which are subsequently sent to a pipeline. In the \texttt{react} method subordinate kernels that return from the pipeline are processed by their parent. Calls to \texttt{act} and \texttt{react} are asynchronous and are made within threads spawned by a pipeline. For each kernel \texttt{act} is called only once, and for multiple kernels the calls are made in parallel to each other, whereas \texttt{react} is called once for each subordinate kernel, and all these calls are made in the same thread to prevent race conditions (for different parent kernels different threads may be used).
    199 
Pipelines implement the asynchronous calls to \texttt{act} and \texttt{react}, and try to make as many parallel calls as possible given the concurrency of the platform (the number of cores per node and the number of nodes in the cluster). A pipeline consists of a kernel pool, which contains all the subordinate kernels sent by their parents, and a thread pool that processes kernels in accordance with the rules outlined in the previous paragraph. A separate pipeline exists for each compute device: there are pipelines for parallel processing, for schedule-based processing (periodic and delayed tasks), and a proxy pipeline for processing kernels on other cluster nodes.
    201 
In principle, the kernel and pipeline machinery mirrors that of procedures and call stacks, with the advantage that kernel methods are called asynchronously and in parallel to each other. The stack, which ordinarily stores local variables, is modelled by the fields of a kernel. The sequence of processor instructions before nested procedure calls is modelled by the \texttt{act} method, and the sequence of instructions after the calls is modelled by the \texttt{react} method. The procedure calls themselves are modelled by constructing subordinate kernels and sending them to the pipeline. Two methods are necessary because the calls are asynchronous and one must wait until subordinate kernels complete their work. Pipelines make it possible to avoid active waiting and to call the correct kernel methods by analysing their internal state.
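
Since Ascheduler itself is closed-source, the following self-contained sketch illustrates the programming model with hypothetical stand-in classes: only the \texttt{act}/\texttt{react} structure follows the description above, while the class names, signatures and the thread-pool-based pipeline are invented for the illustration. The parent kernel decomposes a summation into two subordinate kernels in \texttt{act} and combines their results in \texttt{react}, mirroring a procedure call and its return.

\begin{verbatim}
// Self-contained, hypothetical sketch of the kernel/pipeline model.
// Only the act()/react() structure follows the text; Kernel, Pipeline
// and the helper classes are stand-ins, not the real Ascheduler API.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

abstract class Kernel {
    Kernel parent;
    abstract void act(Pipeline p);           // compute or decompose
    void react(Kernel child, Pipeline p) {}  // collect a finished child
}

class Pipeline {
    private final ExecutorService threads = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors());

    void send(Kernel k) {                    // asynchronous call to act()
        threads.submit(() -> {
            k.act(this);
            if (k.parent != null)            // deliver the child back to
                synchronized (k.parent) {    // its parent; calls for one
                    k.parent.react(k, this); // parent are serialised
                }
        });
    }

    void shutdown() throws InterruptedException {
        threads.shutdown();
        threads.awaitTermination(1, TimeUnit.MINUTES);
    }
}

class PartSum extends Kernel {               // subordinate kernel
    private final double[] data;
    private final int from, to;
    double sum;
    PartSum(Kernel parent, double[] data, int from, int to) {
        this.parent = parent; this.data = data;
        this.from = from; this.to = to;
    }
    @Override void act(Pipeline p) {
        for (int i = from; i < to; i++) sum += data[i];
    }
}

class SumKernel extends Kernel {             // parent kernel
    private final double[] data;
    private double result;                   // fields model the "stack"
    private int pending;
    SumKernel(double[] data) { this.data = data; }
    @Override void act(Pipeline p) {         // decompose into two children
        int mid = data.length / 2;
        pending = 2;
        p.send(new PartSum(this, data, 0, mid));
        p.send(new PartSum(this, data, mid, data.length));
    }
    @Override void react(Kernel child, Pipeline p) {
        result += ((PartSum) child).sum;     // called once per child
        if (--pending == 0)
            System.out.println("sum = " + result);
    }
}

public class KernelDemo {
    public static void main(String[] args) throws InterruptedException {
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        Pipeline p = new Pipeline();
        p.send(new SumKernel(data));
        Thread.sleep(500);                   // crude wait for the demo
        p.shutdown();
    }
}
\end{verbatim}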
    203 
    204 \subsubsection{Scheduler daemon.}
The purpose of the daemon is to accept tasks from the driver and launch applications in child processes to run these tasks. Each task is wrapped in a kernel, which is used to create a new child process. All subsequent tasks are sent to the newly created process via shared memory pages, and results are sent back via the same interface. The same protocol is used to exchange kernels between parent and child processes and between different cluster nodes. This allows the scheduler daemon to distribute kernels between cluster nodes without knowing the exact Java classes that implement the kernel interface.
    206 
The scheduler daemon is a thin layer on top of the core classes: it adds a set of configuration options, automatically discovers other daemons over the local area network, and launches a child process for each application to process tasks from the driver.
    208 
    209 \subsubsection{Apache Spark integration driver.}
The purpose of the driver is to send Apache Spark tasks to the scheduler daemon for execution. The driver connects to an instance of the scheduler daemon via its own protocol (the same protocol that is used to send kernels), wraps each task in a kernel and sends it to the daemon. The driver is implemented using the same set of core classes. This allows testing the driver without a scheduler (replacing integration tests with unit tests) as well as using the driver without a scheduler, i.e.~processing all kernels locally, on the same node where the Spark client runs.
    211 
    212 \subsubsection{Fault tolerance and high availability.}
The scheduler has fault tolerance and high availability built into its low-level core API. Every failed kernel is restarted on a healthy node or on its parent node; however, failures are detected only for kernels that are sent from one node to another (local kernels are not considered). High availability is provided by replicating the master kernel to a subordinate node. When any of the replicas fails, another one is used in its place. A detailed explanation of the failover algorithm is provided in~\cite{gankevich2016factory}.
    214 
    215 \subsubsection{Security.}
The scheduler driver is able to communicate with scheduler daemons in the local area network. Inter-daemon messaging is not encrypted or signed in any way, on the assumption that the local area network is secure. There is also no protection from Internet ``noise''. Submission of a task to a remote cluster can be done via an SSH (Secure Shell) connection or tunnel, which is the \textit{de facto} standard way of communication between Linux/UNIX servers. Thus, scheduler security is based on the assumption that it is deployed in a secure local area network. Every job is run under the same user, as there is no portable way to switch the process owner in Java.
    217 
    218 \subsection{Ascheduler integration with Spark}
Starting with version 2.0, custom schedulers can be integrated into Spark by implementing three interfaces. For a better understanding of Spark classes and their interconnections, please refer to Mastering Apache Spark 2.0~\cite{mastering-spark} and to the source code of the Spark classes available at \url{https://github.com/apache/spark}, as the code comments often contain useful information. A class diagram of all implemented Apache Spark interfaces as well as the wrapper classes is shown in Figure~\ref{fig:spark-int}.
    220 
    221 \begin{figure}
    222 \centering
    223 %\includegraphics[width=12cm]{fig/apache-spark-integration.png}
    224 \includegraphics[width=\linewidth]{fig/apache-spark-integration.eps}
\caption{Apache Spark integration.}
    226 \label{fig:spark-int}
    227 \end{figure}
    228 
    229 \subsection{Communication}
One of the aims of the project was to build a wireless microcomputer cluster. To create a Wi-Fi based ad hoc mesh network we chose a protocol with an existing driver and API: B.A.T.M.A.N. (Better Approach To Mobile Adhoc Networking)~\cite{BATMAN}. B.A.T.M.A.N. helps to organize and route wireless ad hoc networks that are unstructured, dynamically change their topology, and are based on an inherently unreliable medium. Additionally, B.A.T.M.A.N. provides means to collect knowledge about the network topology and about the state and quality of the links~--- this information is used by Ascheduler to make scheduling decisions aware of the physical network topology and links.
    231 
    232 
    233 \section{Creating Apache Spark applications for running with Ascheduler}
Apache Spark connects to Ascheduler via an implementation of its interfaces for custom schedulers. Ascheduler works with Spark version 2.0.2 only: since the integration required access to classes and interfaces considered private in Apache Spark, operation of Ascheduler with other versions of Spark is not guaranteed.
    235 
Ascheduler integration with Spark has been implemented in a way that allows Spark functionality to be used regardless of the choice of scheduler. If Spark is used with several schedulers, the user might want to choose the scheduling mode explicitly. This can be done by creating a \texttt{SparkContext} from a \texttt{SparkConf} on which the method \texttt{setMaster(masterURL)} has been invoked (see the sketch after the list below). Here \texttt{masterURL} selects a particular scheduler together with its parameters. For Ascheduler the string value \texttt{ascheduler} is used for the cluster mode and \texttt{ascheduler-local} for the local mode. The Spark driver for Ascheduler has additional \texttt{masterURL} options, because some hard-coded Spark limitations have to be bypassed:
    237 \begin{itemize}
    238     \item \texttt{local-ascheduler} for using cluster Ascheduler from Spark shell
    239     \item \texttt{local-ascheduler-local} for using local Ascheduler from Spark shell
    240     \item \texttt{local[O\_O]-ascheduler-local} for using Spark Streaming with Ascheduler in local version.
    241 \end{itemize}
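
The following minimal sketch shows a standalone Spark application that selects Ascheduler through its master URL; apart from the \texttt{ascheduler} master URL value mentioned above, the application is a generic word count written for this illustration and is not part of the project:

\begin{verbatim}
// Minimal sketch: a generic Spark word count that selects Ascheduler
// by its master URL.  Apart from the "ascheduler" value taken from the
// text above, nothing here is specific to the project.
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class AschedulerWordCount {
    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "input.txt";
        SparkConf conf = new SparkConf()
                .setAppName("ascheduler-word-count")
                .setMaster("ascheduler");        // or "ascheduler-local"
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile(path);
        long words = lines
                .flatMap(l -> Arrays.asList(l.split("\\s+")).iterator())
                .filter(w -> !w.isEmpty())
                .count();

        System.out.println("words: " + words);
        sc.stop();
    }
}
\end{verbatim}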
    242 
Spark programs running on Ascheduler were tested in both the local and the cluster versions. Spark with Ascheduler supports a wide range of standard operations and functions, such as:
    244 \begin{itemize}
    245     \item running both in Spark shell and as standalone applications;
    246     \item operating on Resilient Distributed Datasets (RDDs): mapping, reducing, grouping operations;
    \item partition-wise transformations on RDDs: controllable re-partitioning, shuffling, persisting RDDs, calling functions for partitions;
    \item multi-RDD operations: union, subtraction, zipping one RDD with another;
    \item broadcasting shared variables among executors;
    \item accumulators and task metrics based on them;
    \item Spark Streaming with restarting of nodes (master included) in case of failure.
    252 \end{itemize}
Operation of Spark packages other than Spark Streaming with Ascheduler is not guaranteed. With that exception, any Spark application is expected to work with Ascheduler as the task scheduling base.
    254 
    255 \section{Evaluation}
    256 
The application used for evaluation is an example of real-time micro-batch processing using Ascheduler and Apache Spark. The application consists of two entities: a periodic signal generator and its processor. The generator creates batches of values of a superposition of harmonic signals and sends them for processing via a network socket and for output via a WebSocket. The processor receives the batches from the raw socket, applies an adaptive Fast Fourier Transform (FFT) to the signal and sends the result to the output via a WebSocket. Both outputs are available on the system monitoring page.
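
A rough sketch of the generator side is given below; the frequencies, sample rate, batch size and port number are invented for the illustration and are not the parameters of the real application:

\begin{verbatim}
// Rough sketch of the generator: batches of samples of a superposition
// of harmonic signals are written to a TCP socket.  The frequencies,
// sample rate, batch size and port are invented for this illustration.
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

public class SignalGenerator {
    public static void main(String[] args) throws IOException {
        final double sampleRate = 25_000.0;           // samples per second
        final double[] freqs = {50.0, 440.0, 1000.0}; // harmonic components
        final int batchSize = 1024;

        try (Socket socket = new Socket("localhost", 9999);
             DataOutputStream out =
                 new DataOutputStream(socket.getOutputStream())) {
            long n = 0;
            while (true) {                            // runs until killed
                for (int i = 0; i < batchSize; i++, n++) {
                    double t = n / sampleRate;
                    double sample = 0.0;
                    for (double f : freqs)
                        sample += Math.sin(2.0 * Math.PI * f * t);
                    out.writeDouble(sample);          // one sample
                }
                out.flush();                          // send the batch
            }
        }
    }
}
\end{verbatim}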
    258 
In this experiment we benchmark two implementations of the FFT demo application on two platforms using two schedulers (Fig.~\ref{fig:comparison}). The first implementation is based on the Spark Streaming API, the second on the Ascheduler API. The first platform (left column) is Intel Edison, the second (right column) is a commodity Intel Core i5. The first scheduler is Spark Standalone in local mode, the second is Ascheduler in local mode. Cluster versions are not benchmarked in this experiment. In each run the demo application computes the spectrum of a 25~kHz signal in real time for 5 minutes. The time of each spectrum computation is recorded as a point on a graph. Since the demo application automatically downsamples the input signal when processing is slow, we measure overall throughput by dividing the number of processed points by the time taken to process them. The results are presented in Fig.~\ref{fig:comparison} for each run and summarised in Table~\ref{fig:comparison-table}.
    260 
    261 \begin{figure}
    262 \centering
    263 %\includegraphics[width=11cm]{fig/plot-bench-all.png}
    264 \includegraphics{fig/plot-bench-all.pdf}
    265 \caption{Comparing performance of Ascheduler and Spark schedulers.}
    266 \label{fig:comparison}
    267 \end{figure}
    268 
    269 \begin{table}
    270 \centering
    271 \begin{tabular}{lllr}
    272     \toprule
    273     Platform & Scheduler & API & Average throughput, points/s \\
    274     \midrule
    275     Intel Edison & Spark & Spark & \numprint{375} \\
    276     Intel Edison & Ascheduler & Spark & \numprint{995} \\
    277     Intel Edison & Ascheduler & Ascheduler & \numprint{517676} \\
    278     Intel Core i5-4200H & Spark & Spark & \numprint{487594} \\
    279     Intel Core i5-4200H & Ascheduler & Spark & \numprint{511618} \\
    280     Intel Core i5-4200H & Ascheduler & Ascheduler & \numprint{5046540} \\
    281     \bottomrule
    282 \end{tabular}
    283 %\includegraphics[width=11cm]{fig/comparison-table.png}
    284 \caption{Comparing performance of Ascheduler and Spark schedulers.}
    285 \label{fig:comparison-table}
    286 \end{table}
    287 
    288 
    289 \section{Discussion}
The graphs show that the Spark API is incapable of processing a 25~kHz input signal on the Intel Edison platform. Ascheduler outperforms Spark Standalone by roughly a factor of three on Intel Edison, but still more performance is needed to process a 25~kHz signal. Direct use of the Ascheduler API on Intel Edison finally solves the problem, allowing an input signal of about 500~kHz to be processed.
On the commodity Intel Core i5 platform there is no significant difference between the performance of the Spark Standalone scheduler and Ascheduler when the Spark API is used; however, direct use of the Ascheduler API gives a tenfold increase in performance: it is capable of processing an input signal of about 5~MHz (see Table~\ref{fig:comparison-table}).
    292 
    293 \section{Conclusions}
    294 
    295 The following was achieved as the final outcomes of the project:
    296 \begin{itemize}
\item Ascheduler~--- a fault-tolerant scheduler implemented in Java, running standalone or with Apache Spark (Spark Streaming supported).
    298 \item Master-node fault tolerance is supported by Ascheduler.
    299 \item Dynamic resource discovery, composition and re-configuration of distributed cluster.
    300 \item Optimised for running on unreliable and resource-constrained microcomputer hardware.
    301 \item Running in heterogeneous and dynamic hardware and networking environment.
    302 \item Integrated microcomputer and cluster monitoring API.
    303 \item Transparent monitoring and visualization with web-based UI.
    304 \item Distributed FFT application (with GPGPU support if available) with streaming input and dynamic graphical output.
    305 \end{itemize}
    306 
    307 \section*{Acknowledgments}
    308 
    309 The research was supported by Siemens LLC.
    310 
    311 
    312 \bibliographystyle{splncs03}
    313 \bibliography{refs.bib}
    314 
    315 % \begin{thebibliography}{4}
    316 
    317 % \bibitem{gankevich2015subordination} I. Gankevich, Y. Tipikin, and V. Gaiduchok. Subordination: Cluster management without distributed consensus. In International Conference on High Performance Computing and Simulation (HPCS 2015), pages 639–642. IEEE, 2015.
    318 
    319 % \bibitem{gankevich2016factory} I. Gankevich, Y. Tipikin, V. Korkhov, V. Gaiduchok, A. Degtyarev, A. Bogdanov. Factory: Master node high-availability for big data applications and beyond. Lecture Notes in Computer Science, vol. 9787, pp. 379–389. Springer, 2016.
    320 
    321 % \bibitem{gankevich2016nonstop} Gankevich, I., Tipikin, Y., Korkhov, V., Gaiduchok, V. Factory: Non-stop batch jobs without checkpointing (2016) 2016 International Conference on High Performance Computing and Simulation, HPCS 2016, art. no. 7568441, pp. 979-984. DOI: 10.1109/HPCSim.2016.7568441
    322 
    323 % \bibitem{cox2014iridis} Simon J Cox, James T Cox, Richard P Boardman, Steven J Johnston, Mark Scott, and Neil S O’brien. Iridis-pi: a low-cost, compact demonstration cluster. Cluster Computing, 17(2):349–358, 2014.
    324 
    325 % \bibitem{fox2015raspberry} Kenneth Fox, William M Mongan, and Jeffrey Popyack. Raspberry hadoopi: a low-cost, hands-on laboratory in big data and analytics. In SIGCSE, page 687, 2015.
    326 
    327 % \bibitem{hajji2016understanding} Wajdi Hajji and Fung Po Tso. Understanding the performance of low power raspberry pi cloud for big data. Electronics, 5(2):29, 2016.
    328 
    329 % \bibitem{kaewkasi2014study} Chanwit Kaewkasi and Wichai Srisuruk. A study of big data processing constraints on a low-power Hadoop cluster. In Computer Science and Engineering Conference (ICSEC), 2014 International, pages 267–272. IEEE, 2014.
    330 
    331 % \bibitem{spark} Apache Spark official website. URL: http://spark.apache.org/
    332 
    333 % \bibitem{mastering-spark} Mastering Apache Spark 2.0, URL: https://www.gitbook.com/book/jaceklaskowski/mastering-apache-spark/details
    334 
    335 % \bibitem{BATMAN} B.A.T.M.A.N. official web page. URL: https://www.open-mesh.org/projects/open-mesh/wiki
    336 
    337 % \end{thebibliography}
    338 
    339 \end{document}