Describe results of the first experiment.

commit 6cbd9920b3f6a0c112e99881932f818fa7ddb19e
parent f0c69d7316432e0e71f5e9547ea280a952234c7a
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Tue, 21 Mar 2017 18:14:18 +0300

Describe results of the first experiment.

Diffstat:
src/body.tex  | 14 ++++++++++++++

1 file changed, 14 insertions(+), 0 deletions(-)
diff --git a/src/body.tex b/src/body.tex
@@ -227,6 +227,7 @@ configuration is presented in Table~\ref{tab:platform-configuration}.
       HDD & ST3250310NS, 7200rpm \\
       No. of nodes & 12 \\
       No. of CPU cores per node & 8 \\
+      Interconnect & 100Mbit Ethernet \\
       \bottomrule
   \end{tabular}
 \end{table}
@@ -252,6 +253,19 @@ of the application to a large number of nodes.
 
 \section{Results}
 
+The first experiment showed that in terms of performance there are three
+possible outcomes when all nodes except one fail. The first case is failure of
+all kernels except the principal and its first subordinate. There is no
+communication with other nodes to find the survivor, so it takes the least time
+to recover from the failure. The second case is failure of all kernels except
+any subordinate kernel other than the first one. Here the survivor tries to
+communicate with all subordinates that were created before the survivor, so the
+overhead of recovery is larger. The third case is failure of all kernels except
+the last subordinate. Here performance is different only in the test
+environment, because this is the node where output data and logs are gathered.
+So, the overhead is smaller, because there is no communication over the network
+for storing output.
+
 \begin{figure}
 	\centering
 	\includegraphics{test-1}
