git clone https://git.igankevich.com/hpcs-17-subord.git

commit 270e5e051414e1bc8572631ef94158bb4b4a1d17
parent 19eaea90b6231a99d5039643d082a66aa8900864
Author: Ivan Gankevich <igankevich@ya.ru>
Date:   Wed, 22 Mar 2017 12:22:57 +0300

Describe the second experiment results.

src/body.tex | 17+++++++++++++++++
1 file changed, 17 insertions(+), 0 deletions(-)

diff --git a/src/body.tex b/src/body.tex
@@ -288,6 +288,23 @@
 to recover, and failed kernel is executed on one of the remaining nodes.
 
 % TODO insert virtual version of the first experiment
+The second experiment showed that the overhead of the multiple node failure
+handling code increases exponentially with the number of nodes
+(Figure~\ref{fig:test-3-virt-ndebug-226}); however, its absolute value is
+small compared to the programme run time. The growth in overhead is
+attributed to the linear search algorithms used, each time a subordinate
+kernel is sent to or received from another node, to maintain a list of its
+neighbours. When a subordinate kernel is sent to a remote node, the IP
+addresses of all of its left neighbours are added to the neighbour list
+without duplication, and the kernel itself is appended to the global
+internal map which stores principal kernels and their subordinates; when a
+subordinate kernel returns from the remote node, it is removed from the
+list of its principal's subordinates (retrieved from the internal map),
+which also requires a linear search. Performing a linear search for each
+subordinate kernel makes the overall complexity quadratic. This can be
+avoided by replacing lists with sets or maps, but the total overhead is
+small, so we deemed this optimisation an unnecessary complication of the
+source code.
+
 % TODO test-3-phys
 \begin{figure}
 \centering