commit 31d4e557add27b9fc2dbdf74691791c75f26da13
parent 7b76b41395bd7ecbe9df6601da540aa93b4639c0
Author: Ivan Gankevich <igankevich@ya.ru>
Date: Fri, 4 Aug 2017 14:09:53 +0300
Discuss LH model performance.
Diffstat:
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/arma-thesis.org b/arma-thesis.org
@@ -1586,9 +1586,9 @@ Generic form of YW equations is
0, \quad \text{if } \vec{k}\neq0,
\end{cases}
\end{equation}
-where \(\gamma\)\nbsp{}--- ACF of process \(\zeta\), \(\Var{\epsilon}\)\nbsp{}--- white noise
-variance. Matrix form of three-dimensional YW equations, which is used in the
-present work, is
+where \(\gamma\)\nbsp{}--- ACF of process \(\zeta\),
+\(\Var{\epsilon}\)\nbsp{}--- white noise variance. Matrix form of
+three-dimensional YW equations, which is used in the present work, is
\begin{equation*}
\Gamma
\left[
@@ -3562,6 +3562,18 @@ arma.print_openmp_vs_opencl(model_names, row_names)
| Compute velocity potentials | 0.02 | 0.02 | 0.02 | 0.01 | 0.01 |
#+END_SRC
+In contrast to AR model, LH model exhibits the best performance on GPU and the
+worst performance on GPU. The reasons for that are
+- the large number of transcendental functions in its formula which help offset
+ high memory latency,
+- linear memory access pattern which help vectorise calculations and coalesce
+ memory accesses by different hardware threads,
+- and no information dependencies between output grid points.
+Despite the fact that GPU on the test platform is more performant than CPU (in
+terms of floating point operations per second), the overall performance of LH
+model compared to AR model is lower. The reason for that is higher number of
+coefficients needed for LH model to discretise spectrum and eliminate
+periodicity from the realisation.
**** Performance of load balancing algorithm.