# waves-16-arma

Simulation of Standing and Propagating Sea Waves with Three-Dimensional ARMA Model
git clone https://git.igankevich.com/waves-16-arma.git
Log | Files | Refs

arma.org (73047B)

      1 # Local Variables:
2 # org-ref-default-bibliography ("refs.bib")
3 # org-latex-image-default-width nil
4 # org-latex-caption-above nil
5 # org-latex-hyperref-template "\\hypersetup{\n pdfauthor={%a},\n pdftitle={%t},\n pdfkeywords={%k},\n pdfsubject={%d},\n pdfcreator={%c},\n pdflang={%L},\n unicode={true}\n}\n\\setdefaultlanguage{%l}\n"
6 # org-export-latex-tables-hline "\\midrule"
7 # org-export-latex-tables-tstart "\\toprule"
8 # org-export-latex-tables-tend "\\bottomrule"
9 # End:
10
11 #+TITLE: Simulation of standing and propagating sea waves with three-dimensional ARMA model
12 #+AUTHOR: Ivan Gankevich, Alexander Degtyarev
13 #+LANGUAGE: en
14 #+LATEX_CLASS: svmult
15 #+LATEX_CLASS_OPTIONS: [graybox]
17 #+LATEX_HEADER_EXTRA: \titlerunning{Simulation of sea waves with ARMA model}
18 #+LATEX_HEADER_EXTRA: \authorrunning{I.\:Gankevich, A.\:Degtyarev}
19 #+OPTIONS: H:5 todo:nil toc:nil
20 #+PROPERTY: header-args:R :results graphics :exports results :family Times
21
22 #+begin_export latex
23 \abstract*{%
24 Simulation of sea waves is a problem appearing in the framework of developing
25 software-based ship motion modelling applications. These applications generally
26 use linear wave theory to generate small-amplitude waves programmatically and
27 determine impact of external excitations on a ship hull. Using linear wave
28 theory is feasible for ocean waves, but is not accurate for shallow-water and
29 storm waves. To cope with these shortcomings we introduce autoregressive
30 moving-average (ARMA) model, which is widely known in oceanography, but rarely
31 used for sea wave modelling. The new model allows to generate waves of arbitrary
32 amplitudes, is accurate for both shallow and deep water, and its software
33 implementation shows superior performance by relying on fast Fourier transform
34 family of algorithms. Integral characteristics of wavy surface produced by ARMA
35 model are verified against the ones of real sea surface. Despite all its
36 advantages, ARMA model requires a new method to determine wave pressures, an
37 instance of which is included in the chapter.%
38 }
39 #+end_export
40
41 * Introduction
42 Studying behaviour of a ship at sea is often based on some model of external
43 excitations\nbsp{}--- any disturbance that displaces the vessel from
44 equilibrium\nbsp{}--- major component of which is wind waves. Currently, the
45 most popular sea wave simulation models are based on the linear expansion of a
46 stochastic moving surface as a system of independent random variables. Such
47 models were studied by St. Denis & Pearson\nbsp{}cite:st1953motions,
48 Rosenblatt\nbsp{}cite:rosenblatt1956random,
49 Sveshnikov\nbsp{}cite:sveshnikov1959determination, and
50 Longuet---Higgins\nbsp{}cite:longuet1957statistical. The most popular model is
51 that of Longuet---Higgins (LH), which approximates propagating sea waves as a
52 superposition of elementary harmonic waves with random phases $$\epsilon_n$$ and
53 random amplitudes $$c_n$$:
54
55   \label{eq-lhmodel}
56   \zeta(x,y,t) = \sum\limits_n c_n \cos(u_n x + v_n y - \omega_n t + \epsilon_n),
57
58 where the wave number $$(u_n,v_n)$$ is continuously distributed on the $$(u,v)$$
59 plane, i.e. the unit area $$du \times dv$$ contains an infinite number of wave
60 numbers. The frequency $$\omega_n$$ associated with wave numbers $$(u_n,v_n)$$
61 is given by a dispersion relation
62 \begin{equation*}
63   \omega_n = \omega(u_n,v_n).
64 \end{equation*}
65 The phase $$\epsilon_n$$ are jointly independent random variables uniformly
66 distributed in the interval $$[0,2\pi]$$.
67
68 Longuet---Higgins showed that under the above conditions, the function
69 $$\zeta(x,y,t)$$ is a three-dimensional steady-state homogeneous ergodic
70 Gaussian field, defined by
71 \begin{equation*}
72   2 E_{\zeta}(u,v) du dv = \sum\limits_{n} c_n^2,
73 \end{equation*}
74 where $$E_{\zeta}(u,v)$$ is two-dimensional spectral density of wave energy.
75
76 Formula\nbsp{}eqref:eq-lhmodel is derived from equation of continuity and
77 equation of motion for incompressible inviscid fluid. For ocean waves
78 incompressibility and isotropy of a fluid is assumed; since the motion of ocean
79 waves is due to gravitational forces, irrotational motion of the fluid is
80 assumed which let us introduce the velocity potential $$\phi$$. Under these
81 assumptions the equation of continuity reduces to Laplace equation:
82 \begin{equation*}
83   \label{eq-continuity}
84   \Delta\phi =
85   \frac{\partial^2{\phi_x}}{\partial{x^2}} +
86   \frac{\partial^2{\phi_y}}{\partial{y^2}} +
87   \frac{\partial^2{\phi_z}}{\partial{z^2}} = 0.
88 \end{equation*}
89 The Laplace equation is linear and its solution can be found using Fourier
90 transforms. Thus, for plane waves a well-known solution is given in the form of
91 a definite integral\nbsp{}cite:kochin1966theoretical:
92 \begin{equation*}
93   \phi(x,z,t) =
94   \int\limits_{0}^{\infty}
95   e^{kz} \left[ A(k,t) \cos{kx} + B(k,t) \sin{kx} \right] dk.
96 \end{equation*}
97
98 A similar, but slightly more complicated solution is obtained for the
99 three-dimensional case. The constants $$A$$ and $$B$$ are determined from the boundary
100 conditions on the surface. In the linear formulation the equation of the wave
101 profile (which is derived from linearised kinematic boundary condition and
102 equation of motion, see sec.\nbsp{}[[#sec-pressures]]) is
103 \begin{align}
104   \zeta(x,t) &= -\frac{1}{g} \frac{\partial{\phi(x,0,t)}}{\partial{t}} \label{eq-integ-zeta}\\
105              &= \int\limits_{0}^{\infty}
106                 \left[ \frac{\partial{A(k,t)}}{\partial{t}} \cos{kx} + \frac{\partial{B(k,t)}}{\partial{t}} \sin{kx} \right] dk \nonumber \\
107              &= \int\limits_{0}^{\infty} C_t(k,t) \cos\left(kx + \epsilon(k,t)\right).\nonumber
108 \end{align}
109 If we set $$c_n = C_t(k_n, t) dk$$, then wave model\nbsp{}eqref:eq-lhmodel may
110 be associated with an approximation of integral\nbsp{}eqref:eq-integ-zeta.
111
112 Although, LH model is based on simple linear wave theory and has straightforward
113 computational algorithm, it has some serious shortcomings.
114 - LH model is designed to represent a stationary Gaussian field. Normal
115   distribution of the simulated process\nbsp{}eqref:eq-lhmodel is a consequence
116   of the central limit theorem: its application to the analysis of
117   storm or shallow water waves represents a significant challenge.
118 - LH model is periodic and need a large set of frequencies to perform long-term
119   simulation.
120 - In the numerical implementation of the LH model, it appears that convergence
121   rate of\nbsp{}eqref:eq-lhmodel is slow. This leads to a skewed simulated wave
122   energy spectrum and skewed cumulative distribution functions of various wave
123   parameters (heights, lengths, etc.). This problem becomes especially
124   significant when simulating complex sea waves that have a wide spectrum with
125   multiple peaks.
126
127 The latter point becomes particularly critical in long-term numerical
128 simulation. In a time domain computation of the responses of a vessel in a
129 random seaway, the repeated evaluation of the apparently simple
130 eq.\nbsp{}eqref:eq-lhmodel at hundreds of points on the hull for thousands of
131 time steps becomes a major factor determining the execution speed of the
132 code\nbsp{}cite:beck2001modern. So, finding a less computationally intensive
133 method for modelling ocean waves has the potential to increase performance of
134 long-term simulation.
135
136 * Related work
137 ** Ocean wave modelling
138 Another approach to simulating sea waves involves representing stochastic moving
139 surface as a linear transformation of white noise with memory, which allows to
140 model stationary ergodic Gaussian random process with given correlation
141 characteristics\nbsp{}cite:box1976time. The first attempts to model
142 two-dimensional disturbances were undertaken
143 in\nbsp{}cite:kostecki1972stochastic, which resulted in the development of the
144 resonance theory of wind waves, and the formal mathematical framework was
145 developed in\nbsp{}cite:rozhkov1990,gurgenidze1988 --- the authors built a
146 one-dimensional model of ocean waves based on autoregressive-moving
147 average (ARMA) model.
148
149 One-dimensional ARMA model does not have some of the LH model deficiencies: it
150 is both computationally efficient and requires less number of coefficients to
151 converge. In\nbsp{}cite:spanos1982arma ARMA model is used to generate time
152 series spectrum of which is compatible with Pierson---Moskowitz (PM)
153 approximation of ocean wave spectrum. The authors carry out experiments for
154 one-dimensional AR, MA and ARMA models. They mention excellent agreement between
155 target and initial spectra and higher performance of ARMA model compared to
156 models based on summing large number of harmonic components with random phases.
157 They also mention that in order to reach agreement between target and initial
158 spectrum MA model require lesser number of coefficients than AR model.
159 In\nbsp{}cite:spanos1996efficient the authors generalise ARMA model coefficients
160 determination formulae for multi-variate case.
161
162 AR model was successfully applied to predict evolution of propagating wave
163 profiles based on instantaneous wave recordings. In\nbsp{}cite:fusco2010short AR
164 model is used to predict swell waves to control wave-energy converters (WEC) in
165 real-time. In order to make WEC more efficient its internal oscillator frequency
166 should match the one of ocean waves. The authors treat wave elevation as time
167 series and compare performance of AR model, neural networks and cyclical models
168 in forecasting time series future values. AR model gives the most accurate
169 prediction of low-frequency swell waves for up to two typical wave periods. It
170 is an example of successful application of AR process to ocean wave modelling.
171
172 The feature that distinguishes present work with respect to afore-mentioned ones
173 is the study of three-dimensional (2D in space and 1D in time) ARMA model, which
174 is mostly a different problem.
175 1. Yule---Walker system of equations, which are used to determine AR
176    coefficients, has complex block-block structure.
177 2. Optimal model order (in a sense that target spectrum agrees with initial) is
178    determined manually.
179 3. Instead of PM spectrum, analytic formulae for standing and propagating
180    waves ACF are used as the model input.
181 4. Three-dimensional wavy surface should be compatible with real ocean surface
182    not only in terms of spectral characteristics, but also in the shape of wave
183    profiles. So, model verification includes distributions of various parameters
184    of generated waves (lengths, heights, periods etc.).
185 Multi-dimensionality of investigated model not only complexifies the task, but
186 also allows to carry out visual validation of generated wavy surface. It is the
187 opportunity to visualise output of the programme that allowed to ensure that
188 generated surface is compatible with real ocean surface, and is not abstract
189 multi-dimensional stochastic process that is real only statistically.
190
191 ** Pressure field determination formulae
192 **** Small amplitude waves theory
193 In\nbsp{}cite:stab2012,degtyarev1998modelling,degtyarev1997analysis the authors
194 propose a solution for inverse problem of hydrodynamics of potential flow within
195 the framework of small-amplitude wave theory (under assumption that wave length
196 is much larger than height: $$\lambda{}\gg{}h$$). In that case inverse problem
197 is linear and reduces to Laplace equation with mixed boundary conditions, and
198 equation of motion is solely used to determine pressures for calculated velocity
199 potential derivatives. The assumption of small amplitudes means the slow decay
200 of wind wave coherence function, i.e.\nbsp{}small change of local wave number in
201 time and space compared to the wavy surface elevation ($$z$$ coordinate). This
202 assumption allows to calculate elevation $$z$$ derivative as $$\zeta_z=k\zeta$$,
203 where $$k$$ is wave number. In two-dimensional case the solution is written
204 explicitly as
205 \begin{align}
206     \left.\frac{\partial\phi}{\partial x}\right|_{x,t}= &
207         -\frac{1}{\sqrt{1+\alpha^{2}}}e^{-I(x)}
208             \int\limits_{0}^x\frac{\partial\dot{\zeta}/\partial
209                 z+\alpha\dot{\alpha}}{\sqrt{1+\alpha^{2}}}e^{I(x)}dx,\label{eq-old-sol-2d}\\
210     I(x)= & \int\limits_{0}^x\frac{\partial\alpha/\partial z}{1+\alpha^{2}}dx,\nonumber
211 \end{align}
212 where $$\alpha$$ is wave slope. In three-dimensional case solution is written in
213 the form of elliptic partial differential equation (PDE):
214 \begin{align*}
215     & \frac{\partial^2 \phi}{\partial x^2} \left( 1 + \alpha_x^2 \right) +
216     \frac{\partial^2 \phi}{\partial y^2} \left( 1 + \alpha_y^2 \right) +
217     2\alpha_x\alpha_y \frac{\partial^2 \phi}{\partial x \partial y} + \\
218     & \left(
219         \frac{\partial \alpha_x}{\partial z} +
220         \alpha_x \frac{\partial \alpha_x}{\partial x} +
221         \alpha_y \frac{\partial \alpha_x}{\partial y}
222     \right) \frac{\partial \phi}{\partial x} + \\
223     & \left(
224         \frac{\partial \alpha_y}{\partial z} +
225         \alpha_x \frac{\partial \alpha_y}{\partial x} +
226         \alpha_y \frac{\partial \alpha_y}{\partial y}
227     \right) \frac{\partial \phi}{\partial y} + \\
228     & \frac{\partial \dot{\zeta}}{\partial z} +
229     \alpha_x \dot{\alpha_x} + \alpha_y \dot{\alpha_y} = 0.
230 \end{align*}
231 The authors suggest transforming this equation to finite differences and solve
232 it numerically.
233
234 As will be shown in sec.\nbsp{}[[#sec:compare-formulae]]
235 that\nbsp{}eqref:eq-old-sol-2d diverges when attempted to calculate velocity
236 field for large amplitude waves, and this is the reason that it can not be used
237 together with ARMA model, that generates arbitrary amplitude waves.
238
239 **** Linearisation of boundary condition
240 :PROPERTIES:
241 :CUSTOM_ID: linearisation
242 :END:
243
244 LH model allows to derive an explicit formula for velocity field by linearising
245 kinematic boundary condition. Velocity potential formula is written as
246 \begin{equation*}
247 \phi(x,y,z,t) = \sum_n \frac{c_n g}{\omega_n}
248      e^{\sqrt{u_n^2+v_n^2} z}
249      \sin(u_n x + v_n y - \omega_n t + \epsilon_n).
250 \end{equation*}
251 This formula is differentiated to obtain velocity potential derivatives, which
252 are plugged to dynamic boundary condition to obtain pressures.
253
254 * Three-dimensional ARMA process as a sea wave simulation model
255
256 ARMA ocean simulation model defines wavy surface as three-dimensional (two
257 dimensions in space and one in time) autoregressive moving average process:
258 every surface point is represented as a weighted sum of previous in time and
259 space points plus weighted sum of previous in time and space normally
260 distributed random impulses. The governing equation for 3-D ARMA process is
261
262     \zeta_{\vec i}
263     =
264     \sum\limits_{\vec j = \vec 0}^{\vec N}
265     \Phi_{\vec j} \zeta_{\vec i - \vec j}
266     +
267     \sum\limits_{\vec j = \vec 0}^{\vec M}
268     \Theta_{\vec j} \epsilon_{\vec i - \vec j}
269     ,
270     \label{eq-arma-process}
271
272 where $$\zeta$$\nbsp{}--- wave elevation, $$\Phi$$\nbsp{}--- AR process
273 coefficients, $$\Theta$$\nbsp{}--- MA process coefficients,
274 $$\epsilon$$\nbsp{}--- white noise with Gaussian distribution,
275 $$\vec{N}$$\nbsp{}--- AR process order, $$\vec{M}$$\nbsp{}--- MA process order,
276 and $$\Phi_{\vec{0}}\equiv{0}$$, $$\Theta_{\vec{0}}\equiv{0}$$. Here arrows
277 denote multi-component indices with a component for each dimension. In general,
278 any scalar quantity can be a component (temperature, salinity, concentration of
279 some substance in water etc.). Equation parameters are AR and MA process
280 coefficients and order.
281
282 Any ARMA process can be uniquely represented as either MA or AR process of
283 infinite order\nbsp{}cite:gurgenidze1988, and the parameters of the spectral
284 representation are defined by the rule of division of power series (in a
285 rational factorized form\nbsp{}cite:rozhkov1990):
286 \begin{equation*}
287   S(\omega) =
288   \frac{\Delta\sigma^2}{\pi}
289   \frac{\prod\limits_m (1 - z_m e^{-im\omega\Delta})(1 - z_m e^{im\omega\Delta})}
290        {\prod\limits_n (1 - p_n e^{-in\omega\Delta})(1 - p_n e^{in\omega\Delta})},
291 \end{equation*}
292 where $$z_m$$ and $$p_n$$ are the zeros of numerator (MA), and denominator (AR),
293 respectively, which form a pair of mutually conjugate numbers. If some of the
294 zeros are located near the unit circle, then the spectral density will have
295 pronounced dips.
296
297 ** Autoregressive (AR) process
298 AR process is ARMA process with only one random impulse instead of their
299 weighted sum:
300
301     \zeta_{\vec i}
302     =
303     \sum\limits_{\vec j = \vec 0}^{\vec N}
304     \Phi_{\vec j} \zeta_{\vec i - \vec j}
305     +
306     \epsilon_{i,j,k}
307     .
308     \label{eq-ar-process}
309
310 The coefficients $$\Phi$$ are calculated from auto-covariate function (ACF) via
311 three-dimensional Yule---Walker (YW) equations, which are obtained after
312 multiplying both parts of the previous equation by $$\zeta_{\vec{i}-\vec{k}}$$
313 and computing the expected value. Generic form of YW equations is
314
315     \label{eq-yule-walker}
316     \gamma_{\vec k}
317     =
318     \sum\limits_{\vec j = \vec 0}^{\vec N}
319     \Phi_{\vec j}
320     \text{ }\gamma_{\vec{k}-\vec{j}}
321     +
322     \Var{\epsilon} \delta_{\vec{k}},
324     \delta_{\vec{k}} =
325     \begin{cases}
326         1, \quad \text{if } \vec{k}=0 \\
327         0, \quad \text{if } \vec{k}\neq0,
328     \end{cases}
329
330 where $$\gamma$$\nbsp{}--- ACF of process $$\zeta$$, $$\Var{\epsilon}$$\nbsp{}--- white noise
331 variance. Matrix form of three-dimensional YW equations, which is used in the
332 present work, is
333 \begin{equation*}
334     \Gamma
335     \left[
336         \begin{array}{l}
337             \Phi_{\vec 0}\\
338             \Phi_{0,0,1}\\
339             \vdotswithin{\Phi_{\vec 0}}\\
340             \Phi_{\vec N}
341         \end{array}
342     \right]
343     =
344     \left[
345         \begin{array}{l}
346             \gamma_{0,0,0}-\Var{\epsilon}\\
347             \gamma_{0,0,1}\\
348             \vdotswithin{\gamma_{\vec 0}}\\
349             \gamma_{\vec N}
350         \end{array}
351     \right],
353     \Gamma=
354     \left[
355         \begin{array}{llll}
356             \Gamma_0 & \Gamma_1 & \cdots & \Gamma_{N_1} \\
357             \Gamma_1 & \Gamma_0 & \ddots & \vdotswithin{\Gamma_0} \\
358             \vdotswithin{\Gamma_0} & \ddots & \ddots & \Gamma_1 \\
359             \Gamma_{N_1} & \cdots & \Gamma_1 & \Gamma_0
360         \end{array}
361     \right],
362 \end{equation*}
363 where $$\vec N = \left( p_1, p_2, p_3 \right)$$ and
364 \begin{equation*}
365     \Gamma_i =
366     \left[
367     \begin{array}{llll}
368         \Gamma^0_i & \Gamma^1_i & \cdots & \Gamma^{N_2}_i \\
369         \Gamma^1_i & \Gamma^0_i & \ddots & \vdotswithin{\Gamma^0_i} \\
370         \vdotswithin{\Gamma^0_i} & \ddots & \ddots & \Gamma^1_i \\
371         \Gamma^{N_2}_i & \cdots & \Gamma^1_i & \Gamma^0_i
372     \end{array}
373     \right]
375     \Gamma_i^j=
376     \left[
377     \begin{array}{llll}
378         \gamma_{i,j,0} & \gamma_{i,j,1} & \cdots & \gamma_{i,j,N_3} \\
379         \gamma_{i,j,1} & \gamma_{i,j,0} & \ddots &x \vdotswithin{\gamma_{i,j,0}} \\
380         \vdotswithin{\gamma_{i,j,0}} & \ddots & \ddots & \gamma_{i,j,1} \\
381         \gamma_{i,j,N_3} & \cdots & \gamma_{i,j,1} & \gamma_{i,j,0}
382     \end{array}
383     \right],
384 \end{equation*}
385 Since $$\Phi_{\vec 0}\equiv0$$, the first row and column of $$\Gamma$$ can be
386 eliminated. Matrix $$\Gamma$$ is block-toeplitz, positive definite and symmetric,
387 hence the system is efficiently solved by Cholesky decomposition, which is
388 particularly suitable for these types of matrices.
389
390 After solving this system of equations white noise variance is estimated
391 from\nbsp{}eqref:eq-yule-walker by plugging $$\vec k = \vec 0$$:
392 \begin{equation*}
393     \Var{\epsilon} =
394     \Var{\zeta}
395     -
396     \sum\limits_{\vec j = \vec 0}^{\vec N}
397     \Phi_{\vec j}
398     \text{ }\gamma_{\vec{j}}.
399 \end{equation*}
400
401 ** Moving average (MA) process
402 MA process is ARMA process with $$\Phi\equiv0$$:
403
404     \zeta_{\vec i}
405     =
406     \sum\limits_{\vec j = \vec 0}^{\vec M}
407     \Theta_{\vec j} \epsilon_{\vec i - \vec j}
408     .
409     \label{eq-ma-process}
410
411 MA coefficients $$\Theta$$ are defined implicitly via the following non-linear
412 system of equations:
413 \begin{equation*}
414   \gamma_{\vec i} =
415   \left[
416     \displaystyle
417     \sum\limits_{\vec j = \vec i}^{\vec M}
418     \Theta_{\vec j}\Theta_{\vec j - \vec i}
419   \right]
420   \Var{\epsilon}.
421 \end{equation*}
422 The system is solved numerically by fixed-point iteration method via the
423 following formulae
424 \begin{equation*}
425   \Theta_{\vec i} =
426     -\frac{\gamma_{\vec 0}}{\Var{\epsilon}}
427     +
428     \sum\limits_{\vec j = \vec i}^{\vec M}
429     \Theta_{\vec j} \Theta_{\vec j - \vec i}.
430 \end{equation*}
431 Here coefficients $$\Theta$$ are calculated from back to front: from
432 $$\vec{i}=\vec{M}$$ to $$\vec{i}=\vec{0}$$. White noise variance is estimated by
433 \begin{equation*}
434     \Var{\epsilon} = \frac{\gamma_{\vec 0}}{
435     1
436     +
437     \sum\limits_{\vec j = \vec 0}^{\vec M}
438     \Theta_{\vec j}^2
439     }.
440 \end{equation*}
441 Authors of\nbsp{}cite:box1976time suggest using Newton---Raphson method to solve this
442 equation with higher precision, however, this method does not work in three
443 dimensions. Using slower method does not have dramatic effect on the overall
444 programme performance, because the number of coefficients is small and most of
445 the time is spent generating wavy surface.
446
447 ** Mixed autoregressive moving average (ARMA) process
448 :PROPERTIES:
449 :CUSTOM_ID: sec:how-to-mix-ARMA
450 :END:
451 Generally speaking, ARMA process is obtained by plugging MA generated wavy
452 surface as random impulse to AR process, however, in order to get the process
453 with desired ACF one should re-compute AR coefficients before plugging. There
454 are several approaches to "mix" AR and MA processes.
455 - The approach proposed in\nbsp{}cite:box1976time which involves dividing ACF into MA
456   and AR part along each dimension is not applicable here, because in three
457   dimensions such division is not possible: there always be parts of the ACF
458   that are not taken into account by AR and MA process.
459 - The alternative approach is to use the same (undivided) ACF for both AR and MA
460   processes but use different process order, however, then realisation
461   characteristics (mean, variance etc.) become skewed: these are characteristics
462   of the two overlapped processes.
463 For the first approach there is a formula to re-compute ACF for AR process, but
464 there is no such formula for the second approach. So, the best solution for now
465 is to simply use AR and MA process exclusively for different types of waves.
466
467 ** Process selection criteria for different wave profiles
468 :PROPERTIES:
469 :CUSTOM_ID: sec-process-selection
470 :END:
471 One problem of ARMA model application to ocean wave generation is that for
472 different types of wave profiles different processes /must/ be used: standing
473 waves are modelled by AR process, and propagating waves by MA process. This
474 statement comes from practice: if one tries to use the processes the other way
475 round, the resulting realisation either diverges or does not correspond to real
476 ocean waves. So, the best way to apply ARMA model to ocean wave generation is to
477 use AR process for standing waves and MA process for progressive waves.
478
479 The other problem is inability to automatically determine optimal number of
480 coefficients for three-dimensional AR and MA processes. For one-dimensional
481 processes this can be achieved via iterative methods\nbsp{}cite:box1976time, but
482 they diverge in three-dimensional case.
483
484 The final problem, which is discussed in [[#sec:how-to-mix-ARMA]], is inability to
485 "mix" AR and MA process in three dimensions.
486
487 In practice some statements made for AR and MA processes in\nbsp{}cite:box1976time
488 should be flipped for three-dimensional case. For example, the authors say that
489 ACF of MA process cuts at $$q$$ and ACF of AR process decays to nought infinitely,
490 but in practice making ACF of 3-dimensional MA process not decay results in it
491 being non-invertible and producing realisation that does not look like real
492 ocean waves, whereas doing the same for ACF of AR process results in stationary
493 process and adequate realisation. Also, the authors say that one
494 should allocate the first $$q$$ points of ACF to MA process (as it often needed to
495 describe the peaks in ACF) and leave the rest points to AR process, but in
496 practice in case of ACF of a propagating wave AR process is stationary only for
497 the first time slice of the ACF, and the rest is left to MA process.
498
499 To summarise, the only established scenario of applying ARMA model to ocean wave
500 generation is to use AR process for standing waves and MA process for
501 propagating waves. With a new formulae for 3 dimensions a single mixed ARMA
502 process might increase model precision, which is one of the objectives of the
503 future research.
504
505 ** The shape of ACF for different types of waves
506 **** Analytic method of finding the ACF
507 The straightforward way to find ACF for a given ocean wave profile is to apply
508 Wiener---Khinchin theorem. According to this theorem the autocorrelation $$K$$ of
509 a function $$\zeta$$ is given by the Fourier transform of the absolute square of
510 the function:
511
512   K(t) = \Fourier{\left| \zeta(t) \right|^2}.
513   \label{eq-wiener-khinchin}
514
515 When $$\zeta$$ is replaced with actual wave profile, this formula gives you
516 analytic formula for the corresponding ACF.
517
518 For three-dimensional wave profile (2D in space and 1D in time) analytic formula
519 is a polynomial of high order and is best obtained via symbolic computation
520 programme. Then for practical usage it can be approximated by superposition of
521 exponentially decaying cosines (which is how ACF of a stationary ARMA process
522 looks like\nbsp{}cite:box1976time).
523
524 **** Empirical method of finding the ACF
525 However, for three-dimensional case there exists simpler empirical method which
526 does not require sophisticated software to determine shape of the ACF. It is
527 known that ACF represented by exponentially decaying cosines satisfies first
528 order Stokes' equations for gravity waves\nbsp{}cite:boccotti1983wind. So, if the
529 shape of the wave profile is the only concern in the simulation, then one can
530 simply multiply it by a decaying exponent to get appropriate ACF. This ACF does
531 not reflect other wave profile parameters, such as wave height and period, but
532 opens possibility to simulate waves of a particular non-analytic shape by
533 "drawing" their profile, then multiplying it by an exponent and using the
534 resulting function as ACF. So, this empirical method is imprecise but offers
535 simpler alternative to Wiener---Khinchin theorem approach; it is mainly useful
536 to test ARMA model.
537
538 **** Standing wave ACF
539 For three-dimensional plain standing wave the profile is given by
540
541   \zeta(t, x, y) = A \sin (k_x x + k_y y) \sin (\sigma t).
542   \label{eq-standing-wave}
543
544 Find ACF via analytic method. Multiplying the formula by a decaying exponent
545 (because Fourier transform is defined for a function $$f$$ that
546 $$f\underset{x\rightarrow\pm\infty}{\longrightarrow}0$$) yields
547
548   \zeta(t, x, y) =
549   A
550   \exp\left[-\alpha (|t|+|x|+|y|) \right]
551   \sin (k_x x + k_y y) \sin (\sigma t).
552   \label{eq-decaying-standing-wave}
553
554 Then, apply 3D Fourier transform to the both sides of the equation via symbolic
555 computation programme, fit the resulting polynomial to the following
556 approximation:
557
558   K(t,x,y) =
559   \gamma
560   \exp\left[-\alpha (|t|+|x|+|y|) \right]
561   \cos \beta t
562   \cos \left[ \beta x + \beta y \right].
563   \label{eq-standing-wave-acf}
564
565 So, after applying Wiener---Khinchin theorem we get initial formula but with
566 cosines instead of sines. This difference is important because the value of ACF
567 at $$(0,0,0)$$ equals to the ARMA process variance, and if one used sines the
568 value would be wrong.
569
570 If one tries to replicate the same formula via empirical method, the usual way
571 is to adapt\nbsp{}eqref:eq-decaying-standing-wave to
572 match\nbsp{}eqref:eq-standing-wave-acf. This can be done either by changing the
573 phase of the sine, or by substituting sine with cosine to move the maximum of
574 the function to the origin of coordinates.
575
576 **** Propagating wave ACF
577 Three-dimensional profile of plain propagating wave is given by
578
579   \zeta(t, x, y) = A \cos (\sigma t + k_x x + k_y y).
580   \label{eq-propagating-wave}
581
582 For the analytic method repeating steps from the previous two paragraphs yields
583
584   K(t,x,y) =
585   \gamma
586   \exp\left[-\alpha (|t|+|x|+|y|) \right]
587   \cos\left[\beta (t+x+y) \right].
588   \label{eq-propagating-wave-acf}
589
590 For the empirical method the wave profile is simply multiplied by a decaying
591 exponent without need to adapt the maximum value of ACF (as it is required for
592 standing wave).
593
594 **** Comparison of studied methods
595 To summarise, the analytic method of finding ocean wave's ACF reduces to the
596 following steps.
597 - Make wave profile decay when approaching $$\pm\infty$$ by multiplying it by
598   a decaying exponent.
599 - Apply Fourier transform to the absolute square of the resulting equation using
600   symbolic computation programme.
601 - Fit the resulting polynomial to the appropriate ACF approximation.
602
603 Two examples in this section showed that in case of standing and propagating
604 waves their decaying profiles resemble the corresponding ACFs with the exception
605 that the ACF's maximum should be moved to the origin to preserve simulated
606 process variance. Empirical method of finding ACF reduces to the following
607 steps.
608 - Make wave profile decay when approaching $$\pm\infty$$ by multiplying it by
609   a decaying exponent.
610 - Move maximum value of the resulting function to the origin by using
611   trigonometric identities to shift the phase.
612
613 ** Evaluation and discussion
614 In\nbsp{}cite:degtyarev2011modelling,degtyarev2013synoptic,boukhanovsky1997thesis
615 for AR model the following items were verified experimentally:
616 - probability distributions of different wave characteristics (wave heights,
617   lengths, crests, periods, slopes, three-dimensionality),
618 - dispersion relation,
619 - retention of integral characteristics for mixed wave sea state.
620 In this work we repeat probability distribution tests for three-dimensional AR
621 and MA model.
622
623 In\nbsp{}cite:rozhkov1990 the authors show that several ocean wave
624 characteristics (listed in table\nbsp{}[[tab-weibull-shape]]) have Weibull
625 distribution, and wavy surface elevation has Gaussian distribution. In order to
626 verify that distributions corresponding to generated realisation are correct,
627 quantile-quantile plots are used (plots where analytic quantile values are used
628 for $$OX$$ axis and estimated quantile values for $$OY$$ axis). If the estimated
629 distribution matches analytic then the graph has the form of the straight line.
630 Tails of the graph may diverge from the straight line, because they can not be
631 reliably estimated from the finite-size realisation. Different methods of
632 extracting waves from realisation produce variations in quantile function tails,
633 it is probably impractical to extract every possible wave from realisation since
634 they may (and often) overlap.
635
636 #+name: tab-weibull-shape
637 #+caption: Values of Weibull shape parameter for different wave characteristics.
638 #+attr_latex: :booktabs t
639 | Characteristic       | Weibull shape ($$k$$) |
640 |----------------------+---------------------|
641 | Wave height          |                   2 |
642 | Wave length          |                 2.3 |
643 | Crest length         |                 2.3 |
644 | Wave period          |                   3 |
645 | Wave slope           |                 2.5 |
646 | Three-dimensionality |                 2.5 |
647
648 Verification was performed for standing and propagating waves. The corresponding
649 ACFs and quantile-quantile plots of wave characteristics distributions are shown
650 in
651 fig.\nbsp{}[[propagating-wave-distributions]],\nbsp{}[[standing-wave-distributions]],\nbsp{}[[acf-slices]].
652
653 #+name: propagating-wave-distributions
654 #+header: :width 4.5 :height 5
655 #+begin_src R :file build/propagating-wave-qqplots.eps
656 source(file.path("R", "common.R"))
657 par(pty="s", mfrow=c(2, 2), mai = c(1, 0.2, 0.2, 0.2))
658 arma.qqplot_grid(
659   file.path("build", "arma-benchmarks", "verification", "propagating_wave"),
660   c("elevation", "heights_y", "lengths_y", "periods"),
661   c("elevation", "height Y", "length Y", "period"),
662   xlab="x",
663   ylab="y"
664 )
665 #+end_src
666
667 #+caption: Quantile-quantile plots for propagating waves.
668 #+label: propagating-wave-distributions
669 #+RESULTS: propagating-wave-distributions
670 [[file:build/propagating-wave-qqplots.eps]]
671
672 #+name: standing-wave-distributions
673 #+header: :width 4.5 :height 5
674 #+begin_src R :file build/standing-wave-qqplots.eps
675 source(file.path("R", "common.R"))
676 par(pty="s", mfrow=c(2, 2), mai = c(1, 0.2, 0.2, 0.2))
677 arma.qqplot_grid(
678   file.path("build", "arma-benchmarks", "verification", "standing_wave"),
679   c("elevation", "heights_y", "lengths_y", "periods"),
680   c("elevation", "height Y", "length Y", "period"),
681   xlab="x",
682   ylab="y"
683 )
684 #+end_src
685
686 #+caption: Quantile-quantile plots for standing waves.
687 #+label: standing-wave-distributions
688 #+RESULTS: standing-wave-distributions
689 [[file:build/standing-wave-qqplots.eps]]
690
691 #+name: acf-slices
692 #+header: :width 4.5 :height 7
693 #+begin_src R :file build/acf-slices.eps
694 source(file.path("R", "common.R"))
695 propagating_acf <- read.csv(file.path("build", "arma-benchmarks", "verification", "propagating_wave", "acf.csv"))
696 standing_acf <- read.csv(file.path("build", "arma-benchmarks", "verification", "standing_wave", "acf.csv"))
697 par(mfrow=c(5, 2), mar=c(0,0,0,0), mai=c(0.1, 0.1, 0.1, 0.1))
698 for (i in seq(0, 4)) {
699   arma.wavy_plot(standing_acf, i, zlim=c(-1,1))
700   arma.wavy_plot(propagating_acf, i, zlim=c(-1,1))
701 }
702 #+end_src
703
704 #+caption: Time slices of ACF for standing (left column) and propagating waves (right column).
705 #+label: acf-slices
706 #+RESULTS: acf-slices
707 [[file:build/acf-slices.eps]]
708
709 Graph tails in fig.\nbsp{}[[propagating-wave-distributions]] deviate from original
710 distribution for individual wave characteristics, because every wave have to be
711 extracted from the resulting wavy surface to measure its length, period and
712 height. There is no algorithm that guarantees correct extraction of all waves,
713 because they may overlap each other. Weibull distribution right tail represents
714 infrequently occurring waves, so it deviates more than left tail.
715
716 Degree of correspondence for standing waves
717 (fig.\nbsp{}[[standing-wave-distributions]]) is lower for height and length, is
718 roughly the same for surface elevation and is higher for wave period
719 distribution tails. Lower correspondence degree for length and height may be
720 attributed to the fact that Weibull distributions were obtained empirically for
721 ocean waves which are typically propagating, and distributions may be different
722 for standings waves. Higher correspondence degree for wave periods is attributed
723 to the fact that wave periods of standing waves are extracted more precisely as
724 the waves do not move outside simulated wavy surface region. The same
725 correspondence degree for wave elevation is obtained, because this is the
726 characteristic of the wavy surface (and corresponding AR or MA process) and is
727 not affected by the type of waves.
728
729 ARMA model, owing to its non-physical nature, does not have the notion of ocean
730 wave; it simulates wavy surface as a whole instead. Motions of individual waves
731 and their shape are often rough, and the total number of waves can not be
732 determined precisely. However, integral characteristics of wavy surface match
733 the ones of real ocean waves.
734
735 Theoretically, ocean waves themselves can be chosen as ACFs, the only
736 pre-processing step is to make them decay exponentially. This may allow
737 to generate waves of arbitrary profiles, and is one of the directions of future
738 work.
739
740 * Determining wave pressures for discretely given wavy surface
741 :PROPERTIES:
742 :CUSTOM_ID: sec-pressures
743 :END:
744
745 Analytic solutions to boundary problems in classical equations are often used to
746 study different properties of the solution, and for that purpose general
747 solution formula is too difficult to study, as it contains integrals of unknown
748 functions. Fourier method is one of the methods to find analytic solutions to a
749 PDE. It is based on application of Fourier transform to each part of PDE, which
750 reduces the equation to algebraic, and the solution is written as inverse
751 Fourier transform of some function (which may contain Fourier transforms of
752 other functions). Since, it is not possible to write analytic forms of these
753 Fourier transforms in all cases, unique solutions are found and their behaviour
754 is studied in different domains instead. At the same time, computing discrete
755 Fourier transforms on the computer is possible for any discretely defined
756 function and efficient when using FFT algorithms. These algorithms use symmetry
757 of complex exponentials to decrease asymptotic complexity from
758 $$\mathcal{O}(n^2)$$ to $$\mathcal{O}(n\log_{2}n)$$. So, even if general
759 solution contains Fourier transforms of unknown functions, they still can be
760 computed numerically, and FFT family of algorithms makes this approach
761 efficient.
762
763 Alternative approach to solve a PDE is to reduce it to difference equations, which
764 are solved by constructing various numerical schemes. This approach leads to
765 approximate solution, and asymptotic complexity of corresponding algorithms is
766 comparable to that of FFT. For example, stationary elliptic PDE transforms to
767 implicit numerical scheme which is solved by iterative method on each step of
768 which a tridiagonal or five-diagonal system of algebraic equations is solved via
769 Thomas algorithm. Asymptotic complexity of this approach is
770 $$\mathcal{O}({n}{m})$$, where $$n$$\nbsp{}--- number of wavy surface grid
771 points, $$m$$\nbsp{}--- number of iterations. Despite their wide spread,
772 iterative algorithms are inefficient on parallel computer architectures; in
773 particular, their mapping to co-processors may involve copying data in and out
774 of the co-processor in each iteration, which negatively affects their
775 performance. At the same time, high number of Fourier transforms in the solution
776 is an advantage, rather than a disadvantage. First, solutions obtained by
777 Fourier method are explicit, hence their implementations scales with the large
778 number of parallel computer cores. Second, there are implementations of FFT
779 optimised for different processor architectures as well as co-processors (GPU,
780 MIC) which makes it easy to get high performance on any computing platform.
781 These advantages substantiate the choice of Fourier method to obtain explicit
782 analytic solution to the problem of determining pressures under wavy ocean
783 surface.
784
785 The problem of finding pressure field under wavy sea surface represents inverse
786 problem of hydrodynamics for incompressible inviscid fluid. System of equations
787 for it in general case is written as\nbsp{}cite:kochin1966theoretical
788 \begin{align}
789     & \nabla^2\phi = 0,\nonumber\\
790     & \phi_t+\frac{1}{2} |\vec{\upsilon}|^2 + g\zeta=-\frac{p}{\rho}, & \text{at }z=\zeta(x,y,t),\label{eq-problem}\\
791     & D\zeta = \nabla \phi \cdot \vec{n}, & \text{at }z=\zeta(x,y,t),\nonumber
792 \end{align}
793 where $$\phi$$\nbsp{}--- velocity potential, $$\zeta$$\nbsp{}--- elevation
794 ($$z$$ coordinate) of wavy surface, $$p$$\nbsp{}--- wave pressure,
795 $$\rho$$\nbsp{}--- fluid density,
796 $$\vec{\upsilon}=(\phi_x,\phi_y,\phi_z)$$\nbsp{}--- velocity vector,
797 $$g$$\nbsp{}--- acceleration of gravity, and $$D$$\nbsp{}--- substantial
798 (Lagrange) derivative. The first equation is called continuity (Laplace)
799 equation, the second one is the conservation of momentum law (the so called
800 dynamic boundary condition); the third one is kinematic boundary condition for
801 free wavy surface, which states that rate of change of wavy surface elevation
802 ($$D\zeta$$) equals to the change of velocity potential derivative along the
803 wavy surface normal ($$\nabla\phi\cdot\vec{n}$$).
804
805 Inverse problem of hydrodynamics consists in solving this system of equations
806 for $$\phi$$. In this formulation dynamic boundary condition becomes an explicit
807 formula to determine pressure field using velocity potential derivatives
808 obtained from the remaining equations. So, from mathematical point of view
809 inverse problem of hydrodynamics reduces to Laplace equation with mixed boundary
810 condition\nbsp{}--- Robin problem.
811
812 ** Two-dimensional case
813 :PROPERTIES:
814 :CUSTOM_ID: sec:pressure-2d
815 :END:
816 **** Formula for infinite depth fluid
817 Two-dimensional Laplace equation with Robin boundary condition is written as
818 \begin{align}
819     \label{eq-problem-2d}
820     & \phi_{xx}+\phi_{zz}=0,\\
821     & \zeta_t + \zeta_x\phi_x = \frac{\zeta_x}{\sqrt{1 + \zeta_x^2}} \phi_x - \phi_z, & \text{at }z=\zeta(x,t).\nonumber
822 \end{align}
823 Use Fourier method to solve this problem. Applying Fourier transform to both
824 sides of the equation yields
825 \begin{equation*}
826     -4 \pi^2 \left( u^2 + v^2 \right)
827     \FourierY{\phi(x,z)}{u,v} = 0,
828 \end{equation*}
829 hence $$v = \pm i u$$. Hereinafter we use the following symmetric form of Fourier
830 transform:
831 \begin{equation*}
832     \FourierY{f(x,y)}{u,v} =
833     \iint\limits_{-\infty}^{\phantom{--}\infty}
834     f(x,y)
835     e^{-2\pi i (x u + y v)}
836     dx dy.
837 \end{equation*}
838 We seek solution in the form of inverse Fourier transform
839 $$\phi(x,z)=\InverseFourierY{E(u,v)}{x,z}$$. Plugging[fn::$$v={-i}{u}$$ is not
840 applicable because velocity potential must go to nought when depth goes to
841 infinity.] $$v={i}{u}$$ into the formula yields
842
843     \label{eq-guessed-sol-2d}
844     \phi(x,z) = \InverseFourierY{e^{2\pi u z}E(u)}{x}.
845
846 In order to make substitution $$z=\zeta(x,t)$$ not interfere with Fourier
847 transforms, we rewrite\nbsp{}eqref:eq-guessed-sol-2d as a convolution:
848 \begin{equation*}
849     \phi(x,z)
850     =
851     \Fun{z}
852     \ast
853     \InverseFourierY{E(u)}{x},
854 \end{equation*}
855 where $$\Fun{z}$$\nbsp{}--- a function, form of which is defined in
856 sec.\nbsp{}[[#sec:compute-delta]] and which satisfies equation
857 $$\FourierY{\Fun{z}}{u}=e^{2\pi{u}{z}}$$. Plugging formula $$\phi$$ into the
858 boundary condition yields
859 \begin{equation*}
860     \zeta_t
861     =
862     \left( i f(x) - 1 \right)
863     \left[
864         \Fun{z}
865         \ast
866         \InverseFourierY{2\pi u E(u)}{x}
867     \right],
868 \end{equation*}
869 where $$f(x)={\zeta_x}/{\sqrt{1+\zeta_x^2}}-\zeta_x$$. Applying Fourier transform
870 to both sides of this equation yields formula for coefficients $$E$$:
871 \begin{equation*}
872     E(u) =
873     \frac{1}{2\pi u}
874     \frac{
875     \FourierY{\zeta_t / \left(i f(x) - 1\right)}{u}
876     }{
877     \FourierY{\Fun{z}}{u}
878     }
879 \end{equation*}
880 Finally, substituting $$z$$ for $$\zeta(x,t)$$ and plugging resulting equation
881 into\nbsp{}eqref:eq-guessed-sol-2d yields formula for $$\phi(x,z)$$:
882
883     \label{eq-solution-2d}
884     \boxed{
885         \phi(x,z)
886         =
887         \InverseFourierY{
888             \frac{e^{2\pi u z}}{2\pi u}
889             \frac{
890             \FourierY{ \zeta_t / \left(i f(x) - 1\right) }{u}
891             }{
892             \FourierY{ \Fun{\zeta(x,t)} }{u}
893             }
894         }{x}.
895     }
896
897
898 Multiplier $$e^{2\pi{u}{z}}/(2\pi{u})$$ makes a graph of a function to which
899 Fourier transform is applied asymmetric with respect to $$OY$$ axis. This makes
900 it difficult to apply FFT which expects periodic function with nought on both
901 ends of the interval. Using numerical integration instead of FFT is not faster
902 than solving the initial system of equations with numerical schemes. This
903 problem is alleviated by using formula\nbsp{}eqref:eq-solution-2d-full for finite
904 depth fluid with wittingly large depth $$h$$. This formula is derived in the
905 following section.
906
907 **** Formula for finite depth fluid
908 On the sea bottom vertical fluid velocity component equals nought: $$\phi_z=0$$ on
909 $$z=-h$$, where $$h$$\nbsp{}--- water depth. In this case equation $$v=-{i}{u}$$, which came
910 from Laplace equation, can not be neglected, hence the solution is sought in the
911 following form:
912
913     \phi(x,z)
914     =
915     \InverseFourierY{
916         \left( C_1 e^{2\pi u z} + C_2 e^{-2\pi u z} \right)
917         E(u)
918     }{x}.
919     \label{eq-guessed-sol-2d-full}
920
921 Plugging $$\phi$$ into the boundary condition on the sea bottom yields
922 \begin{equation*}
923     C_1 e^{-2\pi u h} - C_2 e^{2\pi u h} = 0,
924 \end{equation*}
925 hence $$C_1=\frac{1}{2}C{e}^{2\pi{u}{h}}$$ and
926 $$C_2=-\frac{1}{2}C{e}^{-2\pi{u}{h}}$$. Constant $$C$$ may take arbitrary value
927 here, because after plugging it becomes part of unknown coefficients $$E(u)$$.
928 Plugging formulae for $$C_1$$ and $$C_2$$
929 into\nbsp{}eqref:eq-guessed-sol-2d-full yields
930 \begin{equation*}
931     \phi(x,z) = \InverseFourierY{ \Sinh{2\pi u (z+h)} E(u) }{x}.
932 \end{equation*}
933 Plugging $$\phi$$ into the boundary condition on the free surface yields
934 \begin{equation*}
935     \zeta_t = f(x) \InverseFourierY{ 2\pi i u \Sinh{2\pi u (z+h)} E(u) }{x}
936             - \InverseFourierY{ 2\pi u \SinhX{2\pi u (z+h)} E(u) }{x}.
937 \end{equation*}
938 Here $$\sinh$$ and $$\cosh$$ give similar results near free surface, and since this
939 is the main area of interest in practical applications, we assume that
940 $$\Sinh{2\pi{u}(z+h)}\approx\SinhX{2\pi{u}(z+h)}$$. Performing analogous to the
941 previous section transformations yields final formula for $$\phi(x,z)$$:
942
943 \boxed{
944     \phi(x,z,t)
945     =
946   \InverseFourierY{
947         \frac{\Sinh{2\pi u (z+h)}}{2\pi u}
948         \frac{
949             \FourierY{ \zeta_t / \left(i f(x) - 1\right) }{u}
950         }{
951             \FourierY{ \FunSecond{\zeta(x,t)} }{u}
952         }
953     }{x},
954 }
955     \label{eq-solution-2d-full}
956
957 where $$\FunSecond{z}$$\nbsp{}--- a function, form of which is defined in
958 sec.\nbsp{}[[#sec:compute-delta]] and which satisfies equation
959 $$\FourierY{\FunSecond{z}}{u}=\Sinh{2\pi{u}{z}}$$.
960
961 **** Reducing to the formulae from linear wave theory
962 Check the validity of derived formulae by substituting $$\zeta(x,t)$$ with known
963 analytic formula for plain waves. Symbolic computation of Fourier transforms in
964 this section were performed in Mathematica\nbsp{}cite:mathematica10. In the
965 framework of linear wave theory assume that waves have small amplitude compared
966 to their lengths, which allows us to simplify initial system of
967 equations\nbsp{}eqref:eq-problem-2d to
968 \begin{align*}
969     & \phi_{xx}+\phi_{zz}=0,\\
970     & \zeta_t = -\phi_z & \text{at }z=\zeta(x,t),
971 \end{align*}
972 solution to which is written as
973 \begin{equation*}
974     \phi(x,z,t)
975     =
976     -\InverseFourierY{
977         \frac{e^{2\pi u z}}{2\pi u}
978         \FourierY{\zeta_t}{u}
979     }{x}
980     .
981 \end{equation*}
982 Propagating wave profile is defined as $$\zeta(x,t)=A\cos(2\pi(kx-t))$$.
983 Plugging this formula into\nbsp{}eqref:eq-solution-2d yields
984 $$\phi(x,z,t)=-\frac{A}{k}\sin(2\pi(kx-t))\Sinh{2\pi{k}{z}}$$. In order to
985 reduce it to the formula from linear wave theory, rewrite hyperbolic sine in
986 exponential form, discard the term containing $$e^{-2\pi{k}{z}}$$ as
988 $$\phi\underset{z\rightarrow-\infty}{\longrightarrow}0$$. Taking real part of
989 the resulting formula yields
990 $$\phi(x,z,t)=\frac{A}{k}e^{2\pi{k}{z}}\sin(2\pi(kx-t))$$, which corresponds to
991 the known formula from linear wave theory. Similarly, under small-amplitude
992 waves assumption the formula for finite depth
993 fluid\nbsp{}eqref:eq-solution-2d-full is reduced to
994 \begin{equation*}
995     \phi(x,z,t)
996     =
997     -\InverseFourierY{
998         \frac{\Sinh{2\pi u (z+h)}}{2\pi u \Sinh{2\pi u h}}
999         \FourierY{\zeta_t}{u}
1000     }{x}.
1001 \end{equation*}
1002 Substituting $$\zeta(x,t)$$ with propagating plain wave profile formula yields
1003
1004     \label{eq-solution-2d-linear}
1005     \phi(x,z,t)=\frac{A}{k}
1006     \frac{\Sinh{2 \pi k (z+h)}}{ \Sinh{2 \pi k h} }
1007     \sin(2 \pi (k x-t)),
1008
1009 which corresponds to the formula from linear wave theory for finite depth fluid.
1010
1011 Different forms of Laplace equation solutions, in which decaying exponent is
1012 written with either "+" or "-" signs, may cause incompatibilities between
1013 formulae from linear wave theory and formulae derived in this work, where
1014 $$\sinh$$ is used instead of $$\cosh$$. Equality
1015 $$\frac{\Sinh{2\pi{k}(z+h)}}{\Sinh{2\pi{k}{h}}}\approx\frac{\sinh(2\pi{k}(z+h))}{\sinh(2\pi{k}{h})}$$
1016 becomes strict on the free surface, and difference between left-hand and
1017 right-hand sides increases when approaching sea bottom (for sufficiently large
1018 depth difference near free surface is negligible). So, for sufficiently large
1019 depth any function ($$\cosh$$ or $$\sinh$$) may be used for velocity potential
1020 computation near free surface.
1021
1022 Reducing\nbsp{}eqref:eq-solution-2d and\nbsp{}eqref:eq-solution-2d-full to the
1023 known formulae from linear wave theory shows, that formula for infinite
1024 depth\nbsp{}eqref:eq-solution-2d is not suitable to compute velocity potentials
1025 with Fourier method, because it does not have symmetry, which is required for
1026 Fourier transform. However, formula for finite depth can be used instead by
1027 setting $$h$$ to some characteristic water depth. For standing wave reducing to
1028 linear wave theory formulae is made under the same assumptions.
1029
1030 ** Three-dimensional case
1031 Three-dimensional version of\nbsp{}eqref:eq-problem is written as
1032 \begin{align}
1033     \label{eq-problem-3d}
1034     & \phi_{xx} + \phi_{yy} + \phi_{zz} = 0,\\
1035     & \zeta_t + \zeta_x\phi_x + \zeta_y\phi_y
1036     =
1037     \frac{\zeta_x}{\SqrtZeta{1 + \zeta_x^2 + \zeta_y^2}} \phi_x
1038     +\frac{\zeta_y}{\SqrtZeta{1 + \zeta_x^2 + \zeta_y^2}} \phi_y
1039     - \phi_z, & \text{at }z=\zeta(x,y,t).\nonumber
1040 \end{align}
1041 Again, use Fourier method to solve it. Applying Fourier transform to both sides
1042 of Laplace equation yields
1043 \begin{equation*}
1044     -4 \pi^2 \left( u^2 + v^2 + w^2 \right)
1045     \FourierY{\phi(x,y,z)}{u,v,w} = 0,
1046 \end{equation*}
1047 hence $$w=\pm{i}\sqrt{u^2+v^2}$$. We seek solution in the form of inverse Fourier
1048 transform $$\phi(x,y,z)=\InverseFourierY{E(u,v,w)}{x,y,z}$$. Plugging
1049 $$w=i\sqrt{u^2+v^2}=i\Kveclen$$ into the formula yields
1050 \begin{equation*}
1051     \phi(x,y,z) = \InverseFourierY{
1052         \left(
1053             C_1 e^{2\pi \Kveclen z}
1054             -C_2 e^{-2\pi \Kveclen z}
1055         \right)
1056         E(u,v)
1057     }{x,y}.
1058 \end{equation*}
1059 Plugging $$\phi$$ into the boundary condition on the sea bottom (analogous to
1060 two-dimensional case) yields
1061
1062     \label{eq-guessed-sol-3d}
1063     \phi(x,y,z) = \InverseFourierY{
1064         \Sinh{2\pi \Kveclen (z+h)} E(u,v)
1065     }{x,y}.
1066
1067 Plugging $$\phi$$ into the boundary condition on the free surface yields
1068 \begin{equation*}
1069     \arraycolsep=1.4pt
1070     \begin{array}{rl}
1071         \zeta_t = & i f_1(x,y) \InverseFourierY{2 \pi u \Sinh{2\pi \Kveclen (z+h)}E(u,v)}{x,y} \\
1072         + & i f_2(x,y) \InverseFourierY{2 \pi v \Sinh{2\pi \Kveclen (z+h)}E(u,v)}{x,y} \\
1073         - & \InverseFourierY{2 \pi \Kveclen \SinhX{2\pi \Kveclen (z+h)}E(u,v)}{x,y}
1074     \end{array}
1075 \end{equation*}
1076 where $$f_1(x,y)={\zeta_x}/{\SqrtZeta{1+\zeta_x^2+\zeta_y^2}}-\zeta_x$$ and
1077 $$f_2(x,y)={\zeta_y}/{\SqrtZeta{1+\zeta_x^2+\zeta_y^2}}-\zeta_y$$.
1078
1079 Like in sec.\nbsp{}[[#sec:pressure-2d]] we assume that
1080 $$\Sinh{2\pi{u}(z+h)}\approx\SinhX{2\pi{u}(z+h)}$$ near free surface, but in
1081 three-dimensional case this is not enough to solve the problem. In order to get
1082 analytic formula for coefficients $$E$$ we need to assume, that all Fourier
1083 transforms in the equation have radially symmetric kernels, i.e. replace $$u$$
1084 and $$v$$ with $$\Kveclen$$. There are two points supporting this assumption.
1085 First, in numerical implementation integration is done over positive wave
1086 numbers, so the sign of $$u$$ and $$v$$ does not affect the solution. Second,
1087 the rate growth of $$\cosh$$ term of the integral kernel is much higher than the
1088 one of $$u$$ or $$\Kveclen$$, so the substitution has small effect on the
1089 magnitude of the solution. Despite these two points, a use of more
1090 mathematically rigorous approach would be preferable.
1091
1092 Making the replacement, applying Fourier transform to both sides of the equation
1093 and plugging the result into\nbsp{}eqref:eq-guessed-sol-3d yields formula for
1094 $$\phi$$:
1095 \begin{equation*}
1096     \phi(x,y,z,t) = \InverseFourierY{
1097         \frac{ \Sinh{\smash{2\pi \Kveclen (z+h)}} }{ 2\pi\Kveclen }
1098         \frac{ \FourierY{ \zeta_t / \left( i f_1(x,y) + i f_2(x,y) - 1 \right)}{u,v} }
1099         { \FourierY{\mathcal{D}_3\left( x,y,\zeta\left(x,y\right) \right)}{u,v} }
1100     }{x,y}\label{eq-phi-high-amp},
1101 \end{equation*}
1102 where $$\FourierY{\mathcal{D}_3\left(x,y,z\right)}{u,v}=\Sinh{\smash{2\pi\Kveclen{}z}}$$.
1103
1104 ** Evaluation and discussion
1105 :PROPERTIES:
1106 :CUSTOM_ID: sec:compare-formulae
1107 :END:
1108
1109 Comparing obtained generic formulae\nbsp{}eqref:eq-solution-2d
1110 and\nbsp{}eqref:eq-solution-2d-full to the known formulae from linear wave
1111 theory allows to see the difference between velocity fields for both large and
1112 small amplitude waves. In general analytic formula for velocity potential in not
1113 known, even for plain waves, so comparison is done numerically. Taking into
1114 account conclusions of\nbsp{}[[#sec:pressure-2d]], only finite depth formulae are
1115 compared.
1116
1117 **** The difference with linear wave theory formulae
1118 In order to obtain velocity potential fields, ocean wavy surface was generated
1119 by AR model with varying wave amplitude. In numerical implementation wave
1120 numbers in Fourier transforms were chosen on the interval from $$0$$ to the
1121 maximal wave number determined numerically from the obtained wavy surface.
1122 Experiments were conducted for waves of both small and large amplitudes.
1123
1124 The experiment showed that velocity potential fields produced by formula
1125 eqref:eq-solution-2d-full for finite depth fluid and formula
1126 eqref:eq-solution-2d-linear from linear wave theory are qualitatively different
1127 (fig.\nbsp{}[[fig-potential-field-nonlinear]]). First, velocity potential contours
1128 have sinusoidal shape, which is different from oval shape described by linear
1129 wave theory. Second, velocity potential decays more rapidly than in linear wave
1130 theory as getting closer to the bottom, and the region where the majority of
1131 wave energy is concentrated is closer to the wave crest. Similar numerical
1132 experiment, in which all terms of eqref:eq-solution-2d-full that are neglected
1133 in the framework of linear wave theory are eliminated, shows no difference (as
1134 much as machine precision allows) in resulting velocity potential fields.
1135
1136 #+name: fig-potential-field-nonlinear
1137 #+header: :width 8 :height 11
1138 #+begin_src R :file build/plain-wave-velocity-field-comparison.eps
1139 source(file.path("R", "velocity-potentials.R"))
1140 par(pty="s")
1141 nlevels <- 41
1142 levels <- pretty(c(-200,200), nlevels)
1143 palette <- colorRampPalette(c("blue", "lightyellow", "red"))
1144 #palette <- colorRampPalette(c("black", "white"))
1145 col <- palette(nlevels-1)
1146
1147 # linear solver
1148 par(fig=c(0,0.95,0,0.5),new=TRUE)
1149 arma.plot_velocity_potential_field(
1150   file.path("build", "arma-benchmarks", "verification", "plain_wave_linear_solver"),
1151   levels=levels,
1152   col=col
1153 )
1154
1155 # high-amplitude solver
1156 par(fig=c(0,0.95,0.5,1),new=TRUE)
1157 arma.plot_velocity_potential_field(
1158   file.path("build", "arma-benchmarks", "verification", "plain_wave_high_amplitude_solver"),
1159   levels=levels,
1160   col=col
1161 )
1162
1163 # legend 1
1164 par(pty="m",fig=c(0.80,1,0.5,1), new=TRUE)
1165 arma.plot_velocity_potential_field_legend(
1166   levels=levels,
1167   col=col
1168 )
1169
1170 # legend 2
1171 par(pty="m",fig=c(0.80,1,0,0.5), new=TRUE)
1172 arma.plot_velocity_potential_field_legend(
1173   levels=levels,
1174   col=col
1175 )
1176 #+end_src
1177
1178 #+caption: Velocity potential field of propagating wave $$\zeta(x,y,t) = \cos(2\pi x - t/2)$$. Field produced by formula eqref:eq-solution-2d-full (top) and linear wave theory formula (bottom).
1179 #+label: fig-potential-field-nonlinear
1180 #+attr_latex: :width \textwidth
1181 #+RESULTS: fig-potential-field-nonlinear
1182 [[file:build/plain-wave-velocity-field-comparison.eps]]
1183
1184 **** The difference with small-amplitude wave theory
1185 The experiment, in which velocity fields produced numerically by different
1186 formulae were compared, shows that velocity fields produced by formula
1187 eqref:eq-solution-2d-full and eqref:eq-old-sol-2d correspond to each other for
1188 small-amplitude waves. Two ocean wavy surface realisations were made by AR
1189 model: one containing small-amplitude waves, other containing large-amplitude
1190 waves. Integration in formula eqref:eq-solution-2d-full was done over wave
1191 numbers range extracted from the generated wavy surface. For small-amplitude
1192 waves both formulae showed comparable results (the difference in the velocity is
1193 attributed to the stochastic nature of AR model), whereas for large-amplitude
1194 waves stable velocity field was produced only by formula
1195 eqref:eq-solution-2d-full (fig.\nbsp{}[[fig-velocity-field-2d]]). So, generic
1196 formula eqref:eq-solution-2d-full gives satisfactory results without restriction
1197 on wave amplitudes.
1198
1199 #+name: fig-velocity-field-2d
1200 #+header: :width 8 :height 11
1201 #+begin_src R :file build/large-and-small-amplitude-velocity-field-comparison.eps
1202 source(file.path("R", "velocity-potentials.R"))
1203 linetypes = c("solid", "dashed")
1204 par(mfrow=c(2, 1))
1205 arma.plot_velocity(
1206   file.path("data", "velocity", "low-amp"),
1207   file.path("data", "velocity", "low-amp-0"),
1208   linetypes=linetypes,
1209   ylim=c(-2,2)
1210 )
1211 arma.plot_velocity(
1212   file.path("data", "velocity", "high-amp"),
1213   file.path("data", "velocity", "high-amp-0"),
1214   linetypes=linetypes,
1215   ylim=c(-2,2)
1216 )
1217 #+end_src
1218
1219 #+label: fig-velocity-field-2d
1220 #+caption: Comparison of velocity field on the ocean wavy surface obtained by generic formula ($$u_1$$) and formula for small-amplitude waves ($$u_2$$). Velocity field for realisations containing small-amplitude (top) and large-amplitude (bottom) waves.
1221 #+attr_latex: :width \textwidth
1222 #+RESULTS: fig-velocity-field-2d
1223 [[file:build/large-and-small-amplitude-velocity-field-comparison.eps]]
1224
1225 * High-performance software implementation for heterogeneous platforms
1226 :PROPERTIES:
1227 :CUSTOM_ID: sec:arma-algorithms
1228 :END:
1229 ** White noise generation
1230 In order to eliminate periodicity from generated wavy surface, it is imperative
1231 to use PRNG with sufficiently large period to generate white noise. Parallel
1232 Mersenne Twister\nbsp{}cite:matsumoto1998mersenne with a period of
1233 $$2^{19937}-1$$ is used as a generator in this work. It allows to produce
1234 aperiodic ocean wavy surface realisations in any practical usage scenarios.
1235
1236 There is no guarantee that multiple Mersenne Twisters executed in parallel
1237 threads with distinct initial states produce uncorrelated pseudo-random number
1238 sequences, however, algorithm of dynamic creation of Mersenne
1239 Twisters\nbsp{}cite:matsumoto1998dynamic may be used to provide such a
1240 guarantee. The essence of the algorithm is to find matrices of initial generator
1241 states, that give maximally uncorrelated pseudo-random number sequences when
1242 Mersenne Twisters are executed in parallel with these initial states. Since
1243 finding such initial states consumes considerable amount of processor time,
1244 vector of initial states is created preliminary with knowingly larger number of
1245 parallel threads and saved to a file, which is then read before starting white
1246 noise generation.
1247
1248 ** Wavy surface generation
1249 In ARMA model value of wavy surface elevation at a particular point depends on
1250 previous in space and time points, as a result the so called /ramp-up interval/
1251 (see fig.\nbsp{}[[fig-ramp-up-interval]]), in which realisation does not correspond to
1252 specified ACF, forms at the beginning of the realisation. There are several
1253 solutions to this problem which depend on the simulation context.
1254
1255 If the realisation is used in the context of ship stability simulation without
1256 manoeuvring, ramp-up interval will not affect results of the simulation, because
1257 it is located on the border (too far away from the studied marine object). If
1258 ship stability with manoeuvring is studied, then the interval may be simply
1259 discarded from the realisation (the size of the interval approximately equals
1260 the number of AR coefficients in each dimension). However, this may lead to a
1261 loss of a very large number of points, because discarding is done for each
1262 dimension. Alternative approach is to generate ocean wavy surface on ramp-up
1263 interval with LH model and generate the rest of the realisation with ARMA model.
1264
1265 Algorithm of wavy surface generation is data-parallel: the realisation is
1266 divided into equal parts along the time axis each of which is generated
1267 independently, however, in the beginning of each realisation there is ramp-up
1268 interval. To eliminate it for MA process, /overlap-add/
1269 method\nbsp{}cite:oppenheim1989discrete,svoboda2011efficient,pavel2013algorithms
1270 (a popular method in signal processing) is used. The essence of the method is to
1271 add another interval, size of which is equal to the ramp-up interval size, to
1272 the end of each part. Then wavy surface is generated in each point of each part
1273 (including points from the added interval), the interval at the end of part
1274 $$N$$ is superimposed on the ramp-up interval at the beginning of the part
1275 $$N+1$$, and values in corresponding points are added. To eliminate the ramp-up
1276 interval for AR process, the realisation is divided into part along each
1277 dimension, and each part is computed only when all dependent parts are ready.
1278 For that purpose, an array of current part states is maintained in the
1279 programme, and all the parts are put into a queue. A parallel thread acquires a
1280 shared lock, finds the first part in the queue, for which all dependent parts
1281 have "completed" state, removes this part for the queue, frees the lock and
1282 generates the part. After that a thread updates the state of the part, and
1283 repeats the same steps until the queue becomes empty. This algorithm eliminates
1284 all ramp-up intervals except the one at the beginning of the realisation, and
1285 the size of the parts should be sufficiently small to balance the load on all
1286 processor cores.
1287
1288 #+name: fig-ramp-up-interval
1289 #+header: :width 4.5 :height 5
1290 #+begin_src R :file build/ramp-up-interval.eps
1291 source(file.path("R", "common.R"))
1292 par(mar=c(0,1,0,0))
1293 arma.plot_ramp_up_interval()
1294 #+end_src
1295
1296 #+caption: Ramp-up interval at the beginning of the $$OX$$ axis of the realisation.
1297 #+label: fig-ramp-up-interval
1298 #+RESULTS: fig-ramp-up-interval
1299 [[file:build/ramp-up-interval.eps]]
1300
1301 ** Velocity potential computation
1302 :PROPERTIES:
1303 :CUSTOM_ID: sec:compute-delta
1304 :END:
1305
1306 In solutions eqref:eq-solution-2d and eqref:eq-solution-2d-full to
1307 two-dimensional problem there are functions
1308 $$\Fun{z}=\InverseFourierY{e^{2\pi{u}{z}}}{x}$$ and
1309 $$\FunSecond{z}=\InverseFourierY{\Sinh{2\pi{u}{z}}}{x}$$ which has multiple
1310 analytic representations and are difficult to compute. Each function is a
1311 Fourier transform of linear combination of exponents which reduces to poorly
1312 defined Dirac delta function of a complex argument (see
1313 table\nbsp{}[[tab-delta-functions]]). The usual way of handling this type of
1314 functions is to write them as multiplication of Dirac delta functions of real
1315 and imaginary part, however, this approach does not work here, because applying
1316 inverse Fourier transform to this representation does not produce exponent,
1317 which severely warp resulting velocity field. In order to get unique analytic
1318 definition, normalisation factor $$1/\Sinh{2\pi{u}{h}}$$ (which is also included
1319 in the formula for $$E(u)$$) may be used. Despite the fact that normalisation
1320 allows to obtain adequate velocity potential field, numerical experiments show
1321 that there is little difference between this field and the one produced by
1322 formulae, in which terms with $$\zeta$$ are omitted. As a result, we do not use
1323 normalisation factors in the formula.
1324
1325 #+name: tab-delta-functions
1326 #+caption: Formulae for computing $$\Fun{z}$$ and $$\FunSecond{z}$$ from [[#sec:pressure-2d]], that use normalisation to eliminate uncertainty from definition of Dirac delta function of complex argument.
1327 #+attr_latex: :booktabs t
1328 | Function          | Without normalisation                                        | Normalised                                                                                                                             |
1329 |-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------|
1330 | $$\Fun{z}$$       | $$\delta (x+i z)$$                                           | $$\frac{1}{2 h}\mathrm{sech}\left(\frac{\pi (x-i (h+z))}{2 h}\right)$$                                                                |
1331 | $$\FunSecond{z}$$ | $$\frac{1}{2}\left[\delta (x-i z) + \delta (x+i z) \right]$$ | $$\frac{1}{4 h}\left[\text{sech}\left(\frac{\pi (x-i (h+z))}{2 h}\right)+\text{sech}\left(\frac{\pi (x+i(h+z))}{2 h}\right)\right]$$ |
1332
1333 ** Evaluation
1334 ARMA model does not require highly optimised software implementation to be
1335 efficient, its performance is high even without use of co-processors; there are
1336 two main causes of that. First, ARMA model itself does not use transcendental
1337 functions (sines, cosines and exponents) as opposed to LH model. All
1338 calculations, except model coefficients, are done via polynomials, which can be
1339 efficiently computed on modern processors using a series of fused multiply-add
1340 (FMA) instructions. Second, pressure computation is done via explicit analytic
1341 formula using nested FFTs. Since two-dimensional FFT of the same size is
1342 repeatedly applied to every time slice, its coefficients (complex exponents) are
1343 pre-computed for all slices, and computations are performed with only a few
1344 transcendental functions. In case of MA model, performance is also increased by
1345 doing convolution with FFT. So, high performance of ARMA model is due to scarce
1346 use of transcendental functions and heavy use of FFT, not to mention that high
1347 convergence rate and non-existence of periodicity allows to use far fewer
1348 coefficients compared to LH model.
1349
1350 ARMA implementation uses several libraries of reusable mathematical functions
1351 and numerical algorithms (listed in table\nbsp{}[[tab-arma-libs]]), and was
1352 implemented using OpenMP and OpenCL parallel programming technologies, that
1353 allow to use the most efficient implementation for a particular algorithm.
1354
1355 #+name: tab-arma-libs
1356 #+caption: A list of mathematical libraries used in ARMA model implementation.
1357 #+attr_latex: :booktabs t
1358 | Library                                                      | What it is used for             |
1359 |--------------------------------------------------------------+---------------------------------|
1360 | DCMT\nbsp{}cite:matsumoto1998dynamic                         | parallel PRNG                   |
1361 | Blitz\nbsp{}cite:veldhuizen1997will,veldhuizen2000techniques | multidimensional arrays         |
1362 | GSL\nbsp{}cite:galassi2015gnu                                | PDF, CDF, FFT computation       |
1363 |                                                              | checking process stationarity   |
1364 | LAPACK, GotoBLAS\nbsp{}cite:goto2008high,goto2008anatomy     | finding AR coefficients         |
1365 | GL, GLUT\nbsp{}cite:kilgard1996opengl                        | three-dimensional visualisation |
1366 | CGAL\nbsp{}cite:fabri2009cgal                                | wave numbers triangulation      |
1367
1368 For the purpose of evaluation we use simplified version of\nbsp{}eqref:eq-phi-high-amp:
1369 \begin{align}
1370     \label{eq-phi-linear}
1371     \phi(x,y,z,t) &= \InverseFourierY{
1372         \frac{ \Sinh{\smash{2\pi \Kveclen (z+h)}} }
1373         { 2\pi\Kveclen \Sinh{\smash{2\pi \Kveclen h}} }
1374         \FourierY{ \zeta_t }{u,v}
1375     }{x,y}\nonumber \\
1376     &= \InverseFourierY{ g_1(u,v) \FourierY{ g_2(x,y) }{u,v} }{x,y}.
1377 \end{align}
1378 This formula is particularly suitable for computation on GPUs:
1379 - it contains transcendental mathematical functions (hyperbolic cosines and
1380   complex exponents);
1381 - it is computed over large four-dimensional ($t$, $x$, $y$, $z$) region;
1382 - it is analytic with no information dependencies between individual data points
1383   in $t$ and $z$ dimensions.
1384 Since standing sea wave generator does not allow efficient GPU implementation
1385 due to autoregressive dependencies between wavy surface points, only velocity
1386 potential solver was rewritten in OpenCL and its performance was compared to
1387 existing OpenMP implementation.
1388
1389 For each implementation the overall performance of the solver for a particular
1390 time instant was measured. Velocity field was computed for one $t$ point, for
1391 128 $z$ points below wavy surface and for each $x$ and $y$ point of
1392 four-dimensional $(t,x,y,z)$ grid. The only parameter that was varied between
1393 subsequent programme runs is the size of the grid along $x$ dimension. A total
1394 of 10 runs were performed and an average time of each stage was computed.
1395
1396 A different FFT library was used for each version of the solver. For OpenMP
1397 version FFT routines from GNU Scientific Library (GSL)\nbsp{}cite:galassi2015gnu
1398 were used, and for OpenCL version clFFT library\nbsp{}cite:clfft was used instead.
1399 There are two major differences in the routines from these libraries.
1400 - The order of frequencies in Fourier transforms is different and clFFT library
1401   requires reordering the result of\nbsp{}eqref:eq-phi-linear whereas GSL does not.
1402 - Discontinuity at $(x,y) = (0,0)$ of velocity potential field grid is handled
1403   automatically by clFFT library, whereas GSL library produce skewed values at
1404   this point.
1405
1406 For GSL library an additional interpolation from neighbouring points was used to
1407 smooth velocity potential field at these points. We have not spotted other
1408 differences in FFT implementations that have impact on the overall performance.
1409
1410 In the course of the numerical experiments we have measured how much time each
1411 solver's implementation spends in each computation stage to explain how
1412 efficient data copying between host and device is in OpenCL implementation, and
1413 how one implementation corresponds to the other in terms of performance.
1414
1415 ** Results
1416 The experiments showed that GPU implementation outperforms CPU implementation by
1417 a factor of 10--15 (fig.\nbsp{}[[fig-bench-cpu-gpu]]), however, distribution of time
1418 between computation stages is different for each implementation
1419 (fig.\nbsp{}[[fig-breakdown-cpu-gpu]]). The major time consumer in CPU
1420 implementation is computation of $$g_1$$, whereas in GPU implementation its
1421 running time is comparable to computation of $$g_2$$. GPU computes $$g_1$$ much
1422 faster than CPU due to a large amount of modules for transcendental mathematical
1423 function computation. In both implementations $$g_2$$ is computed on CPU, but for
1424 GPU implementation the result is duplicated for each $$z$$ grid point in order to
1425 perform multiplication of all $$XYZ$$ planes along $$z$$ dimension in single OpenCL
1426 kernel, and, subsequently copied to GPU memory which severely hinders overall
1427 stage performance. Copying the resulting velocity potential field between CPU
1428 and GPU consumes $$\approx{}20\%$$ of velocity potential solver execution time.
1429
1430 #+name: fig-bench-cpu-gpu
1431 #+caption: Performance comparison of CPU (OpenMP) and GPU (OpenCL) versions of velocity potential solver.
1432 [[file:figures/bench-cpu-gpu.pdf]]
1433
1434 #+name: fig-breakdown-cpu-gpu
1435 #+caption: Performance breakdown for GPU (OpenCL) and CPU (OpenMP) versions of velocity potential solver.
1436 [[file:figures/breakdown-cpu-gpu.pdf]]
1437
1438 * Conclusion
1439 Three-dimensional ARMA ocean simulation model coupled with analytic formula for
1440 determining pressures under wavy sea surface is computationally efficient of
1441 performing long-term ship behaviour simulations on the computer. Possible
1442 applications of the approach include studying ship behaviour in storm and
1443 shallow water waves. Its validity was visually and statistically verified in a
1444 number of experiments: distribution of characteristics of waves, produced by
1445 ARMA model, match the ones of real ocean waves, and velocity potential field,
1446 produced by the analytic formula correspond to the one produced by the formula
1447 for small-amplitude waves, and the formula itself reduces to the known one from
1448 linear wave theory.
1449
1450 Numerical experiments showed that wavy surface generation is efficient on CPU as
1451 it involves no transcendental mathematical functions, and velocity potential
1452 field computation is efficient on GPU due to heavy use of Fourier transforms.
1453 The use of dynamically generated Mersenne Twister PRNGs allows to produce
1454 uncorrelated sequences of pseudo-random numbers with no practical limitation on
1455 the realisation period, which in turn allows to perform long simulation sessions
1456 on parallel machines.
1457
1458 The future work is to make ARMA mathematical apparatus and its numerical
1459 implementation a base of virtual testbed for marine objects dynamics studies.
1460
1461 bibliographystyle:styles/spphys.bst
1462 bibliography:refs.bib
1463
1464 * Config                                                           :noexport:
1465 ** Clone data repository
1466 #+begin_src sh :exports none :results verbatim
1467 set -e
1468 dir=build/arma-benchmarks
1469 mkdir -p $dir 1470 if ! test -d "$dir/.git"
1471 then
1472     git clone https://github.com/igankevich/arma-benchmarks $dir 1473 fi 1474 cd$dir
1475 git checkout master
1476 git pull
1477 git checkout 730ef79a9f49df9c0c22d0a0f06e4d0b69b5bd99
1478 #+end_src
1479
1480 #+RESULTS:
1481 : Ваша ветка обновлена в соответствии с «origin/master».
1482 : Already up-to-date.
1483 ** Configure org export
1484 #+begin_src elisp
1485 (add-to-list 'org-babel-R-graphics-devices '(:eps "cairo_ps" "file"))
1486 #+end_src
1487
1488 #+RESULTS:
1489 | :eps        | cairo_ps   | file     |
1490 | :bmp        | bmp        | filename |
1491 | :jpg        | jpeg       | filename |
1492 | :jpeg       | jpeg       | filename |
1493 | :tikz       | tikz       | file     |
1494 | :tiff       | tiff       | filename |
1495 | :png        | png        | filename |
1496 | :svg        | svg        | file     |
1497 | :pdf        | pdf        | file     |
1498 | :ps         | postscript | file     |
1499 | :postscript | postscript | file     |