waves-16-arma

Simulation of Standing and Propagating Sea Waves with Three-Dimensional ARMA Model
git clone https://git.igankevich.com/waves-16-arma.git
Log | Files | Refs

arma.org (73047B)


      1 # Local Variables:
      2 # org-ref-default-bibliography ("refs.bib")
      3 # org-latex-image-default-width nil
      4 # org-latex-caption-above nil
      5 # org-latex-hyperref-template "\\hypersetup{\n pdfauthor={%a},\n pdftitle={%t},\n pdfkeywords={%k},\n pdfsubject={%d},\n pdfcreator={%c},\n pdflang={%L},\n unicode={true}\n}\n\\setdefaultlanguage{%l}\n"
      6 # org-export-latex-tables-hline "\\midrule"
      7 # org-export-latex-tables-tstart "\\toprule"
      8 # org-export-latex-tables-tend "\\bottomrule"
      9 # End:
     10 
     11 #+TITLE: Simulation of standing and propagating sea waves with three-dimensional ARMA model
     12 #+AUTHOR: Ivan Gankevich, Alexander Degtyarev
     13 #+LANGUAGE: en
     14 #+LATEX_CLASS: svmult
     15 #+LATEX_CLASS_OPTIONS: [graybox]
     16 #+LATEX_HEADER_EXTRA: \input{preamble}
     17 #+LATEX_HEADER_EXTRA: \titlerunning{Simulation of sea waves with ARMA model}
     18 #+LATEX_HEADER_EXTRA: \authorrunning{I.\:Gankevich, A.\:Degtyarev}
     19 #+OPTIONS: H:5 todo:nil toc:nil
     20 #+PROPERTY: header-args:R :results graphics :exports results :family Times
     21 
     22 #+begin_export latex
     23 \abstract*{%
     24 Simulation of sea waves is a problem appearing in the framework of developing
     25 software-based ship motion modelling applications. These applications generally
     26 use linear wave theory to generate small-amplitude waves programmatically and
     27 determine impact of external excitations on a ship hull. Using linear wave
     28 theory is feasible for ocean waves, but is not accurate for shallow-water and
     29 storm waves. To cope with these shortcomings we introduce autoregressive
     30 moving-average (ARMA) model, which is widely known in oceanography, but rarely
     31 used for sea wave modelling. The new model allows to generate waves of arbitrary
     32 amplitudes, is accurate for both shallow and deep water, and its software
     33 implementation shows superior performance by relying on fast Fourier transform
     34 family of algorithms. Integral characteristics of wavy surface produced by ARMA
     35 model are verified against the ones of real sea surface. Despite all its
     36 advantages, ARMA model requires a new method to determine wave pressures, an
     37 instance of which is included in the chapter.%
     38 }
     39 #+end_export
     40 
     41 * Introduction
     42 Studying behaviour of a ship at sea is often based on some model of external
     43 excitations\nbsp{}--- any disturbance that displaces the vessel from
     44 equilibrium\nbsp{}--- major component of which is wind waves. Currently, the
     45 most popular sea wave simulation models are based on the linear expansion of a
     46 stochastic moving surface as a system of independent random variables. Such
     47 models were studied by St. Denis & Pearson\nbsp{}cite:st1953motions,
     48 Rosenblatt\nbsp{}cite:rosenblatt1956random,
     49 Sveshnikov\nbsp{}cite:sveshnikov1959determination, and
     50 Longuet---Higgins\nbsp{}cite:longuet1957statistical. The most popular model is
     51 that of Longuet---Higgins (LH), which approximates propagating sea waves as a
     52 superposition of elementary harmonic waves with random phases \(\epsilon_n\) and
     53 random amplitudes \(c_n\):
     54 \begin{equation}
     55   \label{eq-lhmodel}
     56   \zeta(x,y,t) = \sum\limits_n c_n \cos(u_n x + v_n y - \omega_n t + \epsilon_n),
     57 \end{equation}
     58 where the wave number \((u_n,v_n)\) is continuously distributed on the \((u,v)\)
     59 plane, i.e. the unit area \(du \times dv\) contains an infinite number of wave
     60 numbers. The frequency \(\omega_n\) associated with wave numbers \((u_n,v_n)\)
     61 is given by a dispersion relation
     62 \begin{equation*}
     63   \omega_n = \omega(u_n,v_n).
     64 \end{equation*}
     65 The phase \(\epsilon_n\) are jointly independent random variables uniformly
     66 distributed in the interval \([0,2\pi]\).
     67 
     68 Longuet---Higgins showed that under the above conditions, the function
     69 \(\zeta(x,y,t)\) is a three-dimensional steady-state homogeneous ergodic
     70 Gaussian field, defined by
     71 \begin{equation*}
     72   2 E_{\zeta}(u,v) du dv = \sum\limits_{n} c_n^2,
     73 \end{equation*}
     74 where \(E_{\zeta}(u,v)\) is two-dimensional spectral density of wave energy.
     75 
     76 Formula\nbsp{}eqref:eq-lhmodel is derived from equation of continuity and
     77 equation of motion for incompressible inviscid fluid. For ocean waves
     78 incompressibility and isotropy of a fluid is assumed; since the motion of ocean
     79 waves is due to gravitational forces, irrotational motion of the fluid is
     80 assumed which let us introduce the velocity potential \(\phi\). Under these
     81 assumptions the equation of continuity reduces to Laplace equation:
     82 \begin{equation*}
     83   \label{eq-continuity}
     84   \Delta\phi =
     85   \frac{\partial^2{\phi_x}}{\partial{x^2}} +
     86   \frac{\partial^2{\phi_y}}{\partial{y^2}} +
     87   \frac{\partial^2{\phi_z}}{\partial{z^2}} = 0.
     88 \end{equation*}
     89 The Laplace equation is linear and its solution can be found using Fourier
     90 transforms. Thus, for plane waves a well-known solution is given in the form of
     91 a definite integral\nbsp{}cite:kochin1966theoretical:
     92 \begin{equation*}
     93   \phi(x,z,t) =
     94   \int\limits_{0}^{\infty}
     95   e^{kz} \left[ A(k,t) \cos{kx} + B(k,t) \sin{kx} \right] dk.
     96 \end{equation*}
     97 
     98 A similar, but slightly more complicated solution is obtained for the
     99 three-dimensional case. The constants \(A\) and \(B\) are determined from the boundary
    100 conditions on the surface. In the linear formulation the equation of the wave
    101 profile (which is derived from linearised kinematic boundary condition and
    102 equation of motion, see sec.\nbsp{}[[#sec-pressures]]) is
    103 \begin{align}
    104   \zeta(x,t) &= -\frac{1}{g} \frac{\partial{\phi(x,0,t)}}{\partial{t}} \label{eq-integ-zeta}\\
    105              &= \int\limits_{0}^{\infty}
    106                 \left[ \frac{\partial{A(k,t)}}{\partial{t}} \cos{kx} + \frac{\partial{B(k,t)}}{\partial{t}} \sin{kx} \right] dk \nonumber \\
    107              &= \int\limits_{0}^{\infty} C_t(k,t) \cos\left(kx + \epsilon(k,t)\right).\nonumber
    108 \end{align}
    109 If we set \(c_n = C_t(k_n, t) dk\), then wave model\nbsp{}eqref:eq-lhmodel may
    110 be associated with an approximation of integral\nbsp{}eqref:eq-integ-zeta.
    111 
    112 Although, LH model is based on simple linear wave theory and has straightforward
    113 computational algorithm, it has some serious shortcomings.
    114 - LH model is designed to represent a stationary Gaussian field. Normal
    115   distribution of the simulated process\nbsp{}eqref:eq-lhmodel is a consequence
    116   of the central limit theorem: its application to the analysis of
    117   storm or shallow water waves represents a significant challenge.
    118 - LH model is periodic and need a large set of frequencies to perform long-term
    119   simulation.
    120 - In the numerical implementation of the LH model, it appears that convergence
    121   rate of\nbsp{}eqref:eq-lhmodel is slow. This leads to a skewed simulated wave
    122   energy spectrum and skewed cumulative distribution functions of various wave
    123   parameters (heights, lengths, etc.). This problem becomes especially
    124   significant when simulating complex sea waves that have a wide spectrum with
    125   multiple peaks.
    126 
    127 The latter point becomes particularly critical in long-term numerical
    128 simulation. In a time domain computation of the responses of a vessel in a
    129 random seaway, the repeated evaluation of the apparently simple
    130 eq.\nbsp{}eqref:eq-lhmodel at hundreds of points on the hull for thousands of
    131 time steps becomes a major factor determining the execution speed of the
    132 code\nbsp{}cite:beck2001modern. So, finding a less computationally intensive
    133 method for modelling ocean waves has the potential to increase performance of
    134 long-term simulation.
    135 
    136 * Related work
    137 ** Ocean wave modelling
    138 Another approach to simulating sea waves involves representing stochastic moving
    139 surface as a linear transformation of white noise with memory, which allows to
    140 model stationary ergodic Gaussian random process with given correlation
    141 characteristics\nbsp{}cite:box1976time. The first attempts to model
    142 two-dimensional disturbances were undertaken
    143 in\nbsp{}cite:kostecki1972stochastic, which resulted in the development of the
    144 resonance theory of wind waves, and the formal mathematical framework was
    145 developed in\nbsp{}cite:rozhkov1990,gurgenidze1988 --- the authors built a
    146 one-dimensional model of ocean waves based on autoregressive-moving
    147 average (ARMA) model.
    148 
    149 One-dimensional ARMA model does not have some of the LH model deficiencies: it
    150 is both computationally efficient and requires less number of coefficients to
    151 converge. In\nbsp{}cite:spanos1982arma ARMA model is used to generate time
    152 series spectrum of which is compatible with Pierson---Moskowitz (PM)
    153 approximation of ocean wave spectrum. The authors carry out experiments for
    154 one-dimensional AR, MA and ARMA models. They mention excellent agreement between
    155 target and initial spectra and higher performance of ARMA model compared to
    156 models based on summing large number of harmonic components with random phases.
    157 They also mention that in order to reach agreement between target and initial
    158 spectrum MA model require lesser number of coefficients than AR model.
    159 In\nbsp{}cite:spanos1996efficient the authors generalise ARMA model coefficients
    160 determination formulae for multi-variate case.
    161 
    162 AR model was successfully applied to predict evolution of propagating wave
    163 profiles based on instantaneous wave recordings. In\nbsp{}cite:fusco2010short AR
    164 model is used to predict swell waves to control wave-energy converters (WEC) in
    165 real-time. In order to make WEC more efficient its internal oscillator frequency
    166 should match the one of ocean waves. The authors treat wave elevation as time
    167 series and compare performance of AR model, neural networks and cyclical models
    168 in forecasting time series future values. AR model gives the most accurate
    169 prediction of low-frequency swell waves for up to two typical wave periods. It
    170 is an example of successful application of AR process to ocean wave modelling.
    171 
    172 The feature that distinguishes present work with respect to afore-mentioned ones
    173 is the study of three-dimensional (2D in space and 1D in time) ARMA model, which
    174 is mostly a different problem.
    175 1. Yule---Walker system of equations, which are used to determine AR
    176    coefficients, has complex block-block structure.
    177 2. Optimal model order (in a sense that target spectrum agrees with initial) is
    178    determined manually.
    179 3. Instead of PM spectrum, analytic formulae for standing and propagating
    180    waves ACF are used as the model input.
    181 4. Three-dimensional wavy surface should be compatible with real ocean surface
    182    not only in terms of spectral characteristics, but also in the shape of wave
    183    profiles. So, model verification includes distributions of various parameters
    184    of generated waves (lengths, heights, periods etc.).
    185 Multi-dimensionality of investigated model not only complexifies the task, but
    186 also allows to carry out visual validation of generated wavy surface. It is the
    187 opportunity to visualise output of the programme that allowed to ensure that
    188 generated surface is compatible with real ocean surface, and is not abstract
    189 multi-dimensional stochastic process that is real only statistically.
    190 
    191 ** Pressure field determination formulae
    192 **** Small amplitude waves theory
    193 In\nbsp{}cite:stab2012,degtyarev1998modelling,degtyarev1997analysis the authors
    194 propose a solution for inverse problem of hydrodynamics of potential flow within
    195 the framework of small-amplitude wave theory (under assumption that wave length
    196 is much larger than height: \(\lambda{}\gg{}h\)). In that case inverse problem
    197 is linear and reduces to Laplace equation with mixed boundary conditions, and
    198 equation of motion is solely used to determine pressures for calculated velocity
    199 potential derivatives. The assumption of small amplitudes means the slow decay
    200 of wind wave coherence function, i.e.\nbsp{}small change of local wave number in
    201 time and space compared to the wavy surface elevation (\(z\) coordinate). This
    202 assumption allows to calculate elevation \(z\) derivative as \(\zeta_z=k\zeta\),
    203 where \(k\) is wave number. In two-dimensional case the solution is written
    204 explicitly as
    205 \begin{align}
    206     \left.\frac{\partial\phi}{\partial x}\right|_{x,t}= &
    207         -\frac{1}{\sqrt{1+\alpha^{2}}}e^{-I(x)}
    208             \int\limits_{0}^x\frac{\partial\dot{\zeta}/\partial
    209                 z+\alpha\dot{\alpha}}{\sqrt{1+\alpha^{2}}}e^{I(x)}dx,\label{eq-old-sol-2d}\\
    210     I(x)= & \int\limits_{0}^x\frac{\partial\alpha/\partial z}{1+\alpha^{2}}dx,\nonumber
    211 \end{align}
    212 where \(\alpha\) is wave slope. In three-dimensional case solution is written in
    213 the form of elliptic partial differential equation (PDE):
    214 \begin{align*}
    215     & \frac{\partial^2 \phi}{\partial x^2} \left( 1 + \alpha_x^2 \right) +
    216     \frac{\partial^2 \phi}{\partial y^2} \left( 1 + \alpha_y^2 \right) +
    217     2\alpha_x\alpha_y \frac{\partial^2 \phi}{\partial x \partial y} + \\
    218     & \left(
    219         \frac{\partial \alpha_x}{\partial z} +
    220         \alpha_x \frac{\partial \alpha_x}{\partial x} +
    221         \alpha_y \frac{\partial \alpha_x}{\partial y}
    222     \right) \frac{\partial \phi}{\partial x} + \\
    223     & \left(
    224         \frac{\partial \alpha_y}{\partial z} +
    225         \alpha_x \frac{\partial \alpha_y}{\partial x} +
    226         \alpha_y \frac{\partial \alpha_y}{\partial y}
    227     \right) \frac{\partial \phi}{\partial y} + \\
    228     & \frac{\partial \dot{\zeta}}{\partial z} +
    229     \alpha_x \dot{\alpha_x} + \alpha_y \dot{\alpha_y} = 0.
    230 \end{align*}
    231 The authors suggest transforming this equation to finite differences and solve
    232 it numerically.
    233 
    234 As will be shown in sec.\nbsp{}[[#sec:compare-formulae]]
    235 that\nbsp{}eqref:eq-old-sol-2d diverges when attempted to calculate velocity
    236 field for large amplitude waves, and this is the reason that it can not be used
    237 together with ARMA model, that generates arbitrary amplitude waves.
    238 
    239 **** Linearisation of boundary condition
    240 :PROPERTIES:
    241 :CUSTOM_ID: linearisation
    242 :END:
    243 
    244 LH model allows to derive an explicit formula for velocity field by linearising
    245 kinematic boundary condition. Velocity potential formula is written as
    246 \begin{equation*}
    247 \phi(x,y,z,t) = \sum_n \frac{c_n g}{\omega_n}
    248      e^{\sqrt{u_n^2+v_n^2} z}
    249      \sin(u_n x + v_n y - \omega_n t + \epsilon_n).
    250 \end{equation*}
    251 This formula is differentiated to obtain velocity potential derivatives, which
    252 are plugged to dynamic boundary condition to obtain pressures.
    253 
    254 * Three-dimensional ARMA process as a sea wave simulation model
    255 
    256 ARMA ocean simulation model defines wavy surface as three-dimensional (two
    257 dimensions in space and one in time) autoregressive moving average process:
    258 every surface point is represented as a weighted sum of previous in time and
    259 space points plus weighted sum of previous in time and space normally
    260 distributed random impulses. The governing equation for 3-D ARMA process is
    261 \begin{equation}
    262     \zeta_{\vec i}
    263     =
    264     \sum\limits_{\vec j = \vec 0}^{\vec N}
    265     \Phi_{\vec j} \zeta_{\vec i - \vec j}
    266     +
    267     \sum\limits_{\vec j = \vec 0}^{\vec M}
    268     \Theta_{\vec j} \epsilon_{\vec i - \vec j}
    269     ,
    270     \label{eq-arma-process}
    271 \end{equation}
    272 where \(\zeta\)\nbsp{}--- wave elevation, \(\Phi\)\nbsp{}--- AR process
    273 coefficients, \(\Theta\)\nbsp{}--- MA process coefficients,
    274 \(\epsilon\)\nbsp{}--- white noise with Gaussian distribution,
    275 \(\vec{N}\)\nbsp{}--- AR process order, \(\vec{M}\)\nbsp{}--- MA process order,
    276 and \(\Phi_{\vec{0}}\equiv{0}\), \(\Theta_{\vec{0}}\equiv{0}\). Here arrows
    277 denote multi-component indices with a component for each dimension. In general,
    278 any scalar quantity can be a component (temperature, salinity, concentration of
    279 some substance in water etc.). Equation parameters are AR and MA process
    280 coefficients and order.
    281 
    282 Any ARMA process can be uniquely represented as either MA or AR process of
    283 infinite order\nbsp{}cite:gurgenidze1988, and the parameters of the spectral
    284 representation are defined by the rule of division of power series (in a
    285 rational factorized form\nbsp{}cite:rozhkov1990):
    286 \begin{equation*}
    287   S(\omega) =
    288   \frac{\Delta\sigma^2}{\pi}
    289   \frac{\prod\limits_m (1 - z_m e^{-im\omega\Delta})(1 - z_m e^{im\omega\Delta})}
    290        {\prod\limits_n (1 - p_n e^{-in\omega\Delta})(1 - p_n e^{in\omega\Delta})},
    291 \end{equation*}
    292 where \(z_m\) and \(p_n\) are the zeros of numerator (MA), and denominator (AR),
    293 respectively, which form a pair of mutually conjugate numbers. If some of the
    294 zeros are located near the unit circle, then the spectral density will have
    295 pronounced dips.
    296 
    297 ** Autoregressive (AR) process
    298 AR process is ARMA process with only one random impulse instead of their
    299 weighted sum:
    300 \begin{equation}
    301     \zeta_{\vec i}
    302     =
    303     \sum\limits_{\vec j = \vec 0}^{\vec N}
    304     \Phi_{\vec j} \zeta_{\vec i - \vec j}
    305     +
    306     \epsilon_{i,j,k}
    307     .
    308     \label{eq-ar-process}
    309 \end{equation}
    310 The coefficients \(\Phi\) are calculated from auto-covariate function (ACF) via
    311 three-dimensional Yule---Walker (YW) equations, which are obtained after
    312 multiplying both parts of the previous equation by \(\zeta_{\vec{i}-\vec{k}}\)
    313 and computing the expected value. Generic form of YW equations is
    314 \begin{equation}
    315     \label{eq-yule-walker}
    316     \gamma_{\vec k}
    317     =
    318     \sum\limits_{\vec j = \vec 0}^{\vec N}
    319     \Phi_{\vec j}
    320     \text{ }\gamma_{\vec{k}-\vec{j}}
    321     +
    322     \Var{\epsilon} \delta_{\vec{k}},
    323     \qquad
    324     \delta_{\vec{k}} =
    325     \begin{cases}
    326         1, \quad \text{if } \vec{k}=0 \\
    327         0, \quad \text{if } \vec{k}\neq0,
    328     \end{cases}
    329 \end{equation}
    330 where \(\gamma\)\nbsp{}--- ACF of process \(\zeta\), \(\Var{\epsilon}\)\nbsp{}--- white noise
    331 variance. Matrix form of three-dimensional YW equations, which is used in the
    332 present work, is
    333 \begin{equation*}
    334     \Gamma
    335     \left[
    336         \begin{array}{l}
    337             \Phi_{\vec 0}\\
    338             \Phi_{0,0,1}\\
    339             \vdotswithin{\Phi_{\vec 0}}\\
    340             \Phi_{\vec N}
    341         \end{array}
    342     \right]
    343     =
    344     \left[
    345         \begin{array}{l}
    346             \gamma_{0,0,0}-\Var{\epsilon}\\
    347             \gamma_{0,0,1}\\
    348             \vdotswithin{\gamma_{\vec 0}}\\
    349             \gamma_{\vec N}
    350         \end{array}
    351     \right],
    352     \qquad
    353     \Gamma=
    354     \left[
    355         \begin{array}{llll}
    356             \Gamma_0 & \Gamma_1 & \cdots & \Gamma_{N_1} \\
    357             \Gamma_1 & \Gamma_0 & \ddots & \vdotswithin{\Gamma_0} \\
    358             \vdotswithin{\Gamma_0} & \ddots & \ddots & \Gamma_1 \\
    359             \Gamma_{N_1} & \cdots & \Gamma_1 & \Gamma_0
    360         \end{array}
    361     \right],
    362 \end{equation*}
    363 where \(\vec N = \left( p_1, p_2, p_3 \right)\) and
    364 \begin{equation*}
    365     \Gamma_i =
    366     \left[
    367     \begin{array}{llll}
    368         \Gamma^0_i & \Gamma^1_i & \cdots & \Gamma^{N_2}_i \\
    369         \Gamma^1_i & \Gamma^0_i & \ddots & \vdotswithin{\Gamma^0_i} \\
    370         \vdotswithin{\Gamma^0_i} & \ddots & \ddots & \Gamma^1_i \\
    371         \Gamma^{N_2}_i & \cdots & \Gamma^1_i & \Gamma^0_i
    372     \end{array}
    373     \right]
    374     \qquad
    375     \Gamma_i^j=
    376     \left[
    377     \begin{array}{llll}
    378         \gamma_{i,j,0} & \gamma_{i,j,1} & \cdots & \gamma_{i,j,N_3} \\
    379         \gamma_{i,j,1} & \gamma_{i,j,0} & \ddots &x \vdotswithin{\gamma_{i,j,0}} \\
    380         \vdotswithin{\gamma_{i,j,0}} & \ddots & \ddots & \gamma_{i,j,1} \\
    381         \gamma_{i,j,N_3} & \cdots & \gamma_{i,j,1} & \gamma_{i,j,0}
    382     \end{array}
    383     \right],
    384 \end{equation*}
    385 Since \(\Phi_{\vec 0}\equiv0\), the first row and column of \(\Gamma\) can be
    386 eliminated. Matrix \(\Gamma\) is block-toeplitz, positive definite and symmetric,
    387 hence the system is efficiently solved by Cholesky decomposition, which is
    388 particularly suitable for these types of matrices.
    389 
    390 After solving this system of equations white noise variance is estimated
    391 from\nbsp{}eqref:eq-yule-walker by plugging \(\vec k = \vec 0\):
    392 \begin{equation*}
    393     \Var{\epsilon} =
    394     \Var{\zeta}
    395     -
    396     \sum\limits_{\vec j = \vec 0}^{\vec N}
    397     \Phi_{\vec j}
    398     \text{ }\gamma_{\vec{j}}.
    399 \end{equation*}
    400 
    401 ** Moving average (MA) process
    402 MA process is ARMA process with \(\Phi\equiv0\):
    403 \begin{equation}
    404     \zeta_{\vec i}
    405     =
    406     \sum\limits_{\vec j = \vec 0}^{\vec M}
    407     \Theta_{\vec j} \epsilon_{\vec i - \vec j}
    408     .
    409     \label{eq-ma-process}
    410 \end{equation}
    411 MA coefficients \(\Theta\) are defined implicitly via the following non-linear
    412 system of equations:
    413 \begin{equation*}
    414   \gamma_{\vec i} =
    415   \left[
    416     \displaystyle
    417     \sum\limits_{\vec j = \vec i}^{\vec M}
    418     \Theta_{\vec j}\Theta_{\vec j - \vec i}
    419   \right]
    420   \Var{\epsilon}.
    421 \end{equation*}
    422 The system is solved numerically by fixed-point iteration method via the
    423 following formulae
    424 \begin{equation*}
    425   \Theta_{\vec i} =
    426     -\frac{\gamma_{\vec 0}}{\Var{\epsilon}}
    427     +
    428     \sum\limits_{\vec j = \vec i}^{\vec M}
    429     \Theta_{\vec j} \Theta_{\vec j - \vec i}.
    430 \end{equation*}
    431 Here coefficients \(\Theta\) are calculated from back to front: from
    432 \(\vec{i}=\vec{M}\) to \(\vec{i}=\vec{0}\). White noise variance is estimated by
    433 \begin{equation*}
    434     \Var{\epsilon} = \frac{\gamma_{\vec 0}}{
    435     1
    436     +
    437     \sum\limits_{\vec j = \vec 0}^{\vec M}
    438     \Theta_{\vec j}^2
    439     }.
    440 \end{equation*}
    441 Authors of\nbsp{}cite:box1976time suggest using Newton---Raphson method to solve this
    442 equation with higher precision, however, this method does not work in three
    443 dimensions. Using slower method does not have dramatic effect on the overall
    444 programme performance, because the number of coefficients is small and most of
    445 the time is spent generating wavy surface.
    446 
    447 ** Mixed autoregressive moving average (ARMA) process
    448 :PROPERTIES:
    449 :CUSTOM_ID: sec:how-to-mix-ARMA
    450 :END:
    451 Generally speaking, ARMA process is obtained by plugging MA generated wavy
    452 surface as random impulse to AR process, however, in order to get the process
    453 with desired ACF one should re-compute AR coefficients before plugging. There
    454 are several approaches to "mix" AR and MA processes.
    455 - The approach proposed in\nbsp{}cite:box1976time which involves dividing ACF into MA
    456   and AR part along each dimension is not applicable here, because in three
    457   dimensions such division is not possible: there always be parts of the ACF
    458   that are not taken into account by AR and MA process.
    459 - The alternative approach is to use the same (undivided) ACF for both AR and MA
    460   processes but use different process order, however, then realisation
    461   characteristics (mean, variance etc.) become skewed: these are characteristics
    462   of the two overlapped processes.
    463 For the first approach there is a formula to re-compute ACF for AR process, but
    464 there is no such formula for the second approach. So, the best solution for now
    465 is to simply use AR and MA process exclusively for different types of waves.
    466 
    467 ** Process selection criteria for different wave profiles
    468 :PROPERTIES:
    469 :CUSTOM_ID: sec-process-selection
    470 :END:
    471 One problem of ARMA model application to ocean wave generation is that for
    472 different types of wave profiles different processes /must/ be used: standing
    473 waves are modelled by AR process, and propagating waves by MA process. This
    474 statement comes from practice: if one tries to use the processes the other way
    475 round, the resulting realisation either diverges or does not correspond to real
    476 ocean waves. So, the best way to apply ARMA model to ocean wave generation is to
    477 use AR process for standing waves and MA process for progressive waves.
    478 
    479 The other problem is inability to automatically determine optimal number of
    480 coefficients for three-dimensional AR and MA processes. For one-dimensional
    481 processes this can be achieved via iterative methods\nbsp{}cite:box1976time, but
    482 they diverge in three-dimensional case.
    483 
    484 The final problem, which is discussed in [[#sec:how-to-mix-ARMA]], is inability to
    485 "mix" AR and MA process in three dimensions.
    486 
    487 In practice some statements made for AR and MA processes in\nbsp{}cite:box1976time
    488 should be flipped for three-dimensional case. For example, the authors say that
    489 ACF of MA process cuts at \(q\) and ACF of AR process decays to nought infinitely,
    490 but in practice making ACF of 3-dimensional MA process not decay results in it
    491 being non-invertible and producing realisation that does not look like real
    492 ocean waves, whereas doing the same for ACF of AR process results in stationary
    493 process and adequate realisation. Also, the authors say that one
    494 should allocate the first \(q\) points of ACF to MA process (as it often needed to
    495 describe the peaks in ACF) and leave the rest points to AR process, but in
    496 practice in case of ACF of a propagating wave AR process is stationary only for
    497 the first time slice of the ACF, and the rest is left to MA process.
    498 
    499 To summarise, the only established scenario of applying ARMA model to ocean wave
    500 generation is to use AR process for standing waves and MA process for
    501 propagating waves. With a new formulae for 3 dimensions a single mixed ARMA
    502 process might increase model precision, which is one of the objectives of the
    503 future research.
    504 
    505 ** The shape of ACF for different types of waves
    506 **** Analytic method of finding the ACF
    507 The straightforward way to find ACF for a given ocean wave profile is to apply
    508 Wiener---Khinchin theorem. According to this theorem the autocorrelation \(K\) of
    509 a function \(\zeta\) is given by the Fourier transform of the absolute square of
    510 the function:
    511 \begin{equation}
    512   K(t) = \Fourier{\left| \zeta(t) \right|^2}.
    513   \label{eq-wiener-khinchin}
    514 \end{equation}
    515 When \(\zeta\) is replaced with actual wave profile, this formula gives you
    516 analytic formula for the corresponding ACF.
    517 
    518 For three-dimensional wave profile (2D in space and 1D in time) analytic formula
    519 is a polynomial of high order and is best obtained via symbolic computation
    520 programme. Then for practical usage it can be approximated by superposition of
    521 exponentially decaying cosines (which is how ACF of a stationary ARMA process
    522 looks like\nbsp{}cite:box1976time).
    523 
    524 **** Empirical method of finding the ACF
    525 However, for three-dimensional case there exists simpler empirical method which
    526 does not require sophisticated software to determine shape of the ACF. It is
    527 known that ACF represented by exponentially decaying cosines satisfies first
    528 order Stokes' equations for gravity waves\nbsp{}cite:boccotti1983wind. So, if the
    529 shape of the wave profile is the only concern in the simulation, then one can
    530 simply multiply it by a decaying exponent to get appropriate ACF. This ACF does
    531 not reflect other wave profile parameters, such as wave height and period, but
    532 opens possibility to simulate waves of a particular non-analytic shape by
    533 "drawing" their profile, then multiplying it by an exponent and using the
    534 resulting function as ACF. So, this empirical method is imprecise but offers
    535 simpler alternative to Wiener---Khinchin theorem approach; it is mainly useful
    536 to test ARMA model.
    537 
    538 **** Standing wave ACF
    539 For three-dimensional plain standing wave the profile is given by
    540 \begin{equation}
    541   \zeta(t, x, y) = A \sin (k_x x + k_y y) \sin (\sigma t).
    542   \label{eq-standing-wave}
    543 \end{equation}
    544 Find ACF via analytic method. Multiplying the formula by a decaying exponent
    545 (because Fourier transform is defined for a function \(f\) that
    546 \(f\underset{x\rightarrow\pm\infty}{\longrightarrow}0\)) yields
    547 \begin{equation}
    548   \zeta(t, x, y) =
    549   A
    550   \exp\left[-\alpha (|t|+|x|+|y|) \right]
    551   \sin (k_x x + k_y y) \sin (\sigma t).
    552   \label{eq-decaying-standing-wave}
    553 \end{equation}
    554 Then, apply 3D Fourier transform to the both sides of the equation via symbolic
    555 computation programme, fit the resulting polynomial to the following
    556 approximation:
    557 \begin{equation}
    558   K(t,x,y) =
    559   \gamma
    560   \exp\left[-\alpha (|t|+|x|+|y|) \right]
    561   \cos \beta t
    562   \cos \left[ \beta x + \beta y \right].
    563   \label{eq-standing-wave-acf}
    564 \end{equation}
    565 So, after applying Wiener---Khinchin theorem we get initial formula but with
    566 cosines instead of sines. This difference is important because the value of ACF
    567 at \((0,0,0)\) equals to the ARMA process variance, and if one used sines the
    568 value would be wrong.
    569 
    570 If one tries to replicate the same formula via empirical method, the usual way
    571 is to adapt\nbsp{}eqref:eq-decaying-standing-wave to
    572 match\nbsp{}eqref:eq-standing-wave-acf. This can be done either by changing the
    573 phase of the sine, or by substituting sine with cosine to move the maximum of
    574 the function to the origin of coordinates.
    575 
    576 **** Propagating wave ACF
    577 Three-dimensional profile of plain propagating wave is given by
    578 \begin{equation}
    579   \zeta(t, x, y) = A \cos (\sigma t + k_x x + k_y y).
    580   \label{eq-propagating-wave}
    581 \end{equation}
    582 For the analytic method repeating steps from the previous two paragraphs yields
    583 \begin{equation}
    584   K(t,x,y) =
    585   \gamma
    586   \exp\left[-\alpha (|t|+|x|+|y|) \right]
    587   \cos\left[\beta (t+x+y) \right].
    588   \label{eq-propagating-wave-acf}
    589 \end{equation}
    590 For the empirical method the wave profile is simply multiplied by a decaying
    591 exponent without need to adapt the maximum value of ACF (as it is required for
    592 standing wave).
    593 
    594 **** Comparison of studied methods
    595 To summarise, the analytic method of finding ocean wave's ACF reduces to the
    596 following steps.
    597 - Make wave profile decay when approaching \(\pm\infty\) by multiplying it by
    598   a decaying exponent.
    599 - Apply Fourier transform to the absolute square of the resulting equation using
    600   symbolic computation programme.
    601 - Fit the resulting polynomial to the appropriate ACF approximation.
    602 
    603 Two examples in this section showed that in case of standing and propagating
    604 waves their decaying profiles resemble the corresponding ACFs with the exception
    605 that the ACF's maximum should be moved to the origin to preserve simulated
    606 process variance. Empirical method of finding ACF reduces to the following
    607 steps.
    608 - Make wave profile decay when approaching \(\pm\infty\) by multiplying it by
    609   a decaying exponent.
    610 - Move maximum value of the resulting function to the origin by using
    611   trigonometric identities to shift the phase.
    612 
    613 ** Evaluation and discussion
    614 In\nbsp{}cite:degtyarev2011modelling,degtyarev2013synoptic,boukhanovsky1997thesis
    615 for AR model the following items were verified experimentally:
    616 - probability distributions of different wave characteristics (wave heights,
    617   lengths, crests, periods, slopes, three-dimensionality),
    618 - dispersion relation,
    619 - retention of integral characteristics for mixed wave sea state.
    620 In this work we repeat probability distribution tests for three-dimensional AR
    621 and MA model.
    622 
    623 In\nbsp{}cite:rozhkov1990 the authors show that several ocean wave
    624 characteristics (listed in table\nbsp{}[[tab-weibull-shape]]) have Weibull
    625 distribution, and wavy surface elevation has Gaussian distribution. In order to
    626 verify that distributions corresponding to generated realisation are correct,
    627 quantile-quantile plots are used (plots where analytic quantile values are used
    628 for \(OX\) axis and estimated quantile values for \(OY\) axis). If the estimated
    629 distribution matches analytic then the graph has the form of the straight line.
    630 Tails of the graph may diverge from the straight line, because they can not be
    631 reliably estimated from the finite-size realisation. Different methods of
    632 extracting waves from realisation produce variations in quantile function tails,
    633 it is probably impractical to extract every possible wave from realisation since
    634 they may (and often) overlap.
    635 
    636 #+name: tab-weibull-shape
    637 #+caption: Values of Weibull shape parameter for different wave characteristics.
    638 #+attr_latex: :booktabs t
    639 | Characteristic       | Weibull shape (\(k\)) |
    640 |----------------------+---------------------|
    641 | Wave height          |                   2 |
    642 | Wave length          |                 2.3 |
    643 | Crest length         |                 2.3 |
    644 | Wave period          |                   3 |
    645 | Wave slope           |                 2.5 |
    646 | Three-dimensionality |                 2.5 |
    647 
    648 Verification was performed for standing and propagating waves. The corresponding
    649 ACFs and quantile-quantile plots of wave characteristics distributions are shown
    650 in
    651 fig.\nbsp{}[[propagating-wave-distributions]],\nbsp{}[[standing-wave-distributions]],\nbsp{}[[acf-slices]].
    652 
    653 #+name: propagating-wave-distributions
    654 #+header: :width 4.5 :height 5
    655 #+begin_src R :file build/propagating-wave-qqplots.eps
    656 source(file.path("R", "common.R"))
    657 par(pty="s", mfrow=c(2, 2), mai = c(1, 0.2, 0.2, 0.2))
    658 arma.qqplot_grid(
    659   file.path("build", "arma-benchmarks", "verification", "propagating_wave"),
    660   c("elevation", "heights_y", "lengths_y", "periods"),
    661   c("elevation", "height Y", "length Y", "period"),
    662   xlab="x",
    663   ylab="y"
    664 )
    665 #+end_src
    666 
    667 #+caption: Quantile-quantile plots for propagating waves.
    668 #+label: propagating-wave-distributions
    669 #+RESULTS: propagating-wave-distributions
    670 [[file:build/propagating-wave-qqplots.eps]]
    671 
    672 #+name: standing-wave-distributions
    673 #+header: :width 4.5 :height 5
    674 #+begin_src R :file build/standing-wave-qqplots.eps
    675 source(file.path("R", "common.R"))
    676 par(pty="s", mfrow=c(2, 2), mai = c(1, 0.2, 0.2, 0.2))
    677 arma.qqplot_grid(
    678   file.path("build", "arma-benchmarks", "verification", "standing_wave"),
    679   c("elevation", "heights_y", "lengths_y", "periods"),
    680   c("elevation", "height Y", "length Y", "period"),
    681   xlab="x",
    682   ylab="y"
    683 )
    684 #+end_src
    685 
    686 #+caption: Quantile-quantile plots for standing waves.
    687 #+label: standing-wave-distributions
    688 #+RESULTS: standing-wave-distributions
    689 [[file:build/standing-wave-qqplots.eps]]
    690 
    691 #+name: acf-slices
    692 #+header: :width 4.5 :height 7
    693 #+begin_src R :file build/acf-slices.eps
    694 source(file.path("R", "common.R"))
    695 propagating_acf <- read.csv(file.path("build", "arma-benchmarks", "verification", "propagating_wave", "acf.csv"))
    696 standing_acf <- read.csv(file.path("build", "arma-benchmarks", "verification", "standing_wave", "acf.csv"))
    697 par(mfrow=c(5, 2), mar=c(0,0,0,0), mai=c(0.1, 0.1, 0.1, 0.1))
    698 for (i in seq(0, 4)) {
    699   arma.wavy_plot(standing_acf, i, zlim=c(-1,1))
    700   arma.wavy_plot(propagating_acf, i, zlim=c(-1,1))
    701 }
    702 #+end_src
    703 
    704 #+caption: Time slices of ACF for standing (left column) and propagating waves (right column).
    705 #+label: acf-slices
    706 #+RESULTS: acf-slices
    707 [[file:build/acf-slices.eps]]
    708 
    709 Graph tails in fig.\nbsp{}[[propagating-wave-distributions]] deviate from original
    710 distribution for individual wave characteristics, because every wave have to be
    711 extracted from the resulting wavy surface to measure its length, period and
    712 height. There is no algorithm that guarantees correct extraction of all waves,
    713 because they may overlap each other. Weibull distribution right tail represents
    714 infrequently occurring waves, so it deviates more than left tail.
    715 
    716 Degree of correspondence for standing waves
    717 (fig.\nbsp{}[[standing-wave-distributions]]) is lower for height and length, is
    718 roughly the same for surface elevation and is higher for wave period
    719 distribution tails. Lower correspondence degree for length and height may be
    720 attributed to the fact that Weibull distributions were obtained empirically for
    721 ocean waves which are typically propagating, and distributions may be different
    722 for standings waves. Higher correspondence degree for wave periods is attributed
    723 to the fact that wave periods of standing waves are extracted more precisely as
    724 the waves do not move outside simulated wavy surface region. The same
    725 correspondence degree for wave elevation is obtained, because this is the
    726 characteristic of the wavy surface (and corresponding AR or MA process) and is
    727 not affected by the type of waves.
    728 
    729 ARMA model, owing to its non-physical nature, does not have the notion of ocean
    730 wave; it simulates wavy surface as a whole instead. Motions of individual waves
    731 and their shape are often rough, and the total number of waves can not be
    732 determined precisely. However, integral characteristics of wavy surface match
    733 the ones of real ocean waves.
    734 
    735 Theoretically, ocean waves themselves can be chosen as ACFs, the only
    736 pre-processing step is to make them decay exponentially. This may allow
    737 to generate waves of arbitrary profiles, and is one of the directions of future
    738 work.
    739 
    740 * Determining wave pressures for discretely given wavy surface
    741 :PROPERTIES:
    742 :CUSTOM_ID: sec-pressures
    743 :END:
    744 
    745 Analytic solutions to boundary problems in classical equations are often used to
    746 study different properties of the solution, and for that purpose general
    747 solution formula is too difficult to study, as it contains integrals of unknown
    748 functions. Fourier method is one of the methods to find analytic solutions to a
    749 PDE. It is based on application of Fourier transform to each part of PDE, which
    750 reduces the equation to algebraic, and the solution is written as inverse
    751 Fourier transform of some function (which may contain Fourier transforms of
    752 other functions). Since, it is not possible to write analytic forms of these
    753 Fourier transforms in all cases, unique solutions are found and their behaviour
    754 is studied in different domains instead. At the same time, computing discrete
    755 Fourier transforms on the computer is possible for any discretely defined
    756 function and efficient when using FFT algorithms. These algorithms use symmetry
    757 of complex exponentials to decrease asymptotic complexity from
    758 \(\mathcal{O}(n^2)\) to \(\mathcal{O}(n\log_{2}n)\). So, even if general
    759 solution contains Fourier transforms of unknown functions, they still can be
    760 computed numerically, and FFT family of algorithms makes this approach
    761 efficient.
    762 
    763 Alternative approach to solve a PDE is to reduce it to difference equations, which
    764 are solved by constructing various numerical schemes. This approach leads to
    765 approximate solution, and asymptotic complexity of corresponding algorithms is
    766 comparable to that of FFT. For example, stationary elliptic PDE transforms to
    767 implicit numerical scheme which is solved by iterative method on each step of
    768 which a tridiagonal or five-diagonal system of algebraic equations is solved via
    769 Thomas algorithm. Asymptotic complexity of this approach is
    770 \(\mathcal{O}({n}{m})\), where \(n\)\nbsp{}--- number of wavy surface grid
    771 points, \(m\)\nbsp{}--- number of iterations. Despite their wide spread,
    772 iterative algorithms are inefficient on parallel computer architectures; in
    773 particular, their mapping to co-processors may involve copying data in and out
    774 of the co-processor in each iteration, which negatively affects their
    775 performance. At the same time, high number of Fourier transforms in the solution
    776 is an advantage, rather than a disadvantage. First, solutions obtained by
    777 Fourier method are explicit, hence their implementations scales with the large
    778 number of parallel computer cores. Second, there are implementations of FFT
    779 optimised for different processor architectures as well as co-processors (GPU,
    780 MIC) which makes it easy to get high performance on any computing platform.
    781 These advantages substantiate the choice of Fourier method to obtain explicit
    782 analytic solution to the problem of determining pressures under wavy ocean
    783 surface.
    784 
    785 The problem of finding pressure field under wavy sea surface represents inverse
    786 problem of hydrodynamics for incompressible inviscid fluid. System of equations
    787 for it in general case is written as\nbsp{}cite:kochin1966theoretical
    788 \begin{align}
    789     & \nabla^2\phi = 0,\nonumber\\
    790     & \phi_t+\frac{1}{2} |\vec{\upsilon}|^2 + g\zeta=-\frac{p}{\rho}, & \text{at }z=\zeta(x,y,t),\label{eq-problem}\\
    791     & D\zeta = \nabla \phi \cdot \vec{n}, & \text{at }z=\zeta(x,y,t),\nonumber
    792 \end{align}
    793 where \(\phi\)\nbsp{}--- velocity potential, \(\zeta\)\nbsp{}--- elevation
    794 (\(z\) coordinate) of wavy surface, \(p\)\nbsp{}--- wave pressure,
    795 \(\rho\)\nbsp{}--- fluid density,
    796 \(\vec{\upsilon}=(\phi_x,\phi_y,\phi_z)\)\nbsp{}--- velocity vector,
    797 \(g\)\nbsp{}--- acceleration of gravity, and \(D\)\nbsp{}--- substantial
    798 (Lagrange) derivative. The first equation is called continuity (Laplace)
    799 equation, the second one is the conservation of momentum law (the so called
    800 dynamic boundary condition); the third one is kinematic boundary condition for
    801 free wavy surface, which states that rate of change of wavy surface elevation
    802 (\(D\zeta\)) equals to the change of velocity potential derivative along the
    803 wavy surface normal (\(\nabla\phi\cdot\vec{n}\)).
    804 
    805 Inverse problem of hydrodynamics consists in solving this system of equations
    806 for \(\phi\). In this formulation dynamic boundary condition becomes an explicit
    807 formula to determine pressure field using velocity potential derivatives
    808 obtained from the remaining equations. So, from mathematical point of view
    809 inverse problem of hydrodynamics reduces to Laplace equation with mixed boundary
    810 condition\nbsp{}--- Robin problem.
    811 
    812 ** Two-dimensional case
    813 :PROPERTIES:
    814 :CUSTOM_ID: sec:pressure-2d
    815 :END:
    816 **** Formula for infinite depth fluid
    817 Two-dimensional Laplace equation with Robin boundary condition is written as
    818 \begin{align}
    819     \label{eq-problem-2d}
    820     & \phi_{xx}+\phi_{zz}=0,\\
    821     & \zeta_t + \zeta_x\phi_x = \frac{\zeta_x}{\sqrt{1 + \zeta_x^2}} \phi_x - \phi_z, & \text{at }z=\zeta(x,t).\nonumber
    822 \end{align}
    823 Use Fourier method to solve this problem. Applying Fourier transform to both
    824 sides of the equation yields
    825 \begin{equation*}
    826     -4 \pi^2 \left( u^2 + v^2 \right)
    827     \FourierY{\phi(x,z)}{u,v} = 0,
    828 \end{equation*}
    829 hence \(v = \pm i u\). Hereinafter we use the following symmetric form of Fourier
    830 transform:
    831 \begin{equation*}
    832     \FourierY{f(x,y)}{u,v} =
    833     \iint\limits_{-\infty}^{\phantom{--}\infty}
    834     f(x,y)
    835     e^{-2\pi i (x u + y v)}
    836     dx dy.
    837 \end{equation*}
    838 We seek solution in the form of inverse Fourier transform
    839 \(\phi(x,z)=\InverseFourierY{E(u,v)}{x,z}\). Plugging[fn::\(v={-i}{u}\) is not
    840 applicable because velocity potential must go to nought when depth goes to
    841 infinity.] \(v={i}{u}\) into the formula yields
    842 \begin{equation}
    843     \label{eq-guessed-sol-2d}
    844     \phi(x,z) = \InverseFourierY{e^{2\pi u z}E(u)}{x}.
    845 \end{equation}
    846 In order to make substitution \(z=\zeta(x,t)\) not interfere with Fourier
    847 transforms, we rewrite\nbsp{}eqref:eq-guessed-sol-2d as a convolution:
    848 \begin{equation*}
    849     \phi(x,z)
    850     =
    851     \Fun{z}
    852     \ast
    853     \InverseFourierY{E(u)}{x},
    854 \end{equation*}
    855 where \(\Fun{z}\)\nbsp{}--- a function, form of which is defined in
    856 sec.\nbsp{}[[#sec:compute-delta]] and which satisfies equation
    857 \(\FourierY{\Fun{z}}{u}=e^{2\pi{u}{z}}\). Plugging formula \(\phi\) into the
    858 boundary condition yields
    859 \begin{equation*}
    860     \zeta_t
    861     =
    862     \left( i f(x) - 1 \right)
    863     \left[
    864         \Fun{z}
    865         \ast
    866         \InverseFourierY{2\pi u E(u)}{x}
    867     \right],
    868 \end{equation*}
    869 where \(f(x)={\zeta_x}/{\sqrt{1+\zeta_x^2}}-\zeta_x\). Applying Fourier transform
    870 to both sides of this equation yields formula for coefficients \(E\):
    871 \begin{equation*}
    872     E(u) =
    873     \frac{1}{2\pi u}
    874     \frac{
    875     \FourierY{\zeta_t / \left(i f(x) - 1\right)}{u}
    876     }{
    877     \FourierY{\Fun{z}}{u}
    878     }
    879 \end{equation*}
    880 Finally, substituting \(z\) for \(\zeta(x,t)\) and plugging resulting equation
    881 into\nbsp{}eqref:eq-guessed-sol-2d yields formula for \(\phi(x,z)\):
    882 \begin{equation}
    883     \label{eq-solution-2d}
    884     \boxed{
    885         \phi(x,z)
    886         =
    887         \InverseFourierY{
    888             \frac{e^{2\pi u z}}{2\pi u}
    889             \frac{
    890             \FourierY{ \zeta_t / \left(i f(x) - 1\right) }{u}
    891             }{
    892             \FourierY{ \Fun{\zeta(x,t)} }{u}
    893             }
    894         }{x}.
    895     }
    896 \end{equation}
    897 
    898 Multiplier \(e^{2\pi{u}{z}}/(2\pi{u})\) makes a graph of a function to which
    899 Fourier transform is applied asymmetric with respect to \(OY\) axis. This makes
    900 it difficult to apply FFT which expects periodic function with nought on both
    901 ends of the interval. Using numerical integration instead of FFT is not faster
    902 than solving the initial system of equations with numerical schemes. This
    903 problem is alleviated by using formula\nbsp{}eqref:eq-solution-2d-full for finite
    904 depth fluid with wittingly large depth \(h\). This formula is derived in the
    905 following section.
    906 
    907 **** Formula for finite depth fluid
    908 On the sea bottom vertical fluid velocity component equals nought: \(\phi_z=0\) on
    909 \(z=-h\), where \(h\)\nbsp{}--- water depth. In this case equation \(v=-{i}{u}\), which came
    910 from Laplace equation, can not be neglected, hence the solution is sought in the
    911 following form:
    912 \begin{equation}
    913     \phi(x,z)
    914     =
    915     \InverseFourierY{
    916         \left( C_1 e^{2\pi u z} + C_2 e^{-2\pi u z} \right)
    917         E(u)
    918     }{x}.
    919     \label{eq-guessed-sol-2d-full}
    920 \end{equation}
    921 Plugging \(\phi\) into the boundary condition on the sea bottom yields
    922 \begin{equation*}
    923     C_1 e^{-2\pi u h} - C_2 e^{2\pi u h} = 0,
    924 \end{equation*}
    925 hence \(C_1=\frac{1}{2}C{e}^{2\pi{u}{h}}\) and
    926 \(C_2=-\frac{1}{2}C{e}^{-2\pi{u}{h}}\). Constant \(C\) may take arbitrary value
    927 here, because after plugging it becomes part of unknown coefficients \(E(u)\).
    928 Plugging formulae for \(C_1\) and \(C_2\)
    929 into\nbsp{}eqref:eq-guessed-sol-2d-full yields
    930 \begin{equation*}
    931     \phi(x,z) = \InverseFourierY{ \Sinh{2\pi u (z+h)} E(u) }{x}.
    932 \end{equation*}
    933 Plugging \(\phi\) into the boundary condition on the free surface yields
    934 \begin{equation*}
    935     \zeta_t = f(x) \InverseFourierY{ 2\pi i u \Sinh{2\pi u (z+h)} E(u) }{x}
    936             - \InverseFourierY{ 2\pi u \SinhX{2\pi u (z+h)} E(u) }{x}.
    937 \end{equation*}
    938 Here \(\sinh\) and \(\cosh\) give similar results near free surface, and since this
    939 is the main area of interest in practical applications, we assume that
    940 \(\Sinh{2\pi{u}(z+h)}\approx\SinhX{2\pi{u}(z+h)}\). Performing analogous to the
    941 previous section transformations yields final formula for \(\phi(x,z)\):
    942 \begin{equation}
    943 \boxed{
    944     \phi(x,z,t)
    945     =
    946   \InverseFourierY{
    947         \frac{\Sinh{2\pi u (z+h)}}{2\pi u}
    948         \frac{
    949             \FourierY{ \zeta_t / \left(i f(x) - 1\right) }{u}
    950         }{
    951             \FourierY{ \FunSecond{\zeta(x,t)} }{u}
    952         }
    953     }{x},
    954 }
    955     \label{eq-solution-2d-full}
    956 \end{equation}
    957 where \(\FunSecond{z}\)\nbsp{}--- a function, form of which is defined in
    958 sec.\nbsp{}[[#sec:compute-delta]] and which satisfies equation
    959 \(\FourierY{\FunSecond{z}}{u}=\Sinh{2\pi{u}{z}}\).
    960 
    961 **** Reducing to the formulae from linear wave theory
    962 Check the validity of derived formulae by substituting \(\zeta(x,t)\) with known
    963 analytic formula for plain waves. Symbolic computation of Fourier transforms in
    964 this section were performed in Mathematica\nbsp{}cite:mathematica10. In the
    965 framework of linear wave theory assume that waves have small amplitude compared
    966 to their lengths, which allows us to simplify initial system of
    967 equations\nbsp{}eqref:eq-problem-2d to
    968 \begin{align*}
    969     & \phi_{xx}+\phi_{zz}=0,\\
    970     & \zeta_t = -\phi_z & \text{at }z=\zeta(x,t),
    971 \end{align*}
    972 solution to which is written as
    973 \begin{equation*}
    974     \phi(x,z,t)
    975     =
    976     -\InverseFourierY{
    977         \frac{e^{2\pi u z}}{2\pi u}
    978         \FourierY{\zeta_t}{u}
    979     }{x}
    980     .
    981 \end{equation*}
    982 Propagating wave profile is defined as \(\zeta(x,t)=A\cos(2\pi(kx-t))\).
    983 Plugging this formula into\nbsp{}eqref:eq-solution-2d yields
    984 \(\phi(x,z,t)=-\frac{A}{k}\sin(2\pi(kx-t))\Sinh{2\pi{k}{z}}\). In order to
    985 reduce it to the formula from linear wave theory, rewrite hyperbolic sine in
    986 exponential form, discard the term containing \(e^{-2\pi{k}{z}}\) as
    987 contradicting condition
    988 \(\phi\underset{z\rightarrow-\infty}{\longrightarrow}0\). Taking real part of
    989 the resulting formula yields
    990 \(\phi(x,z,t)=\frac{A}{k}e^{2\pi{k}{z}}\sin(2\pi(kx-t))\), which corresponds to
    991 the known formula from linear wave theory. Similarly, under small-amplitude
    992 waves assumption the formula for finite depth
    993 fluid\nbsp{}eqref:eq-solution-2d-full is reduced to
    994 \begin{equation*}
    995     \phi(x,z,t)
    996     =
    997     -\InverseFourierY{
    998         \frac{\Sinh{2\pi u (z+h)}}{2\pi u \Sinh{2\pi u h}}
    999         \FourierY{\zeta_t}{u}
   1000     }{x}.
   1001 \end{equation*}
   1002 Substituting \(\zeta(x,t)\) with propagating plain wave profile formula yields
   1003 \begin{equation}
   1004     \label{eq-solution-2d-linear}
   1005     \phi(x,z,t)=\frac{A}{k}
   1006     \frac{\Sinh{2 \pi k (z+h)}}{ \Sinh{2 \pi k h} }
   1007     \sin(2 \pi (k x-t)),
   1008 \end{equation}
   1009 which corresponds to the formula from linear wave theory for finite depth fluid.
   1010 
   1011 Different forms of Laplace equation solutions, in which decaying exponent is
   1012 written with either "+" or "-" signs, may cause incompatibilities between
   1013 formulae from linear wave theory and formulae derived in this work, where
   1014 \(\sinh\) is used instead of \(\cosh\). Equality
   1015 \(\frac{\Sinh{2\pi{k}(z+h)}}{\Sinh{2\pi{k}{h}}}\approx\frac{\sinh(2\pi{k}(z+h))}{\sinh(2\pi{k}{h})}\)
   1016 becomes strict on the free surface, and difference between left-hand and
   1017 right-hand sides increases when approaching sea bottom (for sufficiently large
   1018 depth difference near free surface is negligible). So, for sufficiently large
   1019 depth any function (\(\cosh\) or \(\sinh\)) may be used for velocity potential
   1020 computation near free surface.
   1021 
   1022 Reducing\nbsp{}eqref:eq-solution-2d and\nbsp{}eqref:eq-solution-2d-full to the
   1023 known formulae from linear wave theory shows, that formula for infinite
   1024 depth\nbsp{}eqref:eq-solution-2d is not suitable to compute velocity potentials
   1025 with Fourier method, because it does not have symmetry, which is required for
   1026 Fourier transform. However, formula for finite depth can be used instead by
   1027 setting \(h\) to some characteristic water depth. For standing wave reducing to
   1028 linear wave theory formulae is made under the same assumptions.
   1029 
   1030 ** Three-dimensional case
   1031 Three-dimensional version of\nbsp{}eqref:eq-problem is written as
   1032 \begin{align}
   1033     \label{eq-problem-3d}
   1034     & \phi_{xx} + \phi_{yy} + \phi_{zz} = 0,\\
   1035     & \zeta_t + \zeta_x\phi_x + \zeta_y\phi_y
   1036     =
   1037     \frac{\zeta_x}{\SqrtZeta{1 + \zeta_x^2 + \zeta_y^2}} \phi_x
   1038     +\frac{\zeta_y}{\SqrtZeta{1 + \zeta_x^2 + \zeta_y^2}} \phi_y
   1039     - \phi_z, & \text{at }z=\zeta(x,y,t).\nonumber
   1040 \end{align}
   1041 Again, use Fourier method to solve it. Applying Fourier transform to both sides
   1042 of Laplace equation yields
   1043 \begin{equation*}
   1044     -4 \pi^2 \left( u^2 + v^2 + w^2 \right)
   1045     \FourierY{\phi(x,y,z)}{u,v,w} = 0,
   1046 \end{equation*}
   1047 hence \(w=\pm{i}\sqrt{u^2+v^2}\). We seek solution in the form of inverse Fourier
   1048 transform \(\phi(x,y,z)=\InverseFourierY{E(u,v,w)}{x,y,z}\). Plugging
   1049 \(w=i\sqrt{u^2+v^2}=i\Kveclen\) into the formula yields
   1050 \begin{equation*}
   1051     \phi(x,y,z) = \InverseFourierY{
   1052         \left(
   1053             C_1 e^{2\pi \Kveclen z}
   1054             -C_2 e^{-2\pi \Kveclen z}
   1055         \right)
   1056         E(u,v)
   1057     }{x,y}.
   1058 \end{equation*}
   1059 Plugging \(\phi\) into the boundary condition on the sea bottom (analogous to
   1060 two-dimensional case) yields
   1061 \begin{equation}
   1062     \label{eq-guessed-sol-3d}
   1063     \phi(x,y,z) = \InverseFourierY{
   1064         \Sinh{2\pi \Kveclen (z+h)} E(u,v)
   1065     }{x,y}.
   1066 \end{equation}
   1067 Plugging \(\phi\) into the boundary condition on the free surface yields
   1068 \begin{equation*}
   1069     \arraycolsep=1.4pt
   1070     \begin{array}{rl}
   1071         \zeta_t = & i f_1(x,y) \InverseFourierY{2 \pi u \Sinh{2\pi \Kveclen (z+h)}E(u,v)}{x,y} \\
   1072         + & i f_2(x,y) \InverseFourierY{2 \pi v \Sinh{2\pi \Kveclen (z+h)}E(u,v)}{x,y} \\
   1073         - & \InverseFourierY{2 \pi \Kveclen \SinhX{2\pi \Kveclen (z+h)}E(u,v)}{x,y}
   1074     \end{array}
   1075 \end{equation*}
   1076 where \(f_1(x,y)={\zeta_x}/{\SqrtZeta{1+\zeta_x^2+\zeta_y^2}}-\zeta_x\) and
   1077 \(f_2(x,y)={\zeta_y}/{\SqrtZeta{1+\zeta_x^2+\zeta_y^2}}-\zeta_y\).
   1078 
   1079 Like in sec.\nbsp{}[[#sec:pressure-2d]] we assume that
   1080 \(\Sinh{2\pi{u}(z+h)}\approx\SinhX{2\pi{u}(z+h)}\) near free surface, but in
   1081 three-dimensional case this is not enough to solve the problem. In order to get
   1082 analytic formula for coefficients \(E\) we need to assume, that all Fourier
   1083 transforms in the equation have radially symmetric kernels, i.e. replace \(u\)
   1084 and \(v\) with \(\Kveclen\). There are two points supporting this assumption.
   1085 First, in numerical implementation integration is done over positive wave
   1086 numbers, so the sign of \(u\) and \(v\) does not affect the solution. Second,
   1087 the rate growth of \(\cosh\) term of the integral kernel is much higher than the
   1088 one of \(u\) or \(\Kveclen\), so the substitution has small effect on the
   1089 magnitude of the solution. Despite these two points, a use of more
   1090 mathematically rigorous approach would be preferable.
   1091 
   1092 Making the replacement, applying Fourier transform to both sides of the equation
   1093 and plugging the result into\nbsp{}eqref:eq-guessed-sol-3d yields formula for
   1094 \(\phi\):
   1095 \begin{equation*}
   1096     \phi(x,y,z,t) = \InverseFourierY{
   1097         \frac{ \Sinh{\smash{2\pi \Kveclen (z+h)}} }{ 2\pi\Kveclen }
   1098         \frac{ \FourierY{ \zeta_t / \left( i f_1(x,y) + i f_2(x,y) - 1 \right)}{u,v} }
   1099         { \FourierY{\mathcal{D}_3\left( x,y,\zeta\left(x,y\right) \right)}{u,v} }
   1100     }{x,y}\label{eq-phi-high-amp},
   1101 \end{equation*}
   1102 where \(\FourierY{\mathcal{D}_3\left(x,y,z\right)}{u,v}=\Sinh{\smash{2\pi\Kveclen{}z}}\).
   1103 
   1104 ** Evaluation and discussion
   1105 :PROPERTIES:
   1106 :CUSTOM_ID: sec:compare-formulae
   1107 :END:
   1108 
   1109 Comparing obtained generic formulae\nbsp{}eqref:eq-solution-2d
   1110 and\nbsp{}eqref:eq-solution-2d-full to the known formulae from linear wave
   1111 theory allows to see the difference between velocity fields for both large and
   1112 small amplitude waves. In general analytic formula for velocity potential in not
   1113 known, even for plain waves, so comparison is done numerically. Taking into
   1114 account conclusions of\nbsp{}[[#sec:pressure-2d]], only finite depth formulae are
   1115 compared.
   1116 
   1117 **** The difference with linear wave theory formulae
   1118 In order to obtain velocity potential fields, ocean wavy surface was generated
   1119 by AR model with varying wave amplitude. In numerical implementation wave
   1120 numbers in Fourier transforms were chosen on the interval from \(0\) to the
   1121 maximal wave number determined numerically from the obtained wavy surface.
   1122 Experiments were conducted for waves of both small and large amplitudes.
   1123 
   1124 The experiment showed that velocity potential fields produced by formula
   1125 eqref:eq-solution-2d-full for finite depth fluid and formula
   1126 eqref:eq-solution-2d-linear from linear wave theory are qualitatively different
   1127 (fig.\nbsp{}[[fig-potential-field-nonlinear]]). First, velocity potential contours
   1128 have sinusoidal shape, which is different from oval shape described by linear
   1129 wave theory. Second, velocity potential decays more rapidly than in linear wave
   1130 theory as getting closer to the bottom, and the region where the majority of
   1131 wave energy is concentrated is closer to the wave crest. Similar numerical
   1132 experiment, in which all terms of eqref:eq-solution-2d-full that are neglected
   1133 in the framework of linear wave theory are eliminated, shows no difference (as
   1134 much as machine precision allows) in resulting velocity potential fields.
   1135 
   1136 #+name: fig-potential-field-nonlinear
   1137 #+header: :width 8 :height 11
   1138 #+begin_src R :file build/plain-wave-velocity-field-comparison.eps
   1139 source(file.path("R", "velocity-potentials.R"))
   1140 par(pty="s")
   1141 nlevels <- 41
   1142 levels <- pretty(c(-200,200), nlevels)
   1143 palette <- colorRampPalette(c("blue", "lightyellow", "red"))
   1144 #palette <- colorRampPalette(c("black", "white"))
   1145 col <- palette(nlevels-1)
   1146 
   1147 # linear solver
   1148 par(fig=c(0,0.95,0,0.5),new=TRUE)
   1149 arma.plot_velocity_potential_field(
   1150   file.path("build", "arma-benchmarks", "verification", "plain_wave_linear_solver"),
   1151   levels=levels,
   1152   col=col
   1153 )
   1154 
   1155 # high-amplitude solver
   1156 par(fig=c(0,0.95,0.5,1),new=TRUE)
   1157 arma.plot_velocity_potential_field(
   1158   file.path("build", "arma-benchmarks", "verification", "plain_wave_high_amplitude_solver"),
   1159   levels=levels,
   1160   col=col
   1161 )
   1162 
   1163 # legend 1
   1164 par(pty="m",fig=c(0.80,1,0.5,1), new=TRUE)
   1165 arma.plot_velocity_potential_field_legend(
   1166   levels=levels,
   1167   col=col
   1168 )
   1169 
   1170 # legend 2
   1171 par(pty="m",fig=c(0.80,1,0,0.5), new=TRUE)
   1172 arma.plot_velocity_potential_field_legend(
   1173   levels=levels,
   1174   col=col
   1175 )
   1176 #+end_src
   1177 
   1178 #+caption: Velocity potential field of propagating wave \(\zeta(x,y,t) = \cos(2\pi x - t/2)\). Field produced by formula eqref:eq-solution-2d-full (top) and linear wave theory formula (bottom).
   1179 #+label: fig-potential-field-nonlinear
   1180 #+attr_latex: :width \textwidth
   1181 #+RESULTS: fig-potential-field-nonlinear
   1182 [[file:build/plain-wave-velocity-field-comparison.eps]]
   1183 
   1184 **** The difference with small-amplitude wave theory
   1185 The experiment, in which velocity fields produced numerically by different
   1186 formulae were compared, shows that velocity fields produced by formula
   1187 eqref:eq-solution-2d-full and eqref:eq-old-sol-2d correspond to each other for
   1188 small-amplitude waves. Two ocean wavy surface realisations were made by AR
   1189 model: one containing small-amplitude waves, other containing large-amplitude
   1190 waves. Integration in formula eqref:eq-solution-2d-full was done over wave
   1191 numbers range extracted from the generated wavy surface. For small-amplitude
   1192 waves both formulae showed comparable results (the difference in the velocity is
   1193 attributed to the stochastic nature of AR model), whereas for large-amplitude
   1194 waves stable velocity field was produced only by formula
   1195 eqref:eq-solution-2d-full (fig.\nbsp{}[[fig-velocity-field-2d]]). So, generic
   1196 formula eqref:eq-solution-2d-full gives satisfactory results without restriction
   1197 on wave amplitudes.
   1198 
   1199 #+name: fig-velocity-field-2d
   1200 #+header: :width 8 :height 11
   1201 #+begin_src R :file build/large-and-small-amplitude-velocity-field-comparison.eps
   1202 source(file.path("R", "velocity-potentials.R"))
   1203 linetypes = c("solid", "dashed")
   1204 par(mfrow=c(2, 1))
   1205 arma.plot_velocity(
   1206   file.path("data", "velocity", "low-amp"),
   1207   file.path("data", "velocity", "low-amp-0"),
   1208   linetypes=linetypes,
   1209   ylim=c(-2,2)
   1210 )
   1211 arma.plot_velocity(
   1212   file.path("data", "velocity", "high-amp"),
   1213   file.path("data", "velocity", "high-amp-0"),
   1214   linetypes=linetypes,
   1215   ylim=c(-2,2)
   1216 )
   1217 #+end_src
   1218 
   1219 #+label: fig-velocity-field-2d
   1220 #+caption: Comparison of velocity field on the ocean wavy surface obtained by generic formula (\(u_1\)) and formula for small-amplitude waves (\(u_2\)). Velocity field for realisations containing small-amplitude (top) and large-amplitude (bottom) waves.
   1221 #+attr_latex: :width \textwidth
   1222 #+RESULTS: fig-velocity-field-2d
   1223 [[file:build/large-and-small-amplitude-velocity-field-comparison.eps]]
   1224 
   1225 * High-performance software implementation for heterogeneous platforms
   1226 :PROPERTIES:
   1227 :CUSTOM_ID: sec:arma-algorithms
   1228 :END:
   1229 ** White noise generation
   1230 In order to eliminate periodicity from generated wavy surface, it is imperative
   1231 to use PRNG with sufficiently large period to generate white noise. Parallel
   1232 Mersenne Twister\nbsp{}cite:matsumoto1998mersenne with a period of
   1233 \(2^{19937}-1\) is used as a generator in this work. It allows to produce
   1234 aperiodic ocean wavy surface realisations in any practical usage scenarios.
   1235 
   1236 There is no guarantee that multiple Mersenne Twisters executed in parallel
   1237 threads with distinct initial states produce uncorrelated pseudo-random number
   1238 sequences, however, algorithm of dynamic creation of Mersenne
   1239 Twisters\nbsp{}cite:matsumoto1998dynamic may be used to provide such a
   1240 guarantee. The essence of the algorithm is to find matrices of initial generator
   1241 states, that give maximally uncorrelated pseudo-random number sequences when
   1242 Mersenne Twisters are executed in parallel with these initial states. Since
   1243 finding such initial states consumes considerable amount of processor time,
   1244 vector of initial states is created preliminary with knowingly larger number of
   1245 parallel threads and saved to a file, which is then read before starting white
   1246 noise generation.
   1247 
   1248 ** Wavy surface generation
   1249 In ARMA model value of wavy surface elevation at a particular point depends on
   1250 previous in space and time points, as a result the so called /ramp-up interval/
   1251 (see fig.\nbsp{}[[fig-ramp-up-interval]]), in which realisation does not correspond to
   1252 specified ACF, forms at the beginning of the realisation. There are several
   1253 solutions to this problem which depend on the simulation context.
   1254 
   1255 If the realisation is used in the context of ship stability simulation without
   1256 manoeuvring, ramp-up interval will not affect results of the simulation, because
   1257 it is located on the border (too far away from the studied marine object). If
   1258 ship stability with manoeuvring is studied, then the interval may be simply
   1259 discarded from the realisation (the size of the interval approximately equals
   1260 the number of AR coefficients in each dimension). However, this may lead to a
   1261 loss of a very large number of points, because discarding is done for each
   1262 dimension. Alternative approach is to generate ocean wavy surface on ramp-up
   1263 interval with LH model and generate the rest of the realisation with ARMA model.
   1264 
   1265 Algorithm of wavy surface generation is data-parallel: the realisation is
   1266 divided into equal parts along the time axis each of which is generated
   1267 independently, however, in the beginning of each realisation there is ramp-up
   1268 interval. To eliminate it for MA process, /overlap-add/
   1269 method\nbsp{}cite:oppenheim1989discrete,svoboda2011efficient,pavel2013algorithms
   1270 (a popular method in signal processing) is used. The essence of the method is to
   1271 add another interval, size of which is equal to the ramp-up interval size, to
   1272 the end of each part. Then wavy surface is generated in each point of each part
   1273 (including points from the added interval), the interval at the end of part
   1274 \(N\) is superimposed on the ramp-up interval at the beginning of the part
   1275 \(N+1\), and values in corresponding points are added. To eliminate the ramp-up
   1276 interval for AR process, the realisation is divided into part along each
   1277 dimension, and each part is computed only when all dependent parts are ready.
   1278 For that purpose, an array of current part states is maintained in the
   1279 programme, and all the parts are put into a queue. A parallel thread acquires a
   1280 shared lock, finds the first part in the queue, for which all dependent parts
   1281 have "completed" state, removes this part for the queue, frees the lock and
   1282 generates the part. After that a thread updates the state of the part, and
   1283 repeats the same steps until the queue becomes empty. This algorithm eliminates
   1284 all ramp-up intervals except the one at the beginning of the realisation, and
   1285 the size of the parts should be sufficiently small to balance the load on all
   1286 processor cores.
   1287 
   1288 #+name: fig-ramp-up-interval
   1289 #+header: :width 4.5 :height 5
   1290 #+begin_src R :file build/ramp-up-interval.eps
   1291 source(file.path("R", "common.R"))
   1292 par(mar=c(0,1,0,0))
   1293 arma.plot_ramp_up_interval()
   1294 #+end_src
   1295 
   1296 #+caption: Ramp-up interval at the beginning of the \(OX\) axis of the realisation.
   1297 #+label: fig-ramp-up-interval
   1298 #+RESULTS: fig-ramp-up-interval
   1299 [[file:build/ramp-up-interval.eps]]
   1300 
   1301 ** Velocity potential computation
   1302 :PROPERTIES:
   1303 :CUSTOM_ID: sec:compute-delta
   1304 :END:
   1305 
   1306 In solutions eqref:eq-solution-2d and eqref:eq-solution-2d-full to
   1307 two-dimensional problem there are functions
   1308 \(\Fun{z}=\InverseFourierY{e^{2\pi{u}{z}}}{x}\) and
   1309 \(\FunSecond{z}=\InverseFourierY{\Sinh{2\pi{u}{z}}}{x}\) which has multiple
   1310 analytic representations and are difficult to compute. Each function is a
   1311 Fourier transform of linear combination of exponents which reduces to poorly
   1312 defined Dirac delta function of a complex argument (see
   1313 table\nbsp{}[[tab-delta-functions]]). The usual way of handling this type of
   1314 functions is to write them as multiplication of Dirac delta functions of real
   1315 and imaginary part, however, this approach does not work here, because applying
   1316 inverse Fourier transform to this representation does not produce exponent,
   1317 which severely warp resulting velocity field. In order to get unique analytic
   1318 definition, normalisation factor \(1/\Sinh{2\pi{u}{h}}\) (which is also included
   1319 in the formula for \(E(u)\)) may be used. Despite the fact that normalisation
   1320 allows to obtain adequate velocity potential field, numerical experiments show
   1321 that there is little difference between this field and the one produced by
   1322 formulae, in which terms with \(\zeta\) are omitted. As a result, we do not use
   1323 normalisation factors in the formula.
   1324 
   1325 #+name: tab-delta-functions
   1326 #+caption: Formulae for computing \(\Fun{z}\) and \(\FunSecond{z}\) from [[#sec:pressure-2d]], that use normalisation to eliminate uncertainty from definition of Dirac delta function of complex argument.
   1327 #+attr_latex: :booktabs t
   1328 | Function          | Without normalisation                                        | Normalised                                                                                                                             |
   1329 |-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------|
   1330 | \(\Fun{z}\)       | \(\delta (x+i z)\)                                           | \(\frac{1}{2 h}\mathrm{sech}\left(\frac{\pi  (x-i (h+z))}{2 h}\right)\)                                                                |
   1331 | \(\FunSecond{z}\) | \(\frac{1}{2}\left[\delta (x-i z) + \delta (x+i z) \right]\) | \(\frac{1}{4 h}\left[\text{sech}\left(\frac{\pi  (x-i (h+z))}{2 h}\right)+\text{sech}\left(\frac{\pi  (x+i(h+z))}{2 h}\right)\right]\) |
   1332 
   1333 ** Evaluation
   1334 ARMA model does not require highly optimised software implementation to be
   1335 efficient, its performance is high even without use of co-processors; there are
   1336 two main causes of that. First, ARMA model itself does not use transcendental
   1337 functions (sines, cosines and exponents) as opposed to LH model. All
   1338 calculations, except model coefficients, are done via polynomials, which can be
   1339 efficiently computed on modern processors using a series of fused multiply-add
   1340 (FMA) instructions. Second, pressure computation is done via explicit analytic
   1341 formula using nested FFTs. Since two-dimensional FFT of the same size is
   1342 repeatedly applied to every time slice, its coefficients (complex exponents) are
   1343 pre-computed for all slices, and computations are performed with only a few
   1344 transcendental functions. In case of MA model, performance is also increased by
   1345 doing convolution with FFT. So, high performance of ARMA model is due to scarce
   1346 use of transcendental functions and heavy use of FFT, not to mention that high
   1347 convergence rate and non-existence of periodicity allows to use far fewer
   1348 coefficients compared to LH model.
   1349 
   1350 ARMA implementation uses several libraries of reusable mathematical functions
   1351 and numerical algorithms (listed in table\nbsp{}[[tab-arma-libs]]), and was
   1352 implemented using OpenMP and OpenCL parallel programming technologies, that
   1353 allow to use the most efficient implementation for a particular algorithm.
   1354 
   1355 #+name: tab-arma-libs
   1356 #+caption: A list of mathematical libraries used in ARMA model implementation.
   1357 #+attr_latex: :booktabs t
   1358 | Library                                                      | What it is used for             |
   1359 |--------------------------------------------------------------+---------------------------------|
   1360 | DCMT\nbsp{}cite:matsumoto1998dynamic                         | parallel PRNG                   |
   1361 | Blitz\nbsp{}cite:veldhuizen1997will,veldhuizen2000techniques | multidimensional arrays         |
   1362 | GSL\nbsp{}cite:galassi2015gnu                                | PDF, CDF, FFT computation       |
   1363 |                                                              | checking process stationarity   |
   1364 | LAPACK, GotoBLAS\nbsp{}cite:goto2008high,goto2008anatomy     | finding AR coefficients         |
   1365 | GL, GLUT\nbsp{}cite:kilgard1996opengl                        | three-dimensional visualisation |
   1366 | CGAL\nbsp{}cite:fabri2009cgal                                | wave numbers triangulation      |
   1367 
   1368 For the purpose of evaluation we use simplified version of\nbsp{}eqref:eq-phi-high-amp:
   1369 \begin{align}
   1370     \label{eq-phi-linear}
   1371     \phi(x,y,z,t) &= \InverseFourierY{
   1372         \frac{ \Sinh{\smash{2\pi \Kveclen (z+h)}} }
   1373         { 2\pi\Kveclen \Sinh{\smash{2\pi \Kveclen h}} }
   1374         \FourierY{ \zeta_t }{u,v}
   1375     }{x,y}\nonumber \\
   1376     &= \InverseFourierY{ g_1(u,v) \FourierY{ g_2(x,y) }{u,v} }{x,y}.
   1377 \end{align}
   1378 This formula is particularly suitable for computation on GPUs:
   1379 - it contains transcendental mathematical functions (hyperbolic cosines and
   1380   complex exponents);
   1381 - it is computed over large four-dimensional ($t$, $x$, $y$, $z$) region;
   1382 - it is analytic with no information dependencies between individual data points
   1383   in $t$ and $z$ dimensions.
   1384 Since standing sea wave generator does not allow efficient GPU implementation
   1385 due to autoregressive dependencies between wavy surface points, only velocity
   1386 potential solver was rewritten in OpenCL and its performance was compared to
   1387 existing OpenMP implementation.
   1388 
   1389 For each implementation the overall performance of the solver for a particular
   1390 time instant was measured. Velocity field was computed for one $t$ point, for
   1391 128 $z$ points below wavy surface and for each $x$ and $y$ point of
   1392 four-dimensional $(t,x,y,z)$ grid. The only parameter that was varied between
   1393 subsequent programme runs is the size of the grid along $x$ dimension. A total
   1394 of 10 runs were performed and an average time of each stage was computed.
   1395 
   1396 A different FFT library was used for each version of the solver. For OpenMP
   1397 version FFT routines from GNU Scientific Library (GSL)\nbsp{}cite:galassi2015gnu
   1398 were used, and for OpenCL version clFFT library\nbsp{}cite:clfft was used instead.
   1399 There are two major differences in the routines from these libraries.
   1400 - The order of frequencies in Fourier transforms is different and clFFT library
   1401   requires reordering the result of\nbsp{}eqref:eq-phi-linear whereas GSL does not.
   1402 - Discontinuity at $(x,y) = (0,0)$ of velocity potential field grid is handled
   1403   automatically by clFFT library, whereas GSL library produce skewed values at
   1404   this point.
   1405 
   1406 For GSL library an additional interpolation from neighbouring points was used to
   1407 smooth velocity potential field at these points. We have not spotted other
   1408 differences in FFT implementations that have impact on the overall performance.
   1409 
   1410 In the course of the numerical experiments we have measured how much time each
   1411 solver's implementation spends in each computation stage to explain how
   1412 efficient data copying between host and device is in OpenCL implementation, and
   1413 how one implementation corresponds to the other in terms of performance.
   1414 
   1415 ** Results
   1416 The experiments showed that GPU implementation outperforms CPU implementation by
   1417 a factor of 10--15 (fig.\nbsp{}[[fig-bench-cpu-gpu]]), however, distribution of time
   1418 between computation stages is different for each implementation
   1419 (fig.\nbsp{}[[fig-breakdown-cpu-gpu]]). The major time consumer in CPU
   1420 implementation is computation of \(g_1\), whereas in GPU implementation its
   1421 running time is comparable to computation of \(g_2\). GPU computes \(g_1\) much
   1422 faster than CPU due to a large amount of modules for transcendental mathematical
   1423 function computation. In both implementations \(g_2\) is computed on CPU, but for
   1424 GPU implementation the result is duplicated for each \(z\) grid point in order to
   1425 perform multiplication of all \(XYZ\) planes along \(z\) dimension in single OpenCL
   1426 kernel, and, subsequently copied to GPU memory which severely hinders overall
   1427 stage performance. Copying the resulting velocity potential field between CPU
   1428 and GPU consumes \(\approx{}20\%\) of velocity potential solver execution time.
   1429 
   1430 #+name: fig-bench-cpu-gpu
   1431 #+caption: Performance comparison of CPU (OpenMP) and GPU (OpenCL) versions of velocity potential solver.
   1432 [[file:figures/bench-cpu-gpu.pdf]]
   1433 
   1434 #+name: fig-breakdown-cpu-gpu
   1435 #+caption: Performance breakdown for GPU (OpenCL) and CPU (OpenMP) versions of velocity potential solver.
   1436 [[file:figures/breakdown-cpu-gpu.pdf]]
   1437 
   1438 * Conclusion
   1439 Three-dimensional ARMA ocean simulation model coupled with analytic formula for
   1440 determining pressures under wavy sea surface is computationally efficient of
   1441 performing long-term ship behaviour simulations on the computer. Possible
   1442 applications of the approach include studying ship behaviour in storm and
   1443 shallow water waves. Its validity was visually and statistically verified in a
   1444 number of experiments: distribution of characteristics of waves, produced by
   1445 ARMA model, match the ones of real ocean waves, and velocity potential field,
   1446 produced by the analytic formula correspond to the one produced by the formula
   1447 for small-amplitude waves, and the formula itself reduces to the known one from
   1448 linear wave theory.
   1449 
   1450 Numerical experiments showed that wavy surface generation is efficient on CPU as
   1451 it involves no transcendental mathematical functions, and velocity potential
   1452 field computation is efficient on GPU due to heavy use of Fourier transforms.
   1453 The use of dynamically generated Mersenne Twister PRNGs allows to produce
   1454 uncorrelated sequences of pseudo-random numbers with no practical limitation on
   1455 the realisation period, which in turn allows to perform long simulation sessions
   1456 on parallel machines.
   1457 
   1458 The future work is to make ARMA mathematical apparatus and its numerical
   1459 implementation a base of virtual testbed for marine objects dynamics studies.
   1460 
   1461 bibliographystyle:styles/spphys.bst
   1462 bibliography:refs.bib
   1463 
   1464 * Config                                                           :noexport:
   1465 ** Clone data repository
   1466 #+begin_src sh :exports none :results verbatim
   1467 set -e
   1468 dir=build/arma-benchmarks
   1469 mkdir -p $dir
   1470 if ! test -d "$dir/.git"
   1471 then
   1472     git clone https://github.com/igankevich/arma-benchmarks $dir
   1473 fi
   1474 cd $dir
   1475 git checkout master
   1476 git pull
   1477 git checkout 730ef79a9f49df9c0c22d0a0f06e4d0b69b5bd99
   1478 #+end_src
   1479 
   1480 #+RESULTS:
   1481 : Ваша ветка обновлена в соответствии с «origin/master».
   1482 : Already up-to-date.
   1483 ** Configure org export
   1484 #+begin_src elisp
   1485 (add-to-list 'org-babel-R-graphics-devices '(:eps "cairo_ps" "file"))
   1486 #+end_src
   1487 
   1488 #+RESULTS:
   1489 | :eps        | cairo_ps   | file     |
   1490 | :bmp        | bmp        | filename |
   1491 | :jpg        | jpeg       | filename |
   1492 | :jpeg       | jpeg       | filename |
   1493 | :tikz       | tikz       | file     |
   1494 | :tiff       | tiff       | filename |
   1495 | :png        | png        | filename |
   1496 | :svg        | svg        | file     |
   1497 | :pdf        | pdf        | file     |
   1498 | :ps         | postscript | file     |
   1499 | :postscript | postscript | file     |