Impact of Wave-Vortical Interactions on Oceanic Submesoscale Lateral Dispersion

Submesoscale lateral transport of a passive tracer and Lagrangian particles in the ocean is investigated by means of numerical simulations with intermediate models. Using a projection technique, the models are formulated in terms of wave-mode and vortical-mode nonlinear interactions, and they range in complexity from full Boussinesq to waves-only and vortical-modes-only models. We find that, on these scales, most of the dispersion is done by vortical motions, but waves cannot be discounted because they play an important, albeit indirect, role. In particular, we show that waves are instrumental in filling out the spectra of vortical-mode energy at smaller scales through vortex-wave-wave triad interactions. Waves also transfer energy upscale to Vertically Sheared Horizontal Flows (VSHF) which are a key ingredient for internal-wave shear dispersion. We demonstrate that a richer spectrum of vortical modes in the presence of waves enhances the effective lateral diffusivity, compared to QG. In the waves-only model, the dispersion rate is an order of magnitude smaller and is attributed entirely to internal-wave shear dispersion. Some shear dispersion is also present in the Boussinesq model, but notably absent when wave-triad interactions are not included.


Introduction
The mechanisms driving lateral dispersion in the ocean on scales of 100 m to 10 km remain, by and large, not well identified. Lateral dispersion in this regime, known as the submesoscale, governs pollutant dispersal and the spreading of plankton and fish colonies (Rypina et al. 2014) which strongly impact coastal communities and marine ecosystems. On a fundamental level, submesoscale lateral dispersion represents a pathway from larger-scale stirring toward three-dimensional mixing, important for understanding how energy is cascaded from the mesoscale toward dissipative isotropic scales.
Observationally, our knowledge of lateral dispersion on these scales comes primarily from dye-tracer and Lagrangian float release experiments conducted over the last three decades in a variety of ocean environments.
These range from open-ocean conditions with significant mesoscale activity, e.g. the North Atlantic Tracer Release Experiment (NATRE) experiment (Ledwell et al. 1993) more quiescent summertime environments of LatMix2011 (Shcherbina et al. 2015;Lien and Sanford 2019), and coastal locations such as the Coastal Mixing and Optics (CMO) experiment (Sundermeyer and Ledwell 2001;Ledwell et al. 2004;. Surprisingly, the effective submesoscale lateral diffusivity inferred from all field observations is consistently O(1) m 2 s −1 , and an outstanding question remains as to what dynamics are responsible for such large dispersion rates.
Motions in the submesoscale regime are influenced by both internal waves and vortical modes, and although these two components may display similar spatial scales, they evolve on different temporal scales. Internal wave frequencies lie between the Coriolis (inertial) frequency f and the buoyancy frequency N, while the vortical component is governed by a longer subinertial, advective time scale. Whereas internal waves are ubiquitous and routinely observed in the ocean, our knowledge of vortical modes in the ocean remains largely speculative. Detection of vortical modes has proven difficult due to the presence of dominant mesoscale flows which induce non-negligible Doppler smearing of near-inertial wave frequencies into the subinertial frequency band (Holloway 1983;Kunze et al. 1990;Pinkel 2014). The vortical flow component has been invoked to explain spectral shear and strain levels in NATRE that cannot be reconciled with linear internal wave dynamics (Polzin et al. 2003). Using the vortical-mode strain spectrum from Polzin et al. (2003), Polzin (2004) obtains a submesoscale lateral diffusivity of O(1) m 2 s −1 , in agreement with dye-tracer dispersion estimates in Ledwell et al. (1993). Lien and Sanford (2019) analyze Ertel potential vorticity from EM-APEX floats deployed in the Sargasso Sea during LatMix2011 and conclude that vortical-mode energy levels are 1-2 decades below the observed energy, implying that linear waves dominated the signal at that site. Lien and Sanford (2019) further estimate that the internal wave energy levels are on the order of 0.1*GM where GM denotes the canonical Garrett-Munk internal wave spectrum (Garrett and Munk 1979). Yet, even in the relatively weak energy conditions recorded during LatMix2011, estimates of effective lateral diffusivities from dye-tracer releases are in the O(1) m 2 s −1 range. On the other hand, diffusivities inferred from concurrently deployed drifters are nearly an order of magnitude smaller (J. Early, personal communication). The difference between dye and drifter diffusivities remains to be explained. Possible internalwave driven mechanisms that can influence lateral diffusivities include internal-wave shear dispersion (IWSD) where wave vertical shear interacts with molecular vertical diffusion to produce enhanced horizontal displacements (Young et al. 1982;Kunze and Sundermeyer 2015). In this case, the effective horizontal diffusivity κ h is proportional to (N/ f ) 2 κ z , where κ z is the vertical diffusivity. Another potential wave-driven mechanism is internal-waveinduced dispersion, also called Stokes drift, (Holmes-Cerfon et al. 2011;Bühler et al. 2013) and Early et al. (in preparation). In contrast to IWSD, Stokes drift is independent of κ z . Lateral diffusivity on these scales can also be enhanced by vortical-mode stirring (Sundermeyer 1998;Sundermeyer and Ledwell 2001;Sundermeyer and Lelong 2005) which acts in the same manner as mesoscale eddy stirring (e.g. Abernathey et al. 2010, and references within). Vortical-mode stirring is an inherently nonlinear process whereas the former two wave-driven mechanisms can act in linear and nonlinear environments.
In light of the difficulties encountered in separating vortical modes from internal waves in ocean observations, numerical simulations have proven useful in furthering our understanding of the dynamics governing submesoscale lateral dispersion. For example, linear and nonlinear behaviors can be readily separated and mechanisms that involve vertical diffusivity can be identified by comparing the dispersion characteristics of diffusive dye and nondiffusive Lagrangian particles. Distinguishing wave and vortical-mode contributions to submesoscale lateral diffusivities has proven more difficult. A recent numeri-cal study initialized with a Garrett-Munk spectrum and amplitudes matched to LatMix2011 levels produces O(1) m 2 s −1 effective lateral diffusivity, in agreement with Lat-Mix2011 dye observations (Early et al. in preparation). In contrast to LatMix2011 observations, however, dye and Lagrangian particles in the numerical simulation exhibit comparable rates of dispersion. Moreover, linear and nonlinear simulations initialized with the same wave spectrum produce nearly indistinguishable lateral diffusivities. In this nearly linear regime, one is inclined to think that internal-wave-induced Stokes drift, active in both linear and nonlinear environments and independent of vertical diffusivity, is the dominant mechanism. An important caveat, however, is that vortical modes, while not initially present in the simulations, can be created in both linear and nonlinear simulations through viscous and diffusive dissipation (Riley and Lelong 2000). Therefore, the role of the vortical modes cannot be entirely discounted in Early et al.'s numerical simulation (Lelong et al. 2016). Numerical simulations by Sundermeyer and Lelong (2005) tested the idea first proposed by Sundermeyer (1998) that vortical modes in the ocean interior, created by geostrophic adjustment of well mixed fluid patches following wave breaking events, may account for O(1) lateral diffusivities recorded during the CMO experiment (Sundermeyer 1998). These simulations reproduced qualitative characteristics of the dispersion observations, but could not attain the energy levels recorded at the CMO site.
Several studies have attempted to understand whether dispersion is local, i.e. governed by comparable scales, or nonlocal, governed by larger scales (LaCasce 2008; Beron-Vera and LaCasce 2016). Local dynamics are associated with spectral kinetic energy slopes less than -3 and nonlocal dynamics with steeper spectra. The local/nonlocal dispersion interpretation was originally developed for two-dimensional turbulent flows (Kraichnan 1967) and later applied in the interpretation of quasi-2D geophysical flows, e.g. (Salmon 1983;Rhines 1988). Bennett (1984) used the statistics of stratospheric balloon deployments to infer that the observed relative dispersion was non-local. In the ocean, interpretation of dispersion in terms of local/nonlocal statistics has proven difficult, due in part to the presence of near-inertial waves that can increase energy levels at small scales without having much of an impact on the relative dispersion of drifter pairs (Beron-Vera and LaCasce 2016; Essink et al. 2019). Poje et al. (2014) examined scale-dependent dispersion in the region of the Deepwater Horizontal oil spill by computing two-particle statistics from a large deployment of drifters. They found significant energy at scales below 100 m and concluded, based on spatial and temporal metrics, that dispersion at these scales was strictly local. Since the theoretical framework used to distinguish local from nonlocal behavior was developed for flows devoid of internal waves, much effort has gone into filtering out the wave component from ocean observations. As a consequence, the contribution of internal waves to submesoscale lateral diffusivity remains largely unexplored.
The objective of the present numerical study is to identify the contributions of waves and vortical modes to submesoscale lateral dispersion in the ocean, using a combination of dye and Lagrangian particles in a regime where wave and vortical-mode energy levels are comparable, controlling the degree of nonlinearity. This is accomplished by performing simulations under identical conditions with a set of intermediate models (Remmel 2010;Hernández-Dueñas et al. 2014) that include subsets of all possible nonlinear interactions. The intermediate models were first developed for shallow-water systems (Remmel and Smith 2009) and extended to rotating stratified flows by Remmel (2010) and Hernández-Dueñas et al. (2014). The current study relies on the models described in Hernández-Dueñas et al. (2014), including the quasi-geostrophic (QG) model with only vorticalmode nonlinearities, a model (P2G) with all interactions except wave/wave/wave interactions, and the full Boussinesq (FB) model which retains all possible nonlinear interactions between the two components. A wave turbulence model (GGG) with all wave/wave/wave triads is also included (Remmel et al. , 2014. The GGG model can be viewed as an extension of weak turbulence containing only resonant wave/wave/wave interactions (Zakharov et al. 1992;Newell and Rumpf 2011;Nazarenko 2011;McComas and Bretherton 1977;Lvov et al. , 2010. The QG and GGG models are useful for studying flow evolution when only vortical modes or wave modes are separately present, whereas P2G and FB provide information on energy exchanges between the two components. We find that the presence of internal waves, through their interaction with vortical modes, leads to non-negligible differences in the lateral dispersion patterns. In this study, all models are spun up from rest with identical forcing designed to represent parameterized wavebreaking events occurring at random locations in the domain. These parameterized events are introduced through enhanced localized vertical diffusivities that produce patches of well-mixed fluid. The mixed patches are out of equilibrium with the surrounding stably stratified fluid and undergo cyclo-geostrophic adjustment, resulting in the spin-up of a vortex structure and a radiating wave field Lelong and Sundermeyer 2005). The models are continuously forced in this fashion until the flows reach statistical equilibrium, at which point a passive tracer and non-diffusive Lagrangian particles are placed in the domain. The broad range of temporal and spatial scales that must be resolved simultaneously render these computations very memory and time intensive. However, the computational challenge can be somewhat circumvented by considering a dy-namically similar problem with a reduced wave frequency band N/ f , effectively reducing the time scale separation between wave and subinertial motions while maintaining horizontal length scales relative to the Rossby radius of deformation in order to preserve nondimensional Rossby and Burger numbers (e.g. Lelong and Dunkerton 1998).
The rest of the paper is organized as follows: The mathematical foundations of intermediate models are given in Section 2. The numerical setup and specification of parameter values for oceanic submesoscales can be found in Section 3. Section 4 provides detailed inter-model comparisons for understanding the role of wave and vortical motions on lateral dispersion, with an emphasis on energetics, scale dependence, passive tracer and particle behaviors. Discussion of our results is provided in Section 5.

a. Boussinesq Dynamics
The Boussinesq approximation to the compressible fluid equations is commonly adopted to describe low Mach number, non-hydrostatic motions such as the turbulent velocity and density fluctuations present in the oceanic submesoscales. The Boussinesq model filters high-frequency acoustic waves, while still retaining the influence of density fluctuations through the buoyancy term in the statement of conservation of momentum. Hence, the Boussinesq framework is a practical choice to study the interactions between moderate frequency inertia-gravity waves and so-called 'balanced' flows, which at lowest order are the motions that exist in the absence of waves altogether. These nonlinear exchanges are often called wavevortical interactions, in reference to the vortical nature of balanced flows. Quantitative understanding of wavevortical interactions continues to be an important research area impacting the accuracy of regional and global oceanatmosphere models through subgrid parameterizations.
Assuming alignment of the direction of gravity and the axis of earth's rotation in the verticalẑ-direction, the unforced, inviscid Boussinesq equations are given by where u(x,t) is velocity, ρ(x,t) = ρ o +ρ(z) + ρ (x,t) is density and p(x,t) is pressure. The constant ρ o and linear functionρ(z) = −Bz describe the fixed reference state, and the constant B characterizes the density stratification. The Coriolis frequency f is taken here to be a constant, consistent with relatively small variations of f in a restricted latitude range at mid-latitudes. The linearized version of system (1) with triply periodic boundary conditions is known to support propagating inertia-gravity wave solutions, as well as non-propagating solutions sometimes referred to as 'slow modes,' or 'vortical modes.' These wave and vortical solutions may be conveniently expressed in terms of a scaled density θ = (Bρ o /g) −1/2 ρ , which has dimensions of velocity, such that where the four-vectors s (x,t; k), s = 0, +, − are parameterized by the wavevector k. The eigenvectors φ s (k) can be found in Majda (2003); see also Smith and Waleffe (2002); Hernández-Dueñas et al. (2014). See appendix A1 for more details. The superscript 0 is used to denote the vortical mode with zero-frequency σ 0 (k) = 0, while the superscripts ± denote the wave modes with frequencies (eigenvalues): where k = |k| is the wavenumber corresponding to the wavevector k, and k h = (k 2 x + k 2 y ) 1/2 (k z ) is its horizontal (vertical) wavenumber. The constant buoyancy frequency N is given by N = (gB/ρ o ) 1/2 , and the density fluctuation can be rewritten as ρ = Bθ /N.
Completeness of the divergence-free eigenmodes φ s (k) allows for an equivalent k-space description of the dynamics (1) in a triply periodic domain, given by where the overbar denotes complex conjugate. The fields u and θ are recovered from the expansion Note that (4) is shorthand notation for three coupled partial differential equations, since s k takes on the values s k = 0, ±; three equations are sufficient (instead of four) since the eigenvectors φ s (k) are divergence-free. Fourier pseudo-spectral codes solve (4) for the unknown amplitudes b s (k), taking advantage of fast Fourier transforms to compute the nonlinear term as a local product in xspace rather than the convolution sum in k-space. The known interaction coefficients C s k s p s q kpq are computed from the eigenmodes φ s (k) and their complex conjugates (e.g, Remmel et al. 2014).
The reduced models studied herein result from a restriction of the convolution sum in (4) to selected classes of interactions (s k s p s q ); in physical space this corresponds to a projection onto the corresponding selected wave-vortical interactions. For more details, see Remmel (2010); Hernández-Dueñas et al. (2014) for the Boussinesq system and Remmel and Smith (2009) for the shallow water equations. In physical space, the models correspond to systems of partial differential equations with modified nonlinear terms. Each such model automatically satisfies global energy conservation because each triad (k p q) separately satisfies the detailed balance relation C Kraichnan 1973). Two of the reduced models eliminate wave-vortical interactions, focusing on either: vortical-mode interactions in the absence of waves, which results in the quasi-geostrophic (QG) approximation; or wave interactions in the absence of the vortical mode, which results in the GGG model. These two 'extreme' cases are quite different, since the QG model supports an inverse cascade of vortical-mode energy, while three-wave interactions mainly support a forward cascade of wave energy in strongly stratified flows. A third reduced model includes all interactions except three-wave interactions, and hence elucidates how wave-vortical interactions modify QG dynamics. By comparison to Boussinesq dynamics given by (1) or equivalently (4), numerical simulations of the reduced models help to clarify the different contributions of vortical, wave and mixed wave-vortical interactions for influencing both dynamics and scalar transport in the ocean submesoscales.

b. The Quasi-Geostrophic Approximation (QG)
The quasi-geostrophic (QG) approximation to (1) describes the nonlinear dynamics of the vortical mode in the absence of inertia-gravity waves. The QG model was conceived for mid-latitude, large-scale motions in the atmosphere and oceans, evolving on time scales that are long compared to the wave periods associated with eigenvalues (3) (Charney 1948(Charney , 1971). Among its many foundational aspects, the QG approximation provides a theoretically tractable and numerically inexpensive framework for understanding the baroclinic instability, potential vorticity dynamics and geostrophic turbulence (Charney 1948(Charney , 1971Pedlosky 1982;Gill 1982;Vallis 2017).
In the geophysical fluid dynamics literature, QG is usually derived from a distinguished asymptotic limiting process Majda (2003). Main assumptions are that Fr ∼ Ro = ε → 0, where the Rossby number Ro ∝ 1/ f and the Froude number Fr ∝ 1/N are non-dimensional parameters characterizing the relative strengths of rotation and stratification. Formally, the QG model may also be derived by projection of the dynamics (4) onto the subset of vortical mode interactions (s k s p s q ) = (0 0 0), and excluding all other triad interaction types (Smith and Waleffe 2002). A mathematical proof based on fast wave averaging has rigorously established the decoupling of waves and vortical modes for ε → 0 Majda (1996, 1998); Babin et al. (2000Babin et al. ( , 1997).
From (4), QG dynamics may be written in Fourier space as where we have used the fact that vortical modes have zero frequency σ 0 = 0. Though derived as an asymptotic description of scales larger than the Rossby deformation radius (approximately 100 km in the ocean), QG serves as a 'null hypothesis', providing an important reference model to qualitatively and quantitatively assess the importance of wave-vortical interactions.

c. The Wave Turbulence Approximation (GGG)
A different reference point is provided by three-wave interactions, whose dynamics are given by Equation (8) is shorthand notation for the two equations for b + (k,t)) and b − (k,t), and the acronymn GGG stands for interactions among three gravity waves. Dynamics given by GGG can be viewed as a non-perturbative extension of wave turbulence (WT) theory, which would further restrict the triad interactions in (8) to satisfy the resonance condition σ s k (k)+σ s p (p)+σ s q (q) = 0 (Zakharov et al. 1992;Newell and Rumpf 2011;Nazarenko 2011).
Wave turbulence theory has been used as a model to understand the Garrett-Munk internal wave spectrum (Mc-Comas and Bretherton 1977;Lvov et al. , 2010. By including all three-wave interactions, additional physics is captured by (8), including the generation of vertically sheared horizontal flows (Remmel et al. , 2014Lvov et al. 2012;Gamba et al. 2020), which is excluded by purely resonant interactions. Threewave interactions are expected to be important for scalar transport in the vertical direction.
d. Dynamics in the Absence of Three-wave Interactions (P2G) The remaining model studied in this work is the projection of (1) onto the set of triad interactions that includes at least one vortical mode, given in k-space by the equations: where (11) is shorthand for the two equations for b + (k,t) and b − (k,t), and the fields u and θ are recovered from the expansion (5). This model was named P2G in Remmel (2010) to indicate that triad interactions may contain up to two gravity waves (hence 2G). The dynamics of P2G allow for wave-vortical interactions to modify purely QG dynamics. In previous studies in 2π-periodic domains with aspect ratio 1 and Fr = Ro ≈ 0.1, the wave-vortical interactions were shown to be responsible for asymmetry between cyclones and anticyclones, which asymmetry is absent in QG dynamics alone (Hernández-Dueñas et al. 2014). Furthermore, the size of P2G large-scale vortices was observed to be smaller than the vortices of QG (Remmel and Smith 2009;Hernández-Dueñas et al. 2014), and hence closer to the size of vortices generated by full dynamics. The smaller size of P2G and Boussinesq vortices is linked to a modification of the QG inverse cascade by wave-vortical interactions.
Here we investigate how such dynamical effects of wavevortical interactions change scalar transport, and in particular the effective horizontal diffusivities, in a numerical setup and parameter regime that is relevant to the ocean sub-mesoscales.

Numerical Setup
a. Reduced N/ f regime Correctly simulating submesoscale lateral dispersion requires including simultaneously length scales that span several orders of magnitude over periods of several days, with sufficiently small timesteps to resolve the buoyancy frequency. To circumvent this computational hurdle, our simulations are performed in a regime dynamically similar to typical midlatitude upper ocean conditions but numerically more tractable. We accomplish this by increasing the Coriolis frequency f while decreasing the horizontal scale L by the same factor. Buoyancy frequency N and vertical length scale h remain fixed. Therefore, h/L is increased while N/ f is decreased to preserve the Burger number Bu = (Nh/ f L) 2 and the underlying dynamics of the flow. This technique has proven particularly useful in a number of studies similar to this one where model spin-ups from rest can take hundreds of inertial periods Sundermeyer and Lelong 2005;Brunner-Suzuki et al. 2012, 2014. Sensitivity studies have shown that for simulation times comparable to the ones presented here, results are not significantly impacted by a reduction of N/ f (Lelong and Dunkerton 1998). The results presented in the next section were all obtained with horizontal lengths scaled down by a factor of 10, and f increased by a factor of 10. While all our simulations are performed with reduced N/ f , a rescaling of horizontal and time scales will enable us to relate our results to a realistic upper ocean regime (Section 5).

All simulations are performed in a domain
where L x = L y = 500 m, L z = 50 m, with aspect ratio L z /L x = 1/10. The Coriolis frequency is f = 9.47 × 10 −4 s −1 and the Brunt-Väisäla frequency is N = 9.47 × 10 −3 s −1 . The characteristic timescale is the inertial period, τ = 2π/ f = 6634.8 s = 1.84 h. The density background with constant stratification ρ o − Bz is given by the reference value ρ o = 1024 kg m −3 , and the density stratification B = 0.0094 kg m −4 . The entire list of parameters and definitions is given in Table 1. The models use hyper-(hypo-) viscosity at the smallest (largest) scales to control energy. Details of their implementation are included in Appendix A1.

c. Forcing
The flows in the four models are spun up from rest in identical fashion, with sustained forcing designed to mimic intermittent wave breaking in the ocean. Density anomalies, introduced periodically in the domain at random locations, represent the parameterized end-states of wave breaking events at the stage following isopycnal overturning, when localized mixing has occurred. Each wave breaking event is assumed to produce a perfectly mixed patch of fluid. These patches are out of equilibrium with the linear background density and undergo cyclo-geostrophic adjustment, producing S-vortex structures comprised of a central anticyclone flanked above and below by two weaker cyclones (Morel and McWilliams 1997). In addition to the spin-up of vortices, each adjustment event also excites a radiating internal wave field. Following Sundermeyer and Lelong (2005), each anomaly is computed as the solution of an initial/boundary value problem That is, the initial density is the background state with constant stratification, and we impose periodic boundary conditions on the fluctuation ρ f = ρ f − ρ o + Bz. Furthermore, the coefficient κ z is a Gaussian function given by where the density is nearly constant. Using the characteristic scales r h and r z , the Burger number based on the forcing is d. Dye and particle diffusivities The dye concentration C is governed by, Equation (15) is solved with a positivity-preserving central-upwind scheme that employs a vanishing viscosity for numerical stability (Kurganov et al. 2001). The initial condition consists of a strip oriented along the y-direction, in the middle of the domain, where c o = 1 corresponds to the maximum initial concentration in units of parts per volume. The horizontal diffusivity is therefore computed only in the cross-streak direction, x. The x-component of the center of mass of the dye is given by, , and the dispersion in the x-direction as The horizontal dye diffusivity is then defined in terms of the time derivative of the second moment as, An effective diffusivity can also be defined from the pairwise separation of Lagrangian particles in time. To compare with dye diffusivity, we consider only the relative particle separation in the cross-streak x-direction. In this case, where is the mean-square particle separation in x, averaged over all unique pairs of N particles. A comparison of κ d H and κ p H will help distinguish mechanisms that rely on diffusive effects, i.e. wave-or vortical-driven shear dispersion, from those that are purely kinematic and driven by nonlinear advective forces. Here, we do not consider spatial metrics for diffusivity such as Finite Size Lyapunov Exponents (FSLE) because they are not useful in regimes with significant wave energy.

Results
We now consider the effects of the different wave/vortical interactions and their impact on lateral diffusivity by contrasting the evolution of a dye streak and Lagrangian particles in the Boussinesq, P2G, QG and GGG models described in the previous section. The spin-up to statistical equilibrium and the corresponding physical fields are first presented, followed by identification of the dominant length scales in each model, a discussion of energy spectra and their implications for local and nonlocal dispersion. The dispersion characteristics of the dye and particles are contrasted to isolate the effects of shear dispersion and diffusivities as a function of scale are computed for each model. Finally, we rescale our results, obtained in a reduced N/ f regime, to a realistic upper ocean regime.

a. Energetics
The quasi-steady state of the flow entails a delicate balance between forcing, energy fluxes and hypo/hyper viscous forces. The vortical and wave energies are defined as the corresponding quantities given by the vortical and wave projections of the solution respectively, as done in equation (A1)  inertial periods, at which time dye and particles are placed in the center of the domain. The models are then run for another 50 inertial periods. Steady-state vortical energies (upper panel) in Boussinesq, P2G and QG models are comparable. Wave energy (bottom panel) in the P2G model is slightly higher than in the Boussinesq system, and significantly higher in the GGG model than in the other cases. The latter may be explained by the accumulation of energy in the vertically sheared horizontal flows (VSHF modes). These modes are not as energetic in Boussinesq and P2G, presumably because inclusion of wave/vortex interactions provides additional forwardcascading pathways. VSHF modes will be discussed in more detail in a later section.
b. Physical-space potential vorticity fields A 3D view of the steady-state Boussinesq flow at the time of dye and particle injection is shown in Fig. 2.   FIG. 2. 3D view of the iso-surface PV −1 (0.5 × 10 −5 kg m −3 m −1 s −1 ) of linear potential vorticity for Boussinesq model (red iso-surface). Also shown are the velocity field (cyan arrows) near the vortices and PV -contours at the bottom and two of the side walls. The colorbar is in units of kg m −3 m −1 s −1 and corresponds to the PV -contours at the walls. Outside the box, an insert shows a zoomed-in vortex with the corresponding velocity field around it. The red iso-surface corresponds to anticyclones. The vortices are visualized by plotting the iso-surface (red) of linear potential vorticity (PV) given by where ω = ∇ × u is the relative vorticity. The isosurfaces in Fig. 2 correspond to the value PV = 0.5 × 10 −5 kgm −3 m −1 s −1 , and the nearby velocity field is indicated by cyan arrows. The 3D view of the fluid reveals the horizontal and vertical distribution of the vortices with thin vortices staggered throughout the water column. Also shown at the bottom and at the walls are contours of the linear PV, giving us a hint of the vertical structure. The colorbar is in units of kg m −4 s −1 and corresponds to the PV contours at the walls. Outside the box, an inset shows a zoomed-in vortex along with the neighboring velocity field.
A comparison of the vortex fields in the different models is made by examining horizontal PV contours, as illustrated in Fig. 3, at z = 25 m. In the Boussinesq model (top left), the fluid motion is dominated by anticyclonic vortices. Inertia-gravity waves, with no organized signature on PV, are also present in regions between coherent vortex cores. In contrast, the results obtained with the QG model

c. Potential vorticity centroid
A quantity of interest is the linear PV -centroid, which at time t, is defined as where PV k is the Fourier coefficient of (22) at wavevector k. The stretching factor ( f /N) 2 is motivated by the dynamical equation for linear PV (Vallis 2017), and accounts for the small aspect ratio. In strongly rotating fluids, (23) is a quantity associated with the wavenumber of emerging PV-vortices, with corresponding lengthscale The models QG, P2G and FB allow for inverse transfer of energy to large-scale PV-vortices via 3 vortical-mode interactions, which in our simulations is arrested by hypoviscosity at the smallest wavenumbers. Here we present statistics to measure the size of the PV-vortices in the different models. The solution at time t = 200τ shows a lengthscale given by the PV -centroid of L PV = 30.33 m for Boussinesq, L PV = 14.84 m for P2G, and L PV = 48.92 m for QG. The lengthscale L PV is larger in QG than in the other models, consistent with the physical fields in Fig. 3, and with the QG inverse energy cascade resulting from 3 vortical-mode interactions. In Boussinesq and P2G, both upscale and downscale energy cascades are present to varying degrees (see Section 4e). Compared to QG, smaller values of L PV in the Boussinesq and P2G models indicate the presence of smaller cyclones (blue in Fig. 3), with L PV in P2G about one third of the QG lengthscale.

d. Vertically sheared horizontal flows
Vertically sheared horizontal flows (VSHF) (e.g. Smith and Waleffe 2002;Fitzgerald and Farrell 2018a,b) correspond to the k h = 0 wave modes and are of special interest in determining whether the mechanism of internalwave shear dispersion contributes significantly to horizontal diffusivity. Vertical cross-sections x − z of meridional velocity v at the time of dye and particle injection are displayed in Fig. 4 for Boussinesq and GGG models. The presence of layers in GGG, with corresponding shear on vertical scales of ≈ 10 m, shows that VSHF are a dominant component in the waves-only model. In contrast, there is scant evidence of thin layering or strong vertical shear in the Boussinesq model. The coupling of vertical diffusivity with strong VSHF vertical shear may act to enhance horizontal diffusivity. The contribution of this mechanism will be addressed in Section 4g through a comparison of rates of dispersion for diffusive dye and non-diffusive particles.
By construction, the forcing by density anomalies described by (12) has zero VSHF energy. Therefore, all VSHF energy results from nonlinear interactions (for more details, see Smith and Waleffe 2002;Waite and Bartello 2006;Laval et al. 2003;Remmel et al. 2010Remmel et al. , 2014Fitzgerald and Farrell 2018a,b). Integrated over the time period 0 ≤ t ≤ τ, the total VSHF energy is 1.22 × 10 −8 m 2 s −2 for FB, 3.81 × 10 −9 m 2 s −2 for P2G, and 3.34 × 10 −7 m 2 s −2 for GGG, clearly indicating dominance of VSHF in GGG compared to the other models. As described in the next section and illustrated by Fig. 5, spectral analysis shows that the GGG VSHF are most pronounced at large horizontal scales and intermediate vertical scales, consistent with the GGG layering observed in the bottom panel of Fig. 4.

e. Kinetic Energy Spectra
Further insight into the differences between the models is provided by examining energy spectra. Here we focus on kinetic energy (Fig. 5) to inform the discussion on scalar transport, though complementary information can be extracted from potential energy.   ) as a function of nondimensional horizontal (left) and vertical (right) wavenumber. In each panel, the energy is split into vortical (solid lines) and wave (dashed lines) for the FB (blue), P2G (green), QG (black) and GGG (red) models. The anomaly spectra (dotted line) and the forcing range of scales (horizontal dashed line) are also shown. Horizontal and vertical nondimensional wavenumbers are scaled with ∆k h and ∆k z respectively (see Appendix A1 for details). The wavelength in meters is also provided along the upper axes.
The left panel of Fig. 5 shows kinetic energy spectra as a function of horizontal wavenumber k h , averaged over vertical wavenumber k z , and projected onto wave and vortical components. Similarly, the right panel indicates the spectral dependence on vertical wavenumber k z , averaged over horizontal wavenumber k h . Horizontal (vertical) wavelength λ h (λ z ) in meters is displayed on the top left (right) axis, and we remind the reader that oceanic values for wavelengths may be extrapolated from multiplication by a factor of ten. The forcing spectra are also shown (dotted lines on each panel), where the forcing energy saturates through random injections of density anomalies as described in Section 3c. Notice that the impact of the anomaly forcing is strongest for horizontal scales 50 m < λ h < 100 m and vertical scales 10 m < λ z < 15 m.
For small scales in our numerical set-up and parameter regime, both panels of Fig. 5 show the dominance of wavemode kinetic energy (dashed lines) over vortical-mode kinetic energy (solid lines). When waves are present (FB, P2G and GGG), one sees that the horizontal wave spectra are surprisingly robust in the forward transfer range 10 < k h < 70 (8 m < λ h < 100 m), with rough scaling k −2.5 h (dashed lines on the left panel). At large horizontal scales λ h > 100 m, the GGG wave energy is dramatically elevated owing to inverse transfer into large-scale VSHF modes, as visualized in Fig. 4.
The wave kinetic energy is also dominant for vertical wavenumbers k z > 3 corresponding to wavelengths λ z < 10 m (dashed lines on the right panel of Fig. 5). For these small vertical scales, a notable feature of the figure is the striking amplitude difference between the 'truth' model FB (blue dashes) and the waves-only model GGG (green dashes). Thus one can see that wave-vortical energy exchanges are especially important for establishing accurate wave-energy levels in vertical small scales.
On the other hand, when the vortical mode is present (FB, P2G and QG), the vortical-mode kinetic energy dominates at large horizontal scales λ h > 50 m and large vertical scales λ z > 10 m (solid lines on both panels of Fig. 5). Large-scale, linear PV-vortices are generated by vorticalmode triad interactions, as visualized in Fig. 3. Compared to QG, the horizontal slices in Fig. 3 corresponding to FB and P2G exhibit smaller-scale cyclones, and also the appearance of a granular background which is mostly absent in QG. The small-scale features are attributed to the presence of waves in FB and P2G, manifested in both high-amplitude wave spectra at small scales, and higheramplitude vortical mode spectra at small scales (Fig. 5).
Through wave-vortical interactions, the waves in FB and P2G have the effect of increasing vortical-mode kinetic energy at small-scales (blue and green solid lines in both panels of Fig. 5), compared to QG (black solid lines). Along with shallower vortical-mode spectra, smaller-scale PV-cyclones can be seen in Fig. 3 for FB (blue solid) and P2G (green solid). The steep vortical-mode spectrum of QG (black solid) is associated with non-local mixing by large-scale vortices, while the shallower spectra of FB and P2G are linked with local mixing by the smaller-scale vortices (Bennett 1984;Beron-Vera and LaCasce 2016). These observations will be connected to our results for model dispersion and model diffusivity in Section 4f (see Figs. 8 and 9).
Of the three models with vortical modes, P2G has the highest vortical-mode energy in small scales (green solid lines on both panels of Fig. 5), consistent with the smallscale cyclones observed in the horizontal slice of P2G PV (the P2G panel of Fig. 3). Further understanding is provided by analysis of kinetic energy transfer in vorticalwave-wave (V-WW) triads, which exist only in the FB and P2G models. Computations show that non-resonant V-WW interactions transfer kinetic energy into the vortical mode, as seen in Fig. 6 (blue for FB and green for P2G).
To focus on energy transfer into small horizontal scales, Fig. 6 displays variance-preserving V-WW transfer spectra as a function of horizontal wavenumber k h , and scaled by k h (e.g Thomson and Emery 2014). An analogous plot (not shown) produces similar behavior of V-WW transfer spectra as a function of vertical wavenumber k z . One can see that V-WW energy transfer into small-scale vortical modes is strongest in P2G, in agreement with the higherlevel of small-scale vortical energy in Figs. 3 and 5. Since W-WW interactions do not exist in P2G, the V-WW interactions must absorb more forward energy transfer, compared with FB containing both V-WW and W-WW. We Variance-preserving kinetic energy transfer spectra k h T kin o,±,± /E as computed in equation (A10) for the Vortical -Wave Wave triad interactions as a function of horizontal wavenumber for Boussinesq and P2G models. The transfer spectra are averaged over 20 inertial periods and scaled by the steady-state energy E of each model. finish this section with a summary of observations regarding the structures and statistics arising from wave-modes, vortical-modes and their nonlinear interactions. These observations have implications for horizontal scalar transport in the different models. First, the size of the steady-state vortices determines whether dispersion is predominantly nonlocal (as in QG), or has a significant local component (as in P2G). These steady state vortices arise in models with the vortical mode present, and their size is determined by spectral transfer to large and small scales by nonlinear interactions involving vortical modes. Second, waves by themselves generate predominantly VSHF rather than vortices, but they are essential to establish the size of the vortices in the truth model FB. Without waves altogether, the vortices are too big (QG), but excluding only 3-wave interactions, the vortices are too small (P2G). We will see that the wave-turbulence model GGG with only 3-wave interactions does not capture horizontal dispersion, and yet the 3-wave interactions cannot be neglected because of their indirect effect on steady-state vortex size, and thus on the accurate prediction of horizontal dispersion.
f. Passive scalar and particle transport: effective diffusivities We now compare the dispersion and corresponding diffusivities of the dye streak and particles in each model.
Unlike dye, Lagrangian particles are non-diffusive. To distinguish diffusion-dependent dispersion, 10,000 particles are placed concurrently at random locations near the dye strip. Individual particles with position x = (x(t), y(t), z(t)) satisfy the equation of motion Particles reaching the boundaries are extended out of the periodic domain to avoid discontinuous trajectories. However, particle-pair separations used in the diffusivity computations lie in the range [0, L x /2]. The extension technique cannot be applied to the dye: once the dye concentration becomes homogeneous throughout the entire domain, the rate of change of dispersion i.e. the diffusivity, goes to zero. Dye contours superimposed with the trajectories of a few particles are shown in Fig. 7 at time t = 10τ (10 inertial periods) after injection. The particle trajectories are plotted with random colors (not related to the colorbar) to distinguish them from one another. One can observe from the GGG plot (lower right) that wave-wave-wave interactions by themselves induce very weak dispersion both for the particles and the dye. Particle trajectories are oscillatory, with inertial period (not shown). The QG model with only vortical interactions produces large vortices which trap dye and particles, reducing dispersion. In contrast, interactions between both wave and vortical modes in FB and P2G (top panels) result in increased horizontal dispersion. The dye in FB occupies a greater fraction of the domain than in P2G. In both P2G and FB, the presence of smaller coherent and filamentary structures act to increase the area covered by the dye and particles, compared to QG.

g. Impact of shear dispersion
A comparison of diffusivities (proportional to the rates of dye and particle dispersion) for the four models is shown in Fig. 8.
The contrast between horizontal diffusivities from dye and particles at the same spatial scales, illustrated by the slopes of the black line segments, provides an assessment of the importance of shear dispersion in the different models. If shear dispersion is present, the slope of the dye dispersion will be larger. In the FB model, the difference in dye and particle slopes is most pronounced at small scales. This is to be expected with shear dispersion since this mechanism involves a coupling between vertical shear which is strongest at small scales, and vertical diffusivity which acts on the smallest resolved scales. At the 50 m scale, the slopes differ by about 45 • . Therefore, we estimate that shear dispersion accounts for roughly half of the diffusivity in FB at that scale. The difference in dye and particle dispersion slopes is smaller at 100 m, and suggests that shear dispersion is still present, but not as important as at smaller scales. In P2G, devoid of wavetriad interactions, dye and particle diffusivities are comparable in the range of scales from 50 m to slightly less than 100 m, indicating the absence of shear dispersion. The QG model exhibits relatively low rates of dispersion compared to FB and P2G, with comparable dye and particle dispersion rates at the 50 m scale. Therefore, vortical- mode shear dispersion is not active at this scale. In FB and QG, dispersion slopes clearly diverge over time, but diffusivity estimates cannot be extended much beyond 25 inertial periods due to domain periodicity constraints. In GGG, the particle diffusivity is negligible. Therefore, internal-wave Stokes drift is not present. In the absence of vortical modes, diffusivity is only detected in the dye and, therefore, is attributable entirely to shear dispersion. It is important to note that the GGG dispersion rate is weaker by an order of magnitude from the other models.
Since FB and GGG are the only models showing signs of shear dispersion, we conclude that wave-wave-wave interactions are a necessary component for this mechanism. However, dispersion due to advection by vortical modes in the FB model also contributes significantly.
Dispersion characteristics seen in Fig. 7 help explain the differences in the different models. In the absence of waves, dye and particles remain tightly bound within the large vortex structures, resulting in stirring rather than mixing. This is consistent with the non-local behavior of QG dispersion: The QG KE spectral slope is greater than -3, signaling non-local dispersion and a deficit in smallscale vortical energy (Fig. 5). In Boussinesq and P2G, with shallower KE spectra, dispersion is a combination of local and nonlocal: The presence of smaller-scale vortices is responsible for the enhanced dispersion rates in these models, compared to QG. Moreover, the additional oscillatory wave component acts to perturb the smooth vortex streamlines and further enhances the dispersion.

h. Scale dependence of relative diffusivity
To obtain the scale dependence of each model, the relative diffusivity κ p H was computed from the pairwise relative separation of the particles for the restricted range of scales spanned by [0, L x /2]. The particles were placed in 25 vertical levels according to their initial vertical position in the domain, then binned as a function of initial pair separation. The relative diffusivity in each bin was computed by taking the temporal mean over all pairs. Scale dependence of diffusivity in each model is plotted in Fig. 9. All model diffusivities exhibit some scale dependence in the 0 − 100 m range, with close to linear dependence for QG. Beyond 100 m, the diffusivities are mostly scaleindependent and asymptote to values that can be related to the absolute diffusivity, the so-called diffusive regime (LaCasce 2008; Beron-Vera and LaCasce 2016). P2G exhibits the highest diffusivity, followed by FB and QG. The diffusivity in GGG is the weakest, lower by an order of magnitude from the other model diffusivities.
i. Implications for the N/ f = 100 regime Results obtained with reduced N/ f = 10 are related to the N/ f = 100 upper-ocean regime by multiplying horizontal scales by 10 and dividing the temporal scale by 10. Hence the diffusivities plotted in Fig. 9 correspond to scales up to 2.5 km and range in value in the scaleindependent regime from 0.35 ± 0.01 m 2 s −1 for QG to 0.6 ± 0.05 m 2 s −1 for P2G, with the FB value in between at 0.43 ± 0.01 m 2 s −1 . Considering FB to represent the truth, we find that the excess of small-scale vortical motions in the P2G model overestimates the diffusivity while the vortical-mode deficit in QG underestimates it. A value of O(0.5)m 2 /s is in the range reported in regions with weak mesoscale activity, e.g. during the LatMix 2011 summertime experiment in the Sargasso Sea (Shcherbina et al. 2015).

Discussion and Conclusions
We have examined the role of waves and vortical modes in influencing horizontal diffusivity on O(1) km scales in the ocean by performing simulations in reduced N/ f conditions, with intermediate, reduced-interaction models. These models have proven useful in isolating the contribution of waves and vortical modes to lateral dispersion. A comparison of dye and Lagrangian particle rates of dispersion has further helped identify the mechanisms that are active in this range of spatial scales.
Our study focused on wave-dominated flows (Bu=4) which are characteristic of much of the ocean away from strong mesoscale activity. In this regime, our simulations have demonstrated that while vortical motions are primarily responsible for the observed passive tracer and Lagrangian particle dispersion patterns, the impact of waves cannot be discounted. We find that waves play an indirect but non-negligible role since their presence helps establish the spectral distribution of steady-state vortical-mode fields. In flows devoid of waves (QG), the vortices are large compared to those in the truth model (FB) while in the absence of wave-triads (P2G), the vortical flow has too much small-scale structure. Waves also contribute to lateral dispersion by transferring energy via wave-triad interactions to VSHF, an essential ingredient for internal-wave shear dispersion. The indirect role of internal waves on lateral dispersion through their influence in shaping the vortical-mode spectrum is generally consistent with the conclusions of Sinha et al. (2019). These authors report that filtering out inertia-gravity waves significantly underestimates the lateral dispersion at submesoscales, with little impact on mesoscale dispersion.
A comparison of diffusivities computed from dye and Lagrangian particle deployments demonstrates that in wave-only flows, the effective diffusivity is due solely to internal-wave shear dispersion. This mechanism is, however, inefficient and leads to very weak effective diffusivities. We also do not find any evidence of internal-wavedriven Stokes drift, as evidenced by the lack of dispersion of Lagrangian particles.
In QG, lateral dispersion is shown to be nonlocal, controlled by advection with no evidence of vortical-mode shear dispersion. In P2G and FB, we find that the dispersion depends on the full spectrum of vortical motions, influenced not only by the largest vortices, but enhanced by the presence of smaller vortices. Dispersion by vortical modes in the presence of waves is local (nonlocal) at small (large) scales. Our interpretation of local/nonlocal behavior is only based on the spectral slopes of the vortical kinetic energy spectrum (Fig. 5). The diffusivities in Fig. 9 do not exhibit the power-law dependence on scale predicted by Bennett (1984) for local and nonlocal regimes. This is perhaps not surprising since Bennett's theory was not developed for flows with waves for which many of the statistical metrics used in establishing the scale dependence are not useful.
Internal-wave shear dispersion was present in the Boussinesq and GGG simulations, but absent in P2G, the model which excludes wave-triads. Therefore the wavetriad appears to be a necessary ingredient for shear dispersion, presumably because it is responsible for transferring energy to vertically sheared horizontal flows.
Large diffusivities due to spurious generation of PV in Boussinesq numerical simulations initialized with a broadband internal wave field (Bühler et al. 2013) and a Garrett-Munk spectrum (Sundermeyer et al. 2020) have been reported. In these two studies, the generation of vortical modes (PV) by dissipative forces across a broad range of scales is found to dominate the dispersion. Unphysical production of PV at the grid scale in Large Eddy Simulations is also reported by Bodner and Fox-Kemper (2020). We did not find evidence of numerical vortical mode production in the waves-only (GGG) model or in any of the other models.
Finally, we recognize that the range of spatial scales considered in this study is limited. Future directions will include performing simulations in larger domains capable of encompassing a greater range of scales, maintaining the resolution and gradually increasing the internalwave frequency band N/ f to include a broader internal wave spectrum in order to assess the impact of this parameter. Another promising line of research extends the constant-stratification wave/vortex decomposition used in the present study to cases with arbitrary stratification Early et al. (2020). Numerical implementation of this generalized wave-vortex decomposition will allow the types of simulations presented here to be extended to vertically bounded domains with more realistic density profiles. APPENDIX A1. Definition of energy spectra, hypo-and hyper-viscosity a. Definition of energy spectra Each vector function (u, θ ) with divergence-free velocity can be decomposed as in equation (5). Based on that decomposition, we can define the projection of the solution into the vortical and wave modes respectively as with a s k (k,t) := b s k (k,t)e −iσ s k (k) = (û(k,t),θ (k,t)) · φ s k (k), s k = 0, ±, wheref = F ( f ) is the discrete Fourier transform.
Here, the eigenfunctions are given by where σ s k (k) is given by equation (3), σ = |σ ± (k)|, k 2 h = k 2 x + k 2 y , and k 2 = k 2 x + k 2 y + k 2 z . The horizontal and vertical shells in Fourier space at horizontal wavenumberk h -th and vertical wavenumerk z are the sets S h (k h ) = k ∈ ∆k x Z × ∆k y Z × ∆k z Z and S v (k z ) = k = (k x , k y , k z ) ∈ ∆k x Z × ∆k y Z × ∆k z Z :k z − 1 2 ∆k z ≤ |k z | ≤k z + 1 2 ∆k z , where ∆k h = 2π/L h , L h = 500 m, ∆k z = 2π/L z , L z = 50 m. The spectra as a function of horizontal and vertical wave number are computed as the graphs in log-log scale of the following quantities. The vortical kinetic energy as a function of horizontal (E 0 h,kin (k h )) and vertical (E 0 v,kin (k z )) wavenumbers, and the wave kinetic energy as a function of horizontal (E ± h,kin (k h )) and vertical (E ± v,kin (k z )) are defined respectively as (A7) Similarly, the spectra of the potential energy are defined as (A8)

b. Energy transfer spectra
In the absence of hypo-and hyper-viscosity, the kinetic energy e kin = 1 2 u 2 + 1 2 v 2 + 1 2 w 2 satisfies the equation where , L 2 is the L 2 normalized inner product in Fourier space. The first three terms in equation (A9) correspond to energy transfer terms and can be associated to triads as follows. Given a triad (s 1 , s 2 , s 3 ) where each s j corresponds to either a vortical (0) or wave (±) mode, the kinetic energy transfer from (s 2 , s 3 ) interactions into the s 1 mode at wavenumber k is quantified as (A10) T kin (s 1 ,s 2 ,s 3 ) (k) = −û s 1 (k)F (∇ · (u s 2 u s 3 )) (k) −v s 1 (k)F (∇ · (v s 2 u) s 3 ) (k) −ŵ s 1 (k)F (∇ · (w s 2 u s 3 )) (k), where the superscripts indicate the projections in equation (A1).

c. Hypo-and hyper-viscosity
The hyper-viscosity and hyper-diffusion terms of the form (A11) have been applied to equations (1a) and (1b), respectively, to maintain stability and to provide a sink of energy within a localized dissipation range at small scales. Here the order of the hyper-viscosity is d = 8. The hyper-viscosity coefficients are computed based on resolution as where k h,max = 84 2π 500 m and k z,max = 42 2π 500 m are the maximum horizontal and vertical wavenumbers used after dealising.
On the other hand, a hypo-viscosity is also applied, arresting the inverse transfer of energy to the largest scales and thereby allowing the system to reach a statistically steady state at long times (e.g., Danilov and Gurarie 2004). The hypo-viscosity and hypo-diffusion terms of the form The hyper-viscosity (A11) and hypo-viscosity (A13) are active in relatively narrow wavenumber bands at opposite ends of the energy spectrum, and thus maximize the range of simulation wavenumbers dominated by nonlinear effects in statistically steady state.