A theory of spontaneous tropical cyclogenesis from quasi-random convection

How the cumulus clouds organize into a tropical cyclone remains poorly understood. The difficulty lies in that the deep convection is noisy at the kilometer scale, but follows the physical feedbacks at the mesoscale. We build a barotropic numerical model to understand the interaction of the stochastic and deterministic processes in the genesis of a tropical depression. Deep convection is represented as a multitude of isolated convergence forcing. The convection is assigned to distribute randomly at the small scale. At the mesoscale, convection is preferentially seeded to regions with a high spatially-filtered vertical vorticity. The preferential seeding mimics the physical feedbacks, and the filter implicitly represents the nonlocal convective triggering by gravity wave and cold pool. The result shows that the early-stage evolution is dominated by random vortex tube stretching. Subsequently, the regions where repetitive stretching occurs become vortex clusters, and induce more convection around them. The collision and coalescence between vortex clusters lead to a major vortex, which accelerates the growth by the preferential seeding. This physical picture agrees with a cloud-permitting simulation of spontaneous tropical cyclogenesis over uniform sea surface temperature. A theoretical model with approximate analytical solution is presented to depict the full evolution process. 7


Introduction
Convection is stochastic at the small scale ( 10 km) and in quasi-equilibrium with the environ-27 ment at the large scale ( 10 3 km) where a sufficient amount of clouds is embedded. However, 28 tropical cyclogenesis requires clouds to spontaneously organize via the 10 ∼ 10 3 km physical 29 feedback of convection with Ekman pumping (Charney and Eliassen 1964), wind-induced surface 30 flux (Emanuel 1986;Raymond et al. 2007), moisture (Montgomery et al. 2006;Dunkerton et al. radiative cooling, and drives a secondary circulation. On one hand, the ascending branch of the 48 scale at which the physical feedbacks have a deterministic signal, though substantial fluctuation 126 can occur due to the limited number of clouds. For tropical cyclogenesis, we judge that the "statistical mesoscale" roughly overlaps with the "dynamical mesoscale" where the stratification is 128 strong and the background rotation is moderate (not too strong to suppress the momentum advec-129 tion) (Riley and Lelong 2000). This is illustrated in Fig. 1. The strong stratification makes the 130 gravity wave adjustment fast enough and guarantees the weak temperature gradient approximation.
Here ∇ h = i∂ /∂ x + j∂ /∂ y is the horizontal gradient operator, ω is vertical relative vorticity, u 136 is horizontal velocity, u δ is the divergent flow, u ω is the rotational flow, δ is the divergence, δ u 137 is the divergence due to deep convective updraft (negative value), δ rad is the divergence due to 138 radiative cooling (positive value) and δ E is the divergence due to Ekman pumping or suction.

139
Here we follow Sansón and Van Heijst (2000) to consider the stretching/squashing of the total 140 absolute vorticity in Ekman spin down, rather than the classic linear formula that only considers 141 the planetary vorticity part (Vallis 2017).

142
The deep convection entrains lower and mid tropospheric mass and adds mass to the upper level. 143 We represent it with a quasi-random seeding of convergence pulses, which is an update from the to get an updraft be measured by updraft number density ρ u (x,t), which will be discussed shortly 151 afterwards. The expression of δ u is contributed by multiple updrafts: where δ um (negative constant) is the peak divergence of an updraft, x n is the position vector of 153 the n th updraft which follows the wind, t n is the peak time of the n th updraft, r u is the updraft 154 radius scale, and τ u is the updraft duration time scale. The averaged accumulated convergence 155 within the radius r u is defined as ∆h/H = δ um τ u , which takes a negative value. The ∆h (negative 156 value) is the air column thickness loss in a convective event, and H ∼ 5 km is the basic state layer 157 thickness. The δ rad is set to be uniform, which instantaneously balances all the updraft divergence 158 synchronously: Here δ 0 is the domain averaged updraft-induced divergence (negative), around which −δ rad fluc-160 tuates. The δ rad can be linked to radiative cooling rate Q rad (unit: K s −1 ): where ∆θ is the potential temperature difference across the 1-6 km height layer. The δ E is calcu-162 lated with the simple laminar Ekman layer model: Here E is Ekman number which represents the vorticity-divergence (spin down) relation, and h E 164 is the Ekman layer depth. Equation (8) is less accurate than the Ekman layer model that uses bulk 165 aerodynamic formula (e.g. Schecter and Dunkerton 2009), but is more analytically tractable. 166 We design a quasi-stochastic convective scheme, where the local cloud population depends on 167 the mesoscale vorticity. The anomalous convection caused by the large-scale curl-free flow, such 168 as equatorial Kelvin wave (e.g. Yang and Ingersoll 2013), is not considered here. The convection, 169 which peaks at 2.3τ u after the seeding (to make convergence start from close to zero), depends on 170 the spatio-temporal updraft number density ρ u (x,t) (unit: m −2 s −1 ): The maximum operator is used to keep ρ u positive, but the threshold is hardly reached until the 172 very late stage. Here ρ u0 is a constant basic state seeding density which balances the radiative 173 cooling: As δ u is proportional to ρ u , they are linked via a rescaling: The η in (9) is a fixed nondimensional feedback parameter that measures the sensitivity of ρ u to 176 the spatially-filtered vorticity (denoted as ω). Substituting (6), (8), and (9) into (2), we see that the 177 δ as a random field obeys: The "∼" in (12) means δ as a random variable obeys a distribution, which is the relation used in 179 the numerical simulation. The "≈" in (12) is a further simplification that assumes the Ekman spin 180 down to work at the filtered scale, which will only be used in the theoretical model in section 3.

181
A schematic diagram of this quasi-random seeding scheme is shown in Fig. 2. The general 182 probabilistic relationship between convection and vorticity was noticed by Ritchie and Holland 183 (1997) using observational data. Equation (12)  to the conditional instability of the second kind (CISK) (Montgomery and Smith 2014), but with 192 a nonlocal and quasi-random modification. As deep convection is the source of both vapor and 193 vertical vorticity, we consider the ω in this model to partly represent moisture. Thus, the positive 194 moisture and radiation feedback is implicitly included, and can be considered to modulate η. 195 We use a Gaussian filter operator to subtract the mesoscale field, because it only uses neighbor-196 ing information and yields good math property: Here l is a prescribed filter length scale, which represents the nonlocal trigger length scale and is 198 viewed as the mesoscale in this paper. One property is that filtering twice has a stronger smoothing 199 effect than filtering once. Physically, it means the convective region has a tendency to laterally 200 expand due to the nonlocal trigger by cold pool and gravity wave. In appendix A, we show that the 201 system length is a balance between the lateral expansion and the growth in magnitude when the 202 convergent flow is weak (ω f 0 ), and a balance between the lateral expansion and the convergence 203 when the convergent flow is strong (ω f 0 ). Both balances lead to a flow characteristic length 204 scale of l. In addition, the merger tendency of eddies renders an "eddy tension" that makes the 205 scale shrink. It is shown that this effect makes the scale decrease with increasing −∆h/H. As

206
we have not figured out a method to quantify the "eddy tension", l is regarded as the system 207 length scale throughout the theoretical model in section 3. Note that in the limit of l L, the 208 quasi-random scheme reduces to a purely-random scheme.

209
The problem can be nondimensionalized by choosing l as the length scale and −δ −1 0 as the time 210 scale. This leads to five independent nondimensional parameters: δ 0 / f 0 , r u /l, −∆h/H, η and E.

211
The normalized domain size L/l is another potential parameter especially for the later stage when 212 there are only a few dominant vortices. size L is chosen to be smaller than Rossby deformation radius L R . In the RRCE simulations (e.g.

217
Khairoutdinov and Emanuel 2013), the compensating descent of an updraft is constrained within 218 L R distance from the source in the geostrophic adjustment, so the convection-driven vortices rarely 219 interact with each other when their distances are over L R . This is similar to the phenomenon in 220 doubly-periodic domain that a vortex can no longer merge when it is alone.

226
Using ∆θ = 20 K which makes the atmospheric lapse rate close to moist adiabatic, the δ 0 is 227 equivalent to a radiative cooling rate of Q rad ∼ δ 0 ∆θ ∼ −12.3 K day −1 . This is much larger than 228 the typical tropical value of −1 ∼ −2 K day −1 (Hartmann et al. 2001). The motivation of using 229 such a high-magnitude δ 0 and f 0 is to save computational resource by accelerating the spin up.

230
The key nondimensional parameter values for the reference test are: −δ 0 / f 0 = 0.07 (a large −δ 0 231 and large f 0 yields a moderate −δ 0 / f 0 ), r u /l = 0.133, ∆h/H = −1.6, E = 0.08, η = 1.6. In the 232 simulations, we fix L and r u , but change l, η, −∆h/H, and f 0 , one at a time. We pay more attention 233 to l which is conceptually the most important one to this quasi-random model. The l = 60 km test 234 is designated as the reference test, and the l = 30 km and l = 45 km tests are the sensitivity tests. 235 We encourage the readers to watch the movie of the simulation in the supplemental material.  wave. The clusters also merge with each other, as is more clearly seen from the ω field (Fig. 4).

251
Such a multiscale merger process has been reported in cloud-permitting simulations (Ritchie and  Then, we analyze the sensitivity test: the pure random seeding test, the l = 30 km test and the 257 l = 45 km test. Figures 5 and 6 show that the vorticity field of all tests at t 1 looks similar, with 258 smaller l leading to larger ω magnitude and more clusters. As the mesoscale feedback and vortex 259 motion is not vigorous yet, this is largely a stochastic result. The later stage evolution is quite 260 different:

261
• The purely random test remains at the sporadic vorticity stage, with ω fluctuating around 262 a quasi-equilibrium state. The Ekman spin down is strong enough to suppress any strong 263 vorticity patch (produced by chance) before it grows into a cluster by merger.

264
• When the positive feedback is included, a smaller l leads to faster local intensification due 265 to the higher ω provided by the purely random stage. However, the further merger is less 266 prominent due to the vortex clusters' relatively small size ("collision cross-section"), and the 267 system resembles a point vortex system.

268
As the vortex cluster size seems to scale as l, we track the highest ω point and calculate the 269 radial structure of the azimuthally-averaged ω (Fig. 7). The shape is confirmed to be similar, with peak ω magnitude of the vortex clusters become more widespread (e.g. Fig. 4). This reminds us 279 of the droplet spectrum widening via stochastic collision-coalescence in cloud microphysics (e.g. Finally, we check the vorticity standard deviation std(ω) (Fig. 8b Based on the analysis in section 2c, we build a theoretical model that separately treats the 289 stochastic regime and the deterministic regime. In the stochastic regime, the vorticity grows due 290 to random stretching of the randomly seeded convection against the radiative cooling and Ekman 291 spin down. A purely statistical description is sufficient. In the deterministic regime, we model the 292 mean vorticity of the vortex cluster (close to but not the same as ω), which is a mesoscale property.

293
It takes the filtered end state value of the statistical regime model as its initial condition. The movie of ω evolution shows that the first round of vortex clusters produced by random 296 stretching does not vanish. They grow steadily according to the (positive) physical feedback.

297
Thus, we use the theoretically predicted equilibrium state of the stochastic regime as the initial 298 condition of the deterministic regime model. There is a prerequisite for the regime separation: 299 when the equilibrium is just reached, the preferential seeding signal driven by the feedback should 300 be much smaller than the stochastic component. In appendix B, a criterion based on signal-to-noise 301 ratio is presented, and the separation is shown to be roughly valid for the reference test. 302 ticity quasi-equilibrium (VQE) in a damping time scale T 0 , we estimate the equilibrium field as 304 a field that evolves freely to t = T 0 . The T 0 is estimated with the Ekman spin down rate plus the 305 radiative cooling rate: Here, we have assumed that ω f 0 , as is in most of our cases. This makes the classic linear 307 Ekman spin down time scale ( f 0 E) −1 available. Equation (14) predicts −δ 0 T 0 = 0.47 for the 308 reference test, which agrees with the transition time in Fig. 8b. We estimate the characteristic 309 amplitude of the randomly-forced field with the Fourier spectrum of ω.

310
First, we show that if it is the filtered field rather than the original field that is of interest, the 311 random stretching process which involves convergence can be approximated as a geometrical ran-312 dom superposition of fixed increments. This is because, if the area of an isolated vorticity patch S 313 (surrounded by near-zero ω) is much smaller than l 2 , only its circulation change ∆Γ produced by 314 seeding a convection will influence the filtered value: The ∆Γ is proportional to the total mass sink in an updraft event. In contrast, the local vorticity 316 magnitude evolution in the updraft region grows more steeply. The local relative vorticity obeys 317 a power law due to the conservation of absolute circulation (Fu and O'Neill 2021): where m is the total number of updrafts falling to one place. To make the discussion 319 more general, we use a scalar φ to denote ω. This problem transforms to studying the filtered field 320 of the random superposition of N u Gaussian shape increments, denoted as φ n : whose length scale is r u and the peak increment Φ 0 is: The N u is the total number of updrafts within T 0 time inside the domain, which is estimated with 323 (10): The finite-domain Fourier transform of φ n is denoted asφ n : for wavenumbers: In (19), we have used the infinite-domain Fourier transform of Gaussian function to approximate 327 the finite-domain transform, which is valid due to r u L. The parameter γ = 4π 2 /L 2 in (19) is a 328 conversion coefficient.

329
Second, we show that the power spectrum of the randomly superposed field φ = ∑ N u n=1 φ n (x, y), 330 |φ |, is proportional to the individual one |φ n |: The modulus of the sum of the random shift factor, which is the amplitude of a group of incoherent 332 waves, is N 1/2 u (e.g. Weisstein 2021), which is proportional to T 1/2 0 . This scaling generally agrees 333 with the spectrum of the simulation (Fig. 9a), though the spectrum shape has some deviation due 334 to the presence of damping and the convergent flow. The proportional coefficient N 1/2 u is not N u , 335 which instead represents the superposition of coherent waves that originates from the stationary 336 growth of an existing pattern without damping.

337
Third, we use |φ | to estimate the characteristic amplitude of φ (the spatially filtered φ ). As 338 the filter is a convolution in physical space, the modulus of the filtered field |φ | is obtained by 339 multiplying exp −l 2 (k 2 x + k 2 y )/4 toφ and taking the modulus, using (21): Here we have used r 2 u l 2 to simplify the expression. The filter roughly truncates the lowest 341 significant wavenumber to 2/l. The corresponding cutoff wavelength is 2π/(2/l) = πl. As the 342 filter significantly damps the Fourier modes whose wavelength is shorter than l, the number of φ 343 peaks in the domain (denoted as N l ) should obey: which is in good agreement with the simulation (Fig. 9b). Suppose the φ field consists of N l 345 randomly distributed Gaussian-shape structures, with a length scale of l and an amplitude scale of 346 Φ l . By mimicking (22), we get: multiple of Φ l , we get : where we have substituted in (17), (18) and (23). The λ is a tuning factor that calibrates our coarse 350 theory to the simulation. Figure 9c shows that (25)

356
Transforming back from φ to ω, we summarize that the stochastic regime model provides the 357 initial condition for the deterministic regime model which is a dynamical system of three variables: 358 the vortex cluster radius a, the spatial cluster number density N v (unit: m −2 , rather than m −2 s −1 359 as is for ρ u ), and the relative circulation Γ of each cluster. Their initial conditions are denoted as 360 a 0 , N v0 and Γ 0 respectively: c. The deterministic regime

362
Though it is always stochastic at the convective scale, the mesoscale is more deterministic due 363 to the filter which makes the convective probability smoother in space. A smoother and larger con-364 vective number density ρ u (probability) makes the random field more closely obey the mesoscale feedback. We assume the mesoscale dynamics is completely deterministic after the vorticity quasiequilibrium is established in the stochastic regime. In appendix C, we quantify the degree of 367 determinacy (fluctuation) by defining and calculating a "convective Knudsen number".

368
The nonlinear terms in the governing equations (1)-(4) make the system chaotic, so we only 369 solve the vortex cluster's characteristic relative circulation Γ and the spatial number density N v .

370
The idea is to adapt the vortex gas model in studying two dimensional turbulence (Carnevale

385
The first point is further explained here. Within a vortex cluster of radius l, the central vortex 386 radius (denoted as r c ) is basically determined by the competition between the expansion due to 387 the filter, against the contraction due to convergence and merger. The initial value of r c should 388 scale as r u , so r c l. In the classic vortex-gas model, the merger between two vortices with 389 identical vorticity yields Γ 2 3 = Γ 2 1 + Γ 2 2 and a 4 3 = a 4 1 + a 4 2 , where the subscript 1 and 2 denote the 390 two vortices before merger and 3 denotes the merged vortex (Carnevale et al. 1991;Trizac 1998).

391
The Γ 2 conservation is obtained from the conservation of kinetic energy: Γ 2 ∼ ω 2 r 4 c (unit: m 4 392 s −2 ). As ω is a tracer, the Γ 2 conservation readily leads to the fourth order conservation of r c .

393
The net loss of circulation represents the filaments produced in the mutual shearing process of a 394 merger event. As we look at the vortex cluster scale, the r c l property and the convergent flow 395 make the filaments well contained within a radius of l, and keep the central vortex small. Thus, 396 the circulation Γ, rather than Γ 2 , should be conserved. In Appendix A, we show that the cluster 397 radius is generally constrained to be no larger than l by the convergent flow, so we will use a = l.

398
To apply the vortex gas idea to vortex clusters, we let ω denote the mesoscale ω characteristic 399 magnitude of vortex clusters at time t, and then model its evolution. Our vortex gas model assumes 400 equal strength of all vortex clusters at any time. In the theoretical model, a vortex cluster is 401 considered to be a top-hat structure with a radius of l, with the Γ being linked to ω via: The is an operator that can also be performed on other variables. In the simulation, however, the 403 magnitude of the clusters varies significantly. Among them, the strongest cluster best represents 404 the final major vortex. Thus, in processing the simulation data, we define ω as a conditional 405 average within 2l radius, centralizing at the maximum ω point (x c ): The 2l radius is chosen to encompass most of the positive ω within the vortex cluster (e.g. Fig.   407 7). The ω defined in (27) corresponds to a more compact vortex cluster structure than that in 408 (28), but the former represents the average strength and the latter represents the strongest vortex 409 cluster. As a compromise, we consider them to be comparable in this preliminary investigation. In 410 the future, a more careful comparison can be made when the heterogeneous strength of the vortex 411 clusters are considered. 412 We need to estimate the magnitude ratio of the convergent to rotational wind to determine 413 whether the convergent flow is important. From (2), the net convergence δ n is: which yields −δ n / ω = (η − 1)E = 0.048 for the reference test. As this value is much smaller 415 than one, the effect of convergent flow in driving the motion of the vortex clusters is ignored.

416
The time scale of merger τ m is estimated with the Boltzmann mean collision time (Trizac 1998): where r is the characteristic distance between two vortices that obeys r ∼ N Equation (27) shows that the evolution of ω is equivalent to that of Γ, due to a = l. Applying 421 the in theoretical sense to (1), we get: where we have used Gauss theorem, and the loop integral encapsulates most of the positive ω 423 region. The ω − is the "environmental relative vorticity", which is also entrained into the vortex 424 cluster. When diagnosed from simulation, it is defined as the average vorticity outside of the 425 vorticity clusters (the area excluding all 2l-radius patches). The τ e is the timescale of growth 426 purely due to entraining environmental planetary vorticity: Substituting in the reference parameter, we get −δ 0 τ e ≈ 1.5. When the vortex is not involved in 428 a merger event, the stream function contours are closed and roughly perpendicular to the vorticity 429 gradient vector, so the rotational wind u ω does not induce net transport of vorticity to the vortex.

430
During the episodic merger events, the contour lines are open, and the role of u ω is parameterized 431 as the merger term ω /τ m .

432
We need to solve ω − to make (31) and (32) a closed dynamical system. The ω − becomes 433 more and more negative due to the radiative cooling and Ekman suction. Figure 10 shows that by 434 t = 3, the ω − / f 0 of the l = 30 km simulation has dropped to −0.2. If the environmental absolute 435 vorticity is exhausted (ω − → − f 0 ), the merger will be the only way to raise vorticity magnitude.

436
This consideration is important at the later stage of the evolution. One way to estimate ω − is 437 using the domain-integrated conservation of absolute vorticity: a faster growth of the cyclonic 438 vortex patch leads to a faster consumption of the environmental vorticity: Here we have used an approximation that the cyclonic vorticity patches only take a relatively small 440 fractional area at the beginning (N v0 πl 2 = 1/π), and it further drops due to merger. Substituting

441
(29) into (32), we get: grows a bit slower than exponentially: The cancellation of 1/τ m term indicates that the merger does not influence ω − . This is because the 445 total circulation of the vortices are conserved during merger: the rise of ω is exactly cancelled 446 by the decrease of the total area of the vortex clusters. An initial condition of ω − is required to 447 solve (36). It is estimated with ω | t=T 0 = ω 0 (25), and the cyclonic patch's initial fractional area Equations (36) and (37) yield an approximate analytical solution of ω − as a function of time: In deriving (38), we have used f 0 + ω 0 π −1 ≈ f 0 , because ω 0 π −1 < ω 0 f 0 . analytical solution can be obtained, which is plotted in Fig. 11. The technique is to find the which represents the total circulation of the vortex cluster 456 region. We have: where τ m0 is the initial value of τ m . Using (26), we see it is inversely proportional to ω 0 : The τ m0 does not directly depend on l. This is because a smaller l leads to a shorter vortex interval 461 which enhances mutual advection and shortens the collision path, but the collision cross-section 462 is also smaller. The N v is predicted to decrease with time due to merger. The fractional decrease is larger for a smaller l test which has a larger ω 0 (Fig. 11b). Physically,

464
it is because the smaller initial vortex interval and the subsequent more vigorous vortex motion 465 outweighs the disadvantage of smaller collision cross-section l.

466
The reference parameter yields τ m0 /τ e = 5.8, so merger is not vigorous at the beginning of the 467 deterministic regime, and we can well assume τ m0 /τ e > 1 for any case of interest. In general, at 468 the small t regime (t/τ e ln(2τ m0 /τ e )), the asymptotic expressions of ω , N v and τ m are: They indicate that the growth is solely due to entrainment, and the flow pattern is similar to a 471 linear instability growth at a preferred wavelength. For large t (t/τ e ln(2τ m0 /τ e )), the merger 472 rate, which is a nonlinear upscale growth factor, outweighs entrainment and asymptotically reaches 473 twice its magnitude: The flow pattern becomes chaotic. In this large t regime, the vortices are far away from each 476 other, and the reason for keeping a steady merger rate comes from the increasing ω due to 477 entrainment. If the environmental negative relative vorticity ω − is considered, entrainment will 478 become less efficient, and the growth due to merger will become less and less effective as well.

479
The general evolution picture reminds us of rotating Rayleigh-Bénard convection. As the lower  plotting of ω / f 0 in Fig. 10. Figure 11 shows that the slope is gentler than the prediction of 496 the asymptotic solution (46), due to the entrainment of negative environmental relative vorticity.

497
A larger η only accelerates the growth rate in the deterministic regime; a larger f 0 is predicted 498 to reduce ω 0 in the stochastic regime by reducing the equilibrium time T 0 (hardly justifiable from 499 Fig. 10), but accelerate the deterministic regime by shortening τ e . One significant drawback of our 500 theory is underestimating the sensitivity of ω / f 0 to l, which originates from underestimating the 501 sensitivity of ω 0 to l (e.g. Fig. 9c).

502
The prediction of ω − / f 0 is good at the early stage, but its magnitude is overestimated at the later 503 stage. The barotropic simulation shows that ω − / f 0 is insensitive to f 0 , but we predict |ω − / f 0 | to 504 increase with f 0 . The reason of the deviation has not been figured out.

505
The maximum wind V m is estimated with the characteristic mesoscale rotational wind ω l, plus 506 the characteristic cloud-scale convergent wind induced by an individual updraft V up : The mesoscale convergent wind is negligible, as is shown in (29). A direct summation is used,  However, our theory is still too coarse to quantitatively predict V m , because the role of eddy merger 513 and axisymmetrization in concentrating vorticity and therefore accelerating the flow is neglected.

514
The eddy acceleration factor proposed by Fu and O'Neill (2021), which links the mean vorticity 515 to the maximum wind, could be combined with (49) in the future.

516
The N v diagnosed from the simulation (Fig. 8a) qualitatively agrees with the theory at the t 3 517 early stage (Fig. 11a), where the size and strength of the vortex clusters do not vary significantly. 518 We do not attempt to fully benchmark N v before a more complete theoretical model with hetero-519 geneous vortex clusters is established. as ω.

534
The domain-averaged rainfall rate in Fig. 12c shows that the system enters a convective quasi-535 equilibrium (CQE) state by day 1. The low-mid level maximum total wind, which is calculated 536 using the 1.18-6.25 km vertically averaged velocity vector, gradually climbs to ∼ 80 m s −1 (Fig.   537 12a). By day 35, the last major merger event has finished. The later evolution is the focus of Fu 538 and O'Neill (2021), which will not be discussed in this paper. In appendix D, we use the quasi-539 steady rainfall rate value before day 35 to estimate the column latent heating, and use radiative-540 convective equilibrium assumption to link it to the low-mid level radiative cooling rate, which 541 yields −δ −1 0 ≈ 11.3 days. It indicates that day 35 corresponds to t ≈ 3.1.

547
This is because, the tropospheric part of the initial sounding is already in equilibrium, and the sys-548 tem only needs to wait until the cold pool trigger chain in the boundary layer is established. The 549 rough overlap of T 0 with the convective adjustment time scale (∼ 1 day) should be coincidental.

550
However, we consider CQE should be established before VQE, because convection is the source 551 of vorticity. Figure 13 shows that the power spectrum of ω amplifies in a roughly similar shape 552 within day 1. The spectrum shape is a mixed signal of the monopoles produced by stretching and shows signals of growth much earlier than day 35 (Fig. 12b). The standard deviation of the 100-561 km filtered ω shows a ∼ 8.7 day growth timescale before day 35 (t 4). Three sub-regimes are 562 identified, as are discussed below.

563
The vorticity patches grow stationarily with little change of flow pattern between day 1 and 564 day 15 (Fig. 14). It shows that the deterministic physical feedback works at a length scale down 565 to as small as 30 km. We hypothesize that this short-range organization is due to the nonlocal

572
There is vigorous vortex merger after day 15. The vorticity amplitude is large enough to drive 573 eddying motion. This transition is shown in std(ω) (Fig. 12b) where the std(ω) with 30 km-filter 574 transits from a growing to a platform stage by day 15, and then transits to a second growing stage 575 by day 25. Meanwhile, the std(ω) with 100 km-filter grows steadily. Fig. 15 shows that day 15 576 -day 25 is an upscale-growing stage with vigorous merger but slow amplitude growth. We have 577 not figured out the reason why the 30 km-filtered ω does not significantly grow in amplitude at the 578 platform stage.

579
Between day 25 and day 35, the 30 km scale clusters merge into at least 100-km scale super-580 clusters and grow in magnitude (Fig. 16). The 100 km-and-above scale explains most of the 581 std(ω) (Fig. 12b). At l 100 km scale, the vorticity evolves synchronously with the surface 582 cold core which represents convective activity. The vorticity also evolves synchronously with the

589
There is an interesting episode between day 30 and day 35. By day 30, the convective patches 590 have organized into a dominant cyclonic vortex (Fig. 16d). However, the anti-cyclonic vortex of 591 the non-convective region elongates the cyclone into a few pieces, which merge again by day 36.

592
We have not observed any similar episode in the barotropic simulation.

618
The filter not only sets a system length scale by expanding the convective region against the 619 shrinking effect of convergence, but also spans a spatio-temporal cube in which the probability of 620 convection genesis is smooth. This renders a relatively deterministic mesoscale, where the vortex 621 clusters can be modelled with a dynamical system that depicts the random collision-coalescence 622 process. The initial condition of the dynamical system is provided by the noisy earlier phase, 623 which is denoted as "the stochastic regime". If the vorticity quasi-equilibrium state is reached 624 before the positive feedback becomes significant, we can estimate the initial condition by using the shape-invariant property of the Fourier spectrum under random superposition.

626
The average vorticity within the strongest cluster ω , which is closely related to the maximum 627 wind, is predicted to be larger for a smaller filter length scale l, larger feedback parameter η, larger  However, the seemingly non-unique candidate of l shows that a single nonlocal convective trig-638 gering length scale is insufficient. In a follow-up paper, we will investigate the physical basis 639 of using filter to represent the nonlocal trigger by establishing a theory of the convective trigger 640 chain.

641
Even if no further complexity is added, the present theoretical model can be improved in the 642 following aspects:

643
• An extension to consider the size and circulation spectrum of the vortex clusters is desired.

644
Especially, a heterogeneous treatment of the vortex clusters in the stochastic regime is the key 645 to resolve the underestimated sensitivity of ω 0 to l.

646
• A collision-coalescence model of vortex-sink system (Novikov and Novikov 1996) rather 647 than a pure vortex system is needed to account for the mesoscale convergent flow, which is Here we explore how the nonlocal convective trigger (represented by a Gaussian filter) cooperates 661 with convergence to set the system length scale l * .

662
The vorticity equation (1) is a stochastic nonlinear integro-differential equation. Imposing a 663 filter on (1), we get the mesoscale vorticity equation: To facilitate the math, we have added a filter on the spin down term to make it work at the same 666 scale as the updraft. This approximation weakens the damping effect on scale contraction, and 667 therefore causes an underestimation of the system scale. We have also introduced a growth time 668 scale τ −1 ≡ (ω + f 0 )(η − 1)E to denote the vorticity magnitude growth due to stretching. Sup-669 posing τ is a constant, (A2) is simplified with a Taylor expansion on the Gaussian filter term: If the advection term is further neglected, (A3) reduces to a form identical to the column water

675
When ω f 0 , there is τ −1 −δ , with δ obeying (12). The advection can be neglected, making 676 the system linear. The basic length scale l * should be The first scaling has been used by Windmiller and Craig (2019) in their coarsening model where 678 the initial condition is a highly fine-grained noise. It agrees with their non-rotating cloud-679 permitting simulation at the early stage. The second scaling, which denotes scale saturation, is the 680 neutral wavelength of the normal mode solution of (A3). Our theory only predicts the mesoscale, 681 so the initial length scale of ω for the deterministic regime is l. Thus, the system length scale will stay at l according to the second scaling of (A4). Though this conclusion is simple, there is 683 a more complicated process behind. The merger as a nonlinear factor can increase the size of the 684 cluster's central vortex, which is smaller than l at the beginning of the deterministic regime. The 685 convergent flow in the later evolution will constrain the size of a cluster, as well as its embedded 686 central vortex, to be no larger than l. This is discussed below. seek for a Lagrangian treatment. The ω/τ and D∇ 2 ω terms show that the convective area expands 695 linearly with time at a rate of l 2 /τ. The advection term makes the area shrink at a timescale of 696 −δ −1 at the same time. Let the l 2 * tendency caused by the filter be (dl 2 * /dt) l = l 2 /τ ∼ −l 2 δ and 697 that by convergence as (dl 2 * /dt) δ = l 2 * δ . The sum of the tendency leads to an ODE of l 2 * : It yields a stable equilibrium radius l with a relaxation time scale of −δ −1 . Thus, l serves as the 699 system length scale at all stages. The signal-to-noise ratio of the convective scheme 709 The signal-to-noise ratio (SNR) of the seeding scheme (9) is defined as the ratio of updraft-induced 710 divergence anomaly to its basic state value. The SNR at the vorticity equilibrium time T 0 is derived 711 with (9), (11) and (25): It involves all the five nondimensional parameters. The separation between the stochastic and 713 deterministic regime requires SNR| t=T 0 1. The reference parameter yields SNR| t=T 0 ≈ 0.24, so 714 the separation is roughly valid.

716
A measure of the mesoscale determinacy 717 A high mesoscale determinacy means there is a sufficiently large number of updrafts falling in 718 a spatio-temporal cube l 2 T , where T is the shortest time scale of ρ u (or ω) and l is its smallest 719 spatial scale. We denote the updraft number in the cube, after subtracting the part that balances 720 the radiative cooling (ρ u0 ), as the inverse of a convective Knudsen number Kn: The determinacy requires Kn 1. The original definition of Knudsen number is the ratio of the 722 mean free path of gas molecules to the length scale of interest, which measures the degree to 723 which the continuum assumption is valid (Batchelor 2000). Let T = τ e be a rough estimate of the 724 characteristic timescale. Substituting (9), (10) and (33) into (C1), we get: Note that Kn does not depend on E. The determinacy is predicted to be higher for a larger ω/ f 0 , 726 which makes the feedback signal stronger, and a lower r u and −∆h/H which raise the convective 727 number density. It might be somewhat surprising that a smaller η reduces Kn. This is because, as 728 η approaches unity from above, the growth time scale τ e → ∞, but the seeding rate is still finite 729 and balances with Ekman spin down.

730
Next, we estimate the value of Kn of our reference test simulation. Because ω increases in the 731 deterministic regime, the maximum Kn occurs at the beginning (t = T 0 ) is calculated by substitut-732 ing (25) into (C2): where 741 ρ l = 10 3 kg m −3 is liquid water density, ρ a = 0.9 kg m −3 is the low-mid level air density, L v = 742 2.5 × 10 6 J kg −1 is evaporation latent heat, c p = 1004 J K −1 kg −1 . Considering that the net latent 743 heating is balanced by radiative cooling, and using ∆θ = 20 K, we have: which yields −δ −1 0 ≈ 11.3 days (1 × 10 −6 s −1 ) and Q c ≈ 1.77 K day −1 . the blue, red and yellow lines denote t = t 1 , t 2 , and t 3 respectively. For (b), the blue, red 917 and yellow lines denote t = t 2 , t 3 , and t 4 respectively. For (c), the blue, red and yellow 918 lines denote t = t 3 , t 4 , and t 5 respectively. The sampling slot is shifted, because the system shadow denotes the ±1 standard deviation range. For (a), the blue, red and yellow lines denote t = t 1 , t 2 , and t 3 respectively. For (b), the blue, red and yellow lines denote t = t 2 , t 3 , and t 4 respectively. For (c), the blue, red and yellow lines denote t = t 3 , t 4 , and t 5 respectively. The sampling slot is shifted, because the system growth rate is slower for a larger l.  9. (a) The one-dimensional power spectrum |ω| of the index-1-run of the purely random test. It is obtained by "azimuthally" averaging |ω| over each wavenumber circle (constant |k| = k 2 x + k 2 y ). The solid blue line denotes t = 0.12 (t ≈ −δ 0 T 0 /4), the solid red line denotes t = 0.24 (t ≈ −δ 0 T 0 /2), and the solid red line denotes t = 0.47 (t ≈ −δ 0 T 0 ). The dashed purple line denotes the equilibrium spectrum predicted by (22).
The diagnosed spectrum at the high wavenumber end is gentler than the theoretical estimate, probably due the sharpening effect of the convergent transport. (b) The dependence of domain total vortex number N v0 L 2 on filter length l, at t = 0.47 (t ≈ −δ 0 T 0 ). All the simulations are run with l = 60 km, and the value at different l is due to the change of l in filtering the same set of ω data. The solid blue, red and yellow lines denote the data using −∆h/H = 0.4, 0.8 and 1.6 respectively. Note that τ u is changed proportionally to keep the updraft-induced convergence −∆h/(τ u H) unchanged. The yellow dashed line denotes the theoretical prediction N l = L 2 /πl 2 .
(c) is the same as (b), but for benchmarking the theoretical ω 0 / f 0 at −∆h/H = 0.4, 0.8 and 1.6 (dashed blue, red and yellow lines). The simulation counterpart (solid lines) uses ω / f 0 , where ω is defined as the area average within 2l radius of the maximum ω point (the strongest vortex cluster). The motivation of this choice is explained in section 3c.