The Mid-Lithospheric Discontinuity caused by channel flow in the cratonic lithosphere

Stable cratons with a thick (> 200 km) and cold lithosphere form rheologically strong plates that move atop a ductile asthenospheric mantle. Various types of seismic observations show the presence of a potentially rheologically weak zone at depths of ca. 80 – 150 km termed the Mid-Lithosphere Discontinuity (MLD). While various mechanisms may explain the MLD, the dynamic processes leading to the seismic observations are unclear. We propose that the MLD can be caused by channel flow in the lower lithosphere, triggered by negative Rayleigh-Taylor instabilities at cratonic margins in the Archean, when the mantle was hotter than at present. Presence of a chemically distinct, low-density cratonic lithospheric root is required to initiate the process. Numerical modeling shows that the top of the channel flow creates a shear zone at a depth comparable to the globally observed seismic MLD. Grain size reduction in the shear zone and accumulation of percolated melts or fluids along the channel top may reduce seismic wave speeds as observed in the MLD, while the channel flow itself may explain radial anisotropy of seismic wave speeds. Secular cooling of the Earth deepens the top of the channel flow on a 1 Gyr scale, and early-stage large-scale (1000’s km long) channel flow deformation switches to a different deformation style with a smaller (100’s km) wavelength. These different flow patterns may explain the different seismic response of the MLD and the lithosphere base.


Introduction
The plate tectonic model requires that the outermost layer of the solid earth, the lithosphere, moves as rigid plates on a weak viscous zone, often referred to as the asthenosphere. The depth to the lithosphere-asthenosphere boundary (LAB) in stable cratons is debated and is typically interpreted at depths between ca. 200 and 400 km, as estimated from heat-flow analysis (Artemieva and Mooney, 2001), mantle xenoliths (O'Reilly et al., 2001), seismological data (Gung et al., 2003), and long period magnetotelluric data (Jones et al., 2003). Conversely, high-resolution seismological data reveal a complex structure of the mechanically strong layer (Fig. S1).
These seismic discontinuities mark the top of a seismic low-velocity zone with a small reduction in P-wave velocity (Vp) and a larger reduction, of up-to 6%, in shear wave velocity (Vs) over a depth range of 20-50 km in the Baltic Shield (Abramovitz et al., 2002). Similarly, the presence of one or multiple seismic low-velocity zones with the top typically at depths of ca. 80-100 km and a thickness of 20-40 km were reported for the Siberian Craton based on the Soviet PNE long-range profiles (Egorkin et al., 1987).
Beneath the North American craton, joint inversion of the high frequency P wave coda with long period surface wave dispersion data identified not only a velocity discontinuity with a 2.1% Vs decrease at depths of ca. 100-120 km beneath the cratonic stations, but also a positive intra-lithospheric discontinuity at around 150-170 km depth with a 3.6% Vs increase (Calo et al., 2016).
The intra-lithosphere seismic discontinuity with a negative velocity converter at depths ca. 70-120 km globally under stable continents has also been identified by high-frequency seismic receiver function (RF) studies (Rychert and Shearer, 2009).
Similar studies, largely based on S-RF (S-wave receiver functions), reported a negative converter at ca. 60-85 km depth beneath the Precambrian Western Australia (Sun et al., 2018) and typically at ca. 80-120 km depth beneath the cratonic North America (Chen et al., 2018). Beneath the Kalahari craton, some S-RF studies identified a shallow (80-90 km depth) negative converter found in other S-RF studies (Rychert and Shearer, 2009), while other S-RF studies found a prominent interface with a 4.5% reduction in seismic velocity at ca. 150-160 km depth (Savage and Silver, 2008). Notably, the deep converter corresponds to the top of melt-metasomatised mantle beneath the Kaapvaal craton (Griffin et al., 2003) (Fig. S1b).
Vertical stratification of the cratonic lithosphere with the transition between compositionally distinct layers at ca. 120-150 km depth is revealed by mantle xenolith studies in the Archean-Paleoproterozoic provinces of the Slave craton and the Baltic Shield (Kopylova and Caro, 2004). Vertical stratification of the cratonic lithospheric mantle is also imaged in seismic azimuthal anisotropy studies of the North American and the Australian cratons, which show a boundary between mantle layers with different fast directions at ca. 100-150 km depth (Yuan and Romanowicz, 2010;Sun et al., 2018). Likewise, strong electrical anisotropy has been reported for the Superior Province of the North American craton with the top of the anisotropic layer at ca. 70-100 km depth (Mareschal et al., 1995), while in the Slave craton a pronounced decrease in electrical conductivity by 0.5 log unit at 135-150 km depth (Jones et al., 2003) closely correlates with the compositional boundary established from mantle xenoliths (Kopylova and Russell, 2000).
This brief overview demonstrates the presence of an intra-lithosphere heterogeneity at a depth of around 100 km in stable continental lithosphere globally, although with significant regional depth variations (Fig. S1). It is, however, still debated if the top of an intra-lithospheric seismic refraction low-velocity zone and negative seismic convertors imaged by RFs (later referred to as the Mid-Lithospheric Discontinuity, MLD) image the same intra-lithospheric transition zone, and if anisotropic layering and metasomatic imprint on composition of the cratonic lithospheric mantle correspond to the same feature. Various models have been brought forward to explain the MLD, such as: (i) rheological layering at temperature close to the solidus in the presence of melt/fluids (Thybo and Perchuć, 1997;Thybo, 2006), (ii) compositional layering due to either partial melt extraction (Yuan and Romanowicz, 2010), or basaltic melt additions and metasomatism associated with magmatic activity (Thybo, 2006;Rader et al., 2015), (iii) change in deformation style, either depth-dependent (Karato et al., 2015), or time-dependent such as transition from frozen-in (e.g., associated with mineral fabric inherited from cratonization) to young seismic anisotropy (Yuan and Romanowicz, 2010).
These mechanisms are under discussion. Firstly, it is debated if solidus temperature may be reached at mid-lithosphere depths below stable continents even for a fertile and volatile-rich composition (Karato et al., 2015). Secondly, compositional layering may not necessarily relate to the seismic MLD. Minerals rich in carbon or amphibole may produce a >5% seismic wave velocity drop (Selway et al., 2015), but mantle xenoliths suggest that such minerals may not occur ubiquitously in cratonic lithosphere (Rader et al., 2015;Selway et al., 2015). Thirdly, an analysis of feasible mechanisms for a seismic velocity change in seismic S-wave receiver function suggests that radial anisotropy in seismic wave velocity could be a strong candidate to explain the seismic MLD (Selway et al., 2015). However, this model requires a global operation of a geological process which produces the same anisotropic layering pattern at the same depth range in every craton, and such process is yet unknown. Azimuthal anisotropy may either increase or decrease seismic wave velocity depending on the back azimuth of the seismic waves, but this is not revealed by long-range seismic controlled-source profiles in the cratons of North America, Siberia and Baltica (Thybo, 2006). An elastically accommodated grain-boundary sliding and mantle hydration, e.g. by melt metasomatism, seem to predict a velocity drop similar to observed at the MLD (Karato et al., 2015). However, the model indicates no relationship between a reduction in seismic wave velocity at the MLD and anisotropy variations within the lithospheric mantle. Therefore, mechanisms for the presence of the MLD should explain the following features: (1) a seismic velocity interface observed both in seismic controlled-source and RF models, (2) a transition between layers with different directions of seismic, and possibly electrical, anisotropy, (3) depth variations of the intra-lithosphere discontinuity in various cratons.
Here, we propose a new mechanism that may resolve the controversy. It allows for the formation of a rheologically weak mid-lithosphere zone in the cratons globally, with the associated changes in seismic velocity and formation of anisotropic layering.
We propose that a global mid-lithosphere weakness zone may have formed in the Precambrian at the early stages of cratonization by channel flow in the mantle. We argue that the intensity of the channel flow mechanism gradually ceased with time due to secular cooling of the Earth's interior. We test the model predictions by a 2D thermomechanical numerical modeling. Our numerical results, coupled with observations of seismic-wave anisotropy and petrological studies in various cratons, may reshape our understanding of the evolution, longevity and stability of cratonic lithosphere.

Governing equations
A new 2D finite difference code LYSI is used in our numerical experiments . The code applies a marker-and-cell technique to solve thermo-mechanically coupled conservation equations of mass, momentum: where is velocity, is deviatoric stress, is dynamic pressure, is density and is σ ' ρ gravitational acceleration. For the equation of conservation of mass (1), we assume an incompressibility condition. To reflect changes in temperature, we solve heat conservation equation where is heat capacity, is thermal conductivity and is heat production (only radioactive heat source is considered).

Rheology.
The visco-plastic non-Newtonian rheology automatically adjusts to physical conditions.
For ductile deformation, the flow law can be formulated as: Where is strain rate, is material constant, is stress exponent, is activation ε energy, is activation volume, and is gas constant. Wet quartzite, Anorthosite and Olivine are proxies for the rheology of the upper crust, lower crust and mantle in this study, respectively. With equation (4), we can reformulate it to obtain the effective viscosity η = σ ' 2ε (5) A maximum model cut-off effective viscosity is set up to 10 25 Pa·s and 10 18 Pa·s for minimum viscosity. When stress reaches a threshold value changing flow law from viscoelastic to plastic failure, we use Drucker-Prager yield criterion, where is the square root of the maximum second deviatoric stress invariant, is σ pressure, is friction coefficient and is cohesion. Details about parameters tan ϕ used in numerical experiments are listed in Table S1.

Time-dependent radioactive heat generation rate
To investigate the effect of radioactive heat generation we have implanted the time dependent heat generation model based on a three-dimensional global reference model (Huang et al., 2013) (Table S2). Heat generated by radioactive elements in earth interior can be expressed by (Rybach, 1976): , where is heat production per unit = 0. 01 ρ(9. 52 + 2. 56 ℎ + 3. 48 ) between TLAB and CLAB (with zTLAB < zCLAB) (Fig. 1ab). The molten lower part of the Archean CBL (i.e., the layer between the TLAB and the CLAB) with lower viscosity than in the upper, conductive, layer of the lithospheric mantle can form a secondary convection system within the lithosphere root, which is separated from the convection in the sublithospheric mantle (Sleep, 2003). This layered structure of the convecting upper mantle may produce complex geotherms. We simplify the geotherm shape by (i) assuming a continuous adiabatic gradient of 0.3°C km -1 in all convecting mantle layers and (ii) initially placing the TLAB at 90 km depth (Herzberg et al., 2010). Over time, the secular cooling of the mantle gradually deepens the TLAB, while preserving the CLAB depth largely unchanged.

MANTLE TEMPERATURES
The initial mantle potential temperature is assumed to be 1500°C (Fig. 1b), which is ca. 200°C higher than at present, in accordance with geochemical models on mantle melting temperatures in the Archean (Nisbet et al., 1993). To reflect the long-term cooling of the Earth, we introduce a time-dependent heat generation decay

General pattern
During model evolution, the difference in radiogenic heat production across the CLAB (with a higher heat production in lithospheric than asthenospheric mantle (Huang et al., 2013) Fig. 2b), the channel flow within the lower lithosphere forms two shear zones with high strain rate values. The top shear zone (S1 at a depth of ca. 100-120 km) is located beneath the initial TLAB depth at 90 km, whereas the lower shear zone S2 with the maximum strain rate develops roughly at the depth of the cratonic CLAB (at ca. 200-230 km depth, deepening away from the craton edge). The maximum flow velocity is at the mid-point depth (ca. 160-180 km) of the two shear zones along the profile (Fig. 2b) and forms a parabolic velocity profile typical of a channel flow (Turcotte and Schubert, 2002). This channel flow extends horizontally from the craton interior to the downwelling center at the craton edge and exists in the whole craton over > 2000 km distance (Fig. 2b) deepening rate is also observed in the velocity and viscosity profiles (Fig. 3b, c).
In the early evolution (<79 Myr), the uppermost layer of the lithosphere ( km) with a viscosity of 10 25 Pa s (set as maximum viscosity in the model) ℎ 0 − 60 acts as a rigid layer (Fig. 3c). In the lower part of the model ( km), mantle ℎ > 300 viscosity is~3 times higher than in the layer with the channel flow (Figs. 3b and c). Therefore, the channel flow is surrounded by two rigid layers.
At early Stage 1 (< ca. 200 Myr), geotherms have an intriguing curvature at~200 km depth (red lines in Fig. 4), when the temperature in the chemical boundary layer is higher than in the sublithospheric mantle. This kind of a hot lithospheric geotherm may exist during a period with a very high rate of radioactive heat generation in the Precambrian lithosphere (Michaut and Jaupart, 2007). High temperature of the continental lithosphere root reduces its viscosity and facilitates formation of the channel flow once a sufficient pressure gradient develops in the lithosphere root. This curvature of the geotherm disappears by ca. 475 Myr (Fig. 4), but the channel flow in the lower lithosphere remains (Fig. 3b).
With secular cooling (Fig. 4), the lithosphere overall viscosity increases (Fig. 3c), which reduces channel flow velocity and strain rate. Drippings of a cool asthenosphere material in the right half of the model push hot asthenosphere subhorizontally towards the model edge, causing the asthenosphere material to move passively in the direction opposite to the overlying channel flow (Figs. 2 and 3b). This forms a strong shearing at ca. 180-230 km depth, seen as the channel bottom in our model (Fig. 3b).

Stage 2: small-scale convection
At Stage 2 (model time >1272 Myr), multiple small-scale convection cells develop in the sublithospheric upper mantle, and the temperature field in the CBL part of the lithospheric mantle approaches the equilibrium state which lasts for more than 1 Gyr (from 1272 to 2366 Myr; Fig. 4). The transition from Stage 1 to Stage 2 is marked by a leftward migration of convection cells from the right end of the model (Fig. 2b, c). From 1272 Myr to 3066 Myr (Fig. 3a, b), the middle portion of the lithosphere (125 -175 km depth) is still subject to a strong shearing, and the velocity and strain rate in the sublithospheric mantle sharply increases by 1-2 orders ( Fig. 3a and 3b, note different scales of the upper and bottom panels) as expected for vigorous mantle convection.
Note that the flow direction in the bottom part of the lithospheric mantle (210 km -280 km depth) changes between 1866 Myr and 3066 Myr (Fig. 3b) from the leftward to the rightward, while the channel flow formed at Stage 1 preserves its rightward motion in the lithosphere mantle (ca. 80 km -220 km depth).

Conditions for formation of a channel flow
Continent-scale channel flow is formed in the lower part of Archean chemical lithosphere (CBL) by a horizontal pressure gradient at craton margins. The pressure gradient is caused by downwelling mantle flow formed around the craton-to-noncraton transition due to differences in composition and heat production within the CBL and below it, which were particularly strong in the Archean (Korenaga, 2003).
Maintaining the channel requires the formation of a chemically differentiated continental lithospheric root with density and viscosity lower than in sublithospheric mantle. We ran several test models to examine the roles of compositional density contrast across the CLAB and of asthenosphere viscosity on the formation and evolution of the lithosphere channel flow (Fig. 5).

ROLE OF DENSITY CONTRAST ACROSS THE CLAB
A continuous channel flow does not form in the absence of a density contrast between the lithospheric and sublithospheric mantle (Model Den0). Instead, multiple convection cells develop (Fig. 5a) and destroy the cratonic lithosphere root.
In contrast, when the density contrast is large (60 kg m -3 instead of 20 kg m -3 ), the compositionally light cratonic lithospheric mantle cannot drip into the asthenospheric mantle (Fig. 5b, Model Den60). Reduced dripping results in a higher viscosity of the sublithospheric mantle due to a lower strain rate. This reduces both the intensity of mantle convection and the efficiency of heat diffusion in the sublithospheric mantle.
The resulting thermal regime is characterized by cooling of the lithosphere and heating of the deeper upper mantle (Fig. 5b).

ROLE OF VISCOSITY CONTRAST ACROSS THE CLAB
Viscosity in the sublithospheric mantle is an important factor for control of mantle dynamics, but it has a complicated dependence on temperature, pressure, stress, volatile content and mineral grain size (Karato, 1995). Chemically depleted dry lithospheric mantle is generally assumed to have a higher viscosity than wet fertile mantle (Peslier et al., 2010), and we test two models with viscosity of the sublithospheric mantle of 1/2 (Model Asthen2) and 1/5 (Model Asthen5) of our model lithospheric mantle viscosity (Fig. 5c, d).
Higher viscosity of the sublithospheric mantle (Model Asten2) reduces the vigor of convection in the sub-lithospheric mantle. In contrast, rigorous convection that efficiently releases heat from the lithosphere to the surface produces an extremely thick thermal lithosphere (> 400 km) in the Asthen5 model. Therefore, the final geotherm in the Asthen2 model fits the present cratonic lithosphere geotherms better than the Asthen5 model. In the Asthen5 model, a continuous channel is observed at the early stage of evolution, but mantle temperature decreases without approaching an equilibrium state (Fig. 5d) as observed at Stage 2 of the reference model (Fig. 4).

UNIQUENESS OF ARCHEAN CHANNEL FLOW
Channel flow cannot form in modern cratonic settings: cold and rigid cratons with high-viscosity lithosphere roots tend to resist the development of a channel flow, even when there is a significant horizontal pressure gradient at the cratonic lithosphere root, e.g., from edge convection. Due to high radioactive heat production in the lithospheric mantle during the Archean, a curved geotherm may have developed in the relatively weak cratonic root (Michaut and Jaupart, 2007).

CHANNEL FLOW AND SEISMIC LOW-VELOCITY ZONES
Our numerical model provides a possible dynamic process for the MLD formation by channel flow within the cratonic lithosphere in the Archean. Intensive shearing at the top of the channel flow (layer S1, depth 80-120 km, Fig. 2b, top) produces a low viscosity zone (Fig. 3) at depths consistent with the depth of seismic velocity discontinuities (Thybo and Perchuć, 1997;Rader et al., 2015) (Fig. S1). Although low viscosity does not necessarily require low seismic velocity (Karato, 1995), the seismic low velocity zone is generally regarded as a weaker zone in the upper mantle (Thybo, 2006). Mineral physics experiment on melt-free polycrystalline aggregates of Fo90 olivine indicates that shear modules decrease with reduced grain size (Faul and Jackson, 2005). Intensive shearing in the mantle may reduce grain size (Bystricky et al., 2000), as evidenced by mantle peridotite xenoliths from the Kaapvaal craton with a pronounced drop in olivine grain size from ca. 12 mm at 120 -140 km depth to 4 -8 mm below 150 km depth in the lithospheric mantle (Mercier, 1980). Therefore, shearing-induced grain size reduction may partly result in the formation of a seismic low-velocity zone in the lithosphere.

CHANNEL FLOW AND MANTLE METASOMATISM
A global analysis of elastic moduli and densities of mantle xenoliths suggests that hydrous minerals in a chemically distinct mantle layer may also reduce seismic velocity (Rader et al., 2015). Volatile-rich melts percolating through the lower lithosphere may accumulate at ca. 80 km depth, leading to accumulations of minerals with low seismic velocities (pyroxenes, phlogopite, amphibole, carbonates) (Aulbach et al., 2017). The presence of these minerals at mid-lithosphere depths was proposed as an explanation of seismic MLD observed in S-RFs (Selway et al., 2015). Besides, fluid accumulation affects peridotite solidus, such that the mid-lithosphere seismic discontinuity may be explained by a seismic wave speed reduction where temperatures are close to the wet solidus (Thybo, 2006).
The upper shear zone with horizontal deformation caused by the channel flow may cause anisotropy of permeability and therefore change the flow direction of rising melt to (sub)horizontal (Morgan, 1987) (Fig. 6). In case of melt segregation, the layer above the melt zone with a temperature lower than the solidus functions as an impermeable cap and the contraction in the freezing zone may form a porous layer beneath which lateral migration of melt may take place (Sparks and Parmentier, 1991).
For simplicity, our model does not include these complex processes, but channel flow in the lower lithosphere can accumulate molten material at its top (Fig. 6).
Former presence of melts at a depth of 80-100 km in the lithosphere is supported by seismic normal-incidence reflection profiles from several locations. These profiles image strong horizontal reflections generally around 24 s two-way time (Steer et al., 1998), e.g. in northern Europe in the Skagerrak (Lie et al., 1990) and the North sea (Mona Lisa Working Group, 1997) areas, and in the Urals region (Knapp et al., 1996).
These observations are consistent with the presence of magmatic sills in the depth range of 80-100 km, where they may have been emplaced following mantle peridotite melting in the presence of carbon dioxide and water (Thybo, 2006).

LAB
Strong radial anisotropy, which is stronger in the shallow mid-lithosphere mantle than at around the LAB, may explain the MLD (Sun et al., 2018). The channel flow should cause strong radial anisotropy at the top (S1) and the base (S2) of the channel (especially for non-Newtonian material) (Fig. 2b, top), but not in the central part of the channel, which has a very low strain rate typical of the Poiseuille flow. Radial anisotropy may produce a seismic MLD as observed in S-RF (Selway et al., 2015), although it may not lead to seismic velocity reduction.
Understanding the evolution of thermal lithosphere in relation to the TLAB may advance our understanding of seismic anisotropy in the upper mantle. Our numerical models predict two stages with two distinct mantle material motion patterns (Fig. 2, 6).
A continuous channel flow forms beneath the TLAB at Stage 1, while multiple small convection cells underlie the TLAB at Stage 2. Therefore, at stage 1, a constant horizontal shear atop the TLAB due to the channel flow should produce a stronger radial anisotropy than at Stage 2, which is characterized by alternating mantle up-and down-wellings.
Therefore, a shallow MLD may be associated with frozen-in anisotropy from a continuous channel flow in an ancient asthenosphere. Its preservation requires cooling of the shallow mantle to a critical temperature of 900°C (Fig. 3d): at higher temperature, the mobility of olivine crystals is too high to preserve the anisotropy since the Precambrian (Goetze and Kohlstedt, 1973).
After billions of years of evolution, the CLAB, formed by mantle differentiation on the early Earth, is no longer flat as in the initial model setup (Fig. 2a, 6). Instead, a wave-like topography with short wavelength "V"-shaped chemical layer drippings is created by multiple convection cells. These drippings move material from the chemical lithosphere into the convecting sublithospheric mantle, leading to an intensive mixing of the lower portion of the chemical lithosphere with the underlying mantle. The downward smearing of the chemical lithosphere, as well as metasomatism caused by the upward percolation of volatiles and basaltic melts within the cratonic lithosphere, may create a transitional chemical boundary layer in the lower portion of the lithospheric root, as observed in mantle xenoliths from various cratons (Griffin et al., 2003). This may explain the difficulties with detecting the transitional CLAB by seismic tomography and receiver function studies (Rychert and Shearer, 2009).

Implications for mantle dynamics and stability of cratons
The channel flow model may, intuitively, seem to be in conflict with the conventional model of stability of cratons since the Archean (O'Reilly et al., 2001), which is not the case. At early stages of the channel flow, the TLAB is shallower than the CLAB, and after about 500 Myr the chemical lithosphere cools to a temperature below the hydrous solidus (a case with 400 ppm water; Fig. 4). At this time, a very low flow speed in the channel (as low as 0.005 cm yr -1 ) may, at maximum, lead to a 50 km material displacement over 1 Gyr. Such scale of displacement in a continent root is much slower than plate velocities. Although at Stage 2 the rate of horizontal material displacement increases by one order of magnitude (Fig. 3b, bottom), the largest velocity of the material flow occurs below the CLAB (depth >200-250 km) rather than within the cratonic lithosphere. Additionally, migration of convection cells causes material of the cratonic root to move back and forth within the underlying convection cells, unlike Stage 1 when material flows only in one direction (Fig. 3b, top). Therefore, the final cumulative offset between the crust and the cratonic lithosphere root may not be reflected in mantle xenoliths.
Physically, the channel flow may form only when cratonic lithosphere is significantly warmer than at present. In our numerical model, we assume that this thermal condition is satisfied in the Archean, with mantle temperatures ca. 200°C higher than at present. However, large-scale mantle thermal anomalies, like plumes, may impart heat at the base of cratons such that it may increase temperature at the lithosphere base by 100-150°C, possibly out to distances of 500 km from the impact center (Sleep, 2003). In such cases, channel flow may also be triggered in the lower cratonic lithosphere if the basal temperature increases by thermal events from below.
This may contribute to explaining some enigmatic features, such as high (2-7 %) shear wave velocity anomalies which appear to continue from the ancient continental cratonic roots of the southern African cratons to beneath the oceanic basins 500-1300 km away from the cratonic margin (Grand, 2002). If mantle thermal anomaly associated with the separation of the South American and African continents increased the temperature of the cratonic root by 100-150°C, it would have reduced the mantle viscosity by approximately two orders of magnitude. Both rifting and mantle edge convection tend to create a low-pressure zone at cratonic margins with respect to cratonic interior, thus attracting a weakened cratonic root to flow towards an ocean basin, similar to the pattern observed at stage 1 (Fig. 2). If the continental plate moves in the direction opposite to the channel flow, it may enhance shearing at depths of the top of the channel flow (S1 in Fig. 2b), and this process may explain seismic observations around the southern Africa (Wang et al., 2017).

Conclusions
With a two-dimensional numerical geodynamical model, we examine how the secular cooling down of a chemical lithosphere affects the thermomechanical evolution of a craton. With a shallower TLAB than the CLAB and a step-like transition in lithospheric mantle thickness as a priori, the numerical experiments show different flow patterns beneath the TLAB (Fig. 6).
(1) A continent-scale channel flow (Stage1) is formed in lower chemical lithosphere due to horizontal pressure gradient, which is triggered by negative Rayleigh-Taylor instabilities happened around the chemical lithosphere transitional zone.
Formation of a channel flow in Stage1 may happen in the Archean Earth which is supposed to be much hotter than the present state. The intensively sheared roof of a channel may reduce olivine grain size, accumulate contemporary volatile-rich melt rising from below and create frozen-in radial anisotropy of seismic waves. All these factors contribute to the development of a discontinuity in seismic-wave velocity.
(2) With temperature cooling down in the chemical lithosphere, but sufficient internal heating in the sublithospheric mantle, the flow structure beneath the TLAB changes to be smaller multiple convection cells (Stage2)

Acknowledgements
The code we used here was mainly developed by Dr. Zurab Chemia. HY was supported by the Melbourne Research Scholarship. IMA and HT are partially supported by the MOST special funds of China for GPMR State Key Laboratory (GPMR2019010). (a) The model top 20 km is made of a low-viscosity layer, and the surface y = 0 refers to the topography, not to the model top. The underlying crust includes three layers (upper, middle and lower crust) with thicknesses of 10 km, 15 km and 15 km, respectively. The CLAB (chemical LAB) is at a depth of 235 km (x = 0-2400 km) and 135 km (x = 2600-4000 km), with a linear transition (x=2400-2600 km) between the two domains.
(b) Initial 1D temperature field set to be hotter in Archean cratonic than present lithosphere (Turcotte and Schubert 2002): where is surface temperature, = 120 mW m -2 -initial surface heat flow (double of 0 0 the present continental average due to higher Archean heat generation), mW = 35 m -2 -heat flow from the mantle (set up at the upper limit of observations in present cratonic mantle, (Artemieva and Mooney, 2001)); km -characteristic depth of ℎ = 10 radioactive heat generation, and mW m -2 -thermal conductivity in the crust. = 2. 7 Temperature linearly increases from the Moho depth (40 km) to the thermal LAB (TLAB) with 1500 o C at 90 km depth at time zero (Archean), and then follows the adiabatic mantle gradient of 0.3 °C km -1 .
(c, d) Change of radioactive heat generation decay with time (zero corresponds to the present time) in three crustal layers (c) and two mantle layers (d) (based on Huang et al., 2013). The model start time corresponds to the Precambrian Earth, and the values at 1700 Ma are selected as the starting points for the forward modelling. Red vertical bars in (d) shows the possible ranges of radioactive heat generation in the lithospheric mantle, for which we take the upper limit of the heat generation. Gray shading -chemical lithosphere; pale yellow -asthenospheric mantle. Black arrows show velocity of material motion. Two color lines at depths of 125 km and 175 km show pressure drop with respect to pressure at the leftmost point at the corresponding depth. Density inversion at the craton edge around x = 2500 km produces a low pressure zone to the right which drives a rheologically weak material due to high temperature in the chemical lithosphere root towards region with a thinner lithosphere, thus forming a continuous channel flow of few thousands kilometers long. At 1366 Myr, the flow begins to split into multiple smaller convection cells (Stage 1). The downwelling flow of the chemical lithospheric material mixes it with a deeper mantle (model time 2366 Myr, Stage 2).
(b) Strain rate (top) and horizontal displacement traces (bottom) after 175 Myr along km (area within the box in (a, top)) sampled with 50 km = 800 − 3000 distance step. Changes in the parameters are shown by both colors and offsets from the vertical. Each offset of the trace is scaled to the maximum amplitude with color showing the absolute value. Deformation of the lower lithospheric mantle is localized in two shear zones (S1 and S2, top image) at ca. 100-230 km depth. The parabolic velocity profile is symmetric about the centerline and is rightwards channeled due to the pressure gradient (a).   (a) The model Den0 has no density contrast between the chemical lithospheric mantle (CBL) and the sublithospheric mantle. Lower buoyancy of the CBL due to lower temperature than in the sublithospheric mantle involves the continental material into convection and produces multiple small-scale convection cells.
(b) The model Den60 has a high-density contrast of 60 kg m -3 apt to protect the cratonic lithosphere root from being disrupted by mantle convection. The entrainment of the chemical lithosphere into the sublithospheric mantle convection is reduced due to high lithospheric mantle buoyancy at stage 2. It reduces the vigor of mantle convection and reduces the efficiency of radiogenic heat release, thus increasing temperature in a deeper part of the convective upper mantle.
(c-d) The models Asthen2 (c) and Asthen5 (d) have reduced viscosity in the sublithospheric mantle equal to 1/2 and 1/5 of the reference model, respectively. As a result, mantle convection becomes more rigorous, and the efficiency of the release of radioactive heat increases. A fast cooling of a deeper upper mantle leads to an extremely thick lithospheric mantle in the Asthen5 model. Figure 6. A sketch illustrating mantle evolution at different stages. It shows how footprints of the once existing channel flow structure remain preserved in the present mid-lithospheric mantle.