Efficient 3‐D Large‐Scale Forward Modeling and Inversion of Gravitational Fields in Spherical Coordinates With Application to Lunar Mascons

An efficient forward modeling algorithm for calculation of gravitational fields in spherical coordinates is developed for 3‐D large‐scale gravity inversion problems. The 3‐D Gauss‐Legendre quadrature (GLQ) is used to calculate the gravitational fields of mass distributions discretized into tesseroids. Equivalence relations in the kernel matrix of the forward modeling are exploited to decrease storage and computation time. The numerical tests demonstrate that the computation time of the proposed algorithm is reduced by approximately 2 orders of magnitude, and the memory requirement is reduced by N′λ times compared with the traditional GLQ method, where N′λ is the number of the model elements in the longitudinal direction. These significant improvements in computational efficiency and storage make it possible to calculate and store the dense Jacobian matrix in 3‐D large‐scale gravity inversions. The equivalence relations can be applied to the Taylor series method or combined with the adaptive discretization to ensure high accuracy. To further illustrate the capability of the algorithm, we present a regional synthetic example. The inverted results show density distributions consistent with the actual model. The computation took about 6.3 hr and 0.88 GB of memory compared with about a dozen days and 245.86 GB for the traditional 3‐D GLQ method. Finally, the proposed algorithm is applied to the gravity field derived from the latest lunar gravity model GL1500E. Three‐dimensional density distributions of the Imbrium and Serenitatis basins are obtained, and high‐density bodies are found at the depths 10–60 km, likely indicating a significant uplift of the high‐density mantle beneath the two mascon basins.


Introduction
A fast and accurate forward modeling algorithm for computation of the gravity responses at a set of observation points for a given subsurface density distribution is required for many geophysical applications. First of all, it is used to estimate the gravity effect of some predefined density structure in both regional and global levels, as is commonly done for the crustal models based on existing seismic data (e.g., Kaban et al., 2004Kaban & Mooney, 2001). It is also widely applied in topographic and isostatic correction and for geoid determination (Forsberg & Tscherning, 1997;Heiskanen & Moritz, 1967;Hirt et al., 2010;. The direct computations of the gravity effect are also used in some types of the gravity inversion methods. This is typically employed for imaging subsurface density anomalies for mining problems (e.g., Kamm et al., 2015;Li & Oldenburg, 1998Martinez et al., 2013;Mosher & Farquharson, 2013;Nabighian et al., 2005) and investigating crust-mantle structures of planetary interiors (e.g., Alvarez et al., 2012;Braitenberg & Ebbing, 2009;Liang et al., 2014;Neumann et al., 2004;Wieczorek et al., 2013;Wieczorek & Phillips, 1998).
For computation of the gravity effect in a Cartesian coordinate system, subsurface mass distributions are typically discretized into right rectangular prisms since they provide a relatively simple and useful way to approximate complicated density sources without "holes" (Caratori Tontini et al., 2009). High-efficiency forward modeling methods in Cartesian coordinates are developed to evaluate the Newton's volume integral (Blakely, 1995), such as the fast Fourier transform method (Bhattacharyya & Navolio, 2010;Caratori Tontini et al., 2009;Parker, 2010;Shin et al., 2006), Gauss-fast Fourier transform method (Wu & Tian, 2014;Zhao et al., 2018), and adaptive multilevel fast multipole method (Ren et al., 2017).
In spherical coordinates, subsurface mass distributions are typically discretized into tesseroids rather than rectangular prisms (Anderson, 1976) to consider the curvature of the Earth when tackling large-scale problems. With the fast development of the satellite gravity gradiometry, global gravity data with high resolution become available, which makes the interpretations on both regional and global scales feasible (Braitenberg, 2015;Reguzzoni et al., 2013). However, the Newton's integral in spherical coordinates cannot be solved analytically except for the case of a homogeneous spherical shell and a spherical cap along the polar axis Mikuska et al., 2006). Numerical integration methods have been used to solve this problem, including the 2-D, 3-D Gauss-Legendre quadrature (GLQ) (Asgharzadeh et al., 2007;Wild-Pfeiffer, 2008), the Taylor series expansion (Deng et al., 2016;Grombein et al., 2013;Heck & Seitz, 2007), and approximation of tesseroids by other forms, which induce nearly the same gravity field but which effect can be easily computed (e.g., Kaban et al., 2004;. Also, there are widely used direct methods, which operate in the spherical harmonic domain providing sufficient accuracy of the computed geoid, gravity field, and its derivatives (e.g., Root et al., 2016).
The Taylor series expansion produces accurate results at low latitudes. However, the accuracy decreases rapidly toward the polar regions, as the tesseroids degenerate into approximately triangular shapes. Furthermore, singularities exist at the poles in the modeling of the gravitational gradients . The 3-D GLQ methods approximate Newton's volume integral by a weighted sum of the effects of point masses. The accuracy of the integration can be improved by increasing the number of point masses at the expense of an exponential increase in computation time. Ku (1977) proposed an empirical criterion to numerically evaluate the vertical component of the gravitational acceleration of right rectangular prisms, in which the distance between point masses should be greater than the distance to the computation point. Li et al. (2011) designed an algorithm to enforce the criterion proposed by Ku (1977). Recently, Uieda et al. (2016) published an open-source software package Tesseroids and a modified version of the adaptive discretization of Li et al. (2011), providing a sufficiently accurate way to evaluate the Newton's integral by the 3-D GLQ method. However, the computational efficiency of the forward modeling algorithm is still a severe issue in large-scale 3-D gravity inversion. The memory requirement for storing the kernel matrices (i.e., the Jacobian matrix) is another severe problem. Previous works that tackle this issue generally involve iterative algorithms (e.g., Uieda & Barbosa, 2012) or some form of lossy compression (e.g., wavelet transform and compression (Li & Oldenburg, 2003), moving footprints (Barnes & Barraud, 2012;Barnes & Lumley, 2011), and adaptive data down-sampling (Foks et al., 2014)).
In this study, an efficient forward modeling method for tesseroids is proposed to significantly decrease the computational time as well as the memory requirement. The numerical integration is performed by the 3-D Gauss-Legendre integration (Wild-Pfeiffer, 2008). We carry out numerical investigations to verify the performance of our method and to provide comparisons for computation time and memory requirements. Subsequently, we apply the algorithm to a 3-D gravity inversion of the latest lunar gravity model GL1500E (Konopliv et al., 2014).

Gravitational Forward Modeling Using 3-D GLQ
On both regional and global scales, the source region is commonly divided into many small regular spherical elements, that is, tesseroids. The total gravity effect of the source region is obtained by superposition of the individual source effects. As shown in Figure 1, the tesseroid is a mass element bounded by three pairs of surfaces: a pair of concentric spheres (r 1 , r 2 ), a pair of meridional planes (λ 1 , λ 2 ), and a pair of coaxial circular cones defined by the parallels (φ 1 , φ 2 ). For a computation point P outside the tesseroid, the formulas of the

10.1029/2019JB017691
Journal of Geophysical Research: Solid Earth gravitational fields with respect to the local north oriented coordinate system are given by Grombein et al. (2013) (following the notation of Uieda et al., 2016): where α,β ∈ (x, y, z); ρ is the density of the tesseroid; G is the gravitational constant; δ αβ is Kronecker's delta (δ αβ = 1 if α = β and δ αβ = 0 if α ≠ β), and Since the integrations concerning λ′ and φ′ comprise elliptical integrals in equations (1)-(3), they cannot be solved in a closed analytical form . Here we follow Asgharzadeh et al. (2007) and Uieda et al. (2016) to perform the numerical integration using the 3-D GLQ.
The GLQ for 3-D volume integrals in equations (1)-(3) can be written in a more general form as (Uieda et al., 2016) where A = (r 2 − r 1 )(λ 2 − λ 1 )(φ 2 − φ 1 )/8; I, J and K are the numbers of Gaussian nodes in the radial, longitudinal, and latitudinal directions, respectively; and are the Gaussian nodes scaled to the actual coordinates within the tesseroid; are ith, jth, and kth Gaussian weights and nodes within the range of [−1, 1]. The commonly used Gaussian weights and nodes for the GLQ formulae up to 5 orders are given by Wild-Pfeiffer (2008). As we can see from equation (5), the GLQ method approximates the 3-D volume integrals by a weighted sum of the point mass Uieda et al., 2016). In this paper, we employ two Gaussian nodes in each direction.

Equivalence of Kernel Matrix
The forward modeling of gravitational fields using the above 3-D GLQ method is time consuming, especially when the high-order Gaussian quadrature is applied to guarantee high modeling accuracy or when the model mesh and computational points are dense. In this section, we develop an efficient forward modeling algorithm using regular tesseroid meshes.
As shown in Figure 2, we divide the subsurface source region into N′ r segments along the radial direction, N′ φ and N′ λ in the latitudinal and longitudinal directions, respectively, defining a total cell number of N tess = N′ r × N′ φ × N λ . For each tesseroid, its geometric center is (r′ l , φ′ n , λ′ m ) with an equal mesh interval of Δr, Δφ, and Δλ, respectively. The density of each tesseroid is assumed to be constant. The observations are assumed to be on a spherical patch located at the height of r 0 and distributed on a regular grid of N obs = N φ × N λ .
Then the gravitational potential, accelerations, and gradient tensor at an observation point (r 0 , φ q , λ p ) can be written as the form of a summation of each tesseroid where p = 1, 2, …, N λ , and q = 1, 2, …, N φ ; ρ lnm is the density of the tesseroid numbered (l, n, m); and are the kernel functions of the gravitational potential, accelerations and gradient tensor. Here (r, φ, λ) in equation (4) is replaced by (r 0 , φ q , λ p ). The kernel functions depend only on the volume of the calculated tesseroid and the geometric distance between the tesseroid and the observation point.
Equations (7)-(9) can be written in the matrix form as where ρ represents the density vector of the tesseroids with the dimension N tess ; V, g α , and g αβ are the vectors containing the forward results with the dimension N obs ; K v , K α , and K αβ are the kernel matrices with the dimension N obs × N tess , in which the elements are given by equations (10)-(12).
When a fine discretization is used, the kernel matrices in equations (13)-(15) are large. For example, in a simple global gravity inversion, the Earth is discretized into N φ = N′ φ = 180, N λ = N′ λ = 360, and N′ r = 10 (i.e., partitioning the Earth into 1°× 1°in the latitudinal and longitudinal directions and 10 layers in the radial direction). The memory requirement for storing each kernel matrix is approximately 168 GB in the single precision or 336 GB in the double precision floating point numbers. Furthermore, the calculation of the above kernels is time consuming.

Journal of Geophysical Research: Solid Earth
Fortunately, if the tesseroid model is discretized into a regular mesh and the computation points are on a regular grid of a constant height that aligns with the tesseroid mesh (as shown in Figure 2), a series of equivalence relations can be exploited in the kernel matrix. The amplitudes of the elements in the kernel matrix resulting from the same sizes of tesseroids and same computation distances are equal. Henceforth, we will refer to these relations as the equivalence of kernel matrix.
In view of the relation with (λ′ − λ) in equation (4), the kernel functions in equations (10)-(12) can be written as According to equations (1)-(4), K v , K x , K z , K xx , K xz , K yy , and K zz are even functions of (λ′ − λ). Also, the meshes of both source region and observation surface are in an equal interval and the observation points align with the center of model meshes. Taking K v as an example, we have and we define this relation as a swapping equivalence.
To be more explicit, we take one layer of tesseroids in Figure 3 as an example to illustrate the swapping equivalence. For one layer of tesseroids, the variables r 0 and r ′ l in the radial direction in equation (19) can be omitted. For simplicity, we use K(q, p; n, m) to represent the gravity response on the observation point P(r 0 , φ q , λ p ) produced by the tesseroid Q r ′ l ; φ ′ n ; λ ′ m À Á . As shown in Figure 3, the kernel elements (for K v , K x , K z , K xx , K xz , K yy , and K zz ) along the crossed dashed red (or blue) lines have equal values.
On the other hand, when m ≤ p, we also have and we name this relation as shifting equivalence. As shown in Figure 4, the kernel elements for the gravitational potential, accelerations, and gradient tensor are equal along parallel dashed red (or blue) lines.
When m > p, we can first use the swapping equivalence of equation (19), then adopt the shifting equivalence, Combining both the shifting equivalence and the swapping equivalence, we obtain the equivalence of the kernel matrix for the gravitational potential as Figure 3. The swapping equivalence of the kernel matrix. A model with a 6 × 6 observation grid and one 6 × 6 layer of tesseroids is taken as an example. For example, the kernel element K(3, 2; 2, 5) represents the gravitational response on the observation point P(r 0 , φ 3 , λ 2 ) produced by the tesseroid Q r ′ l ; φ ′ 2 ; λ ′ 5 À Á with unit density. The kernel elements represented by the two crossed dashed red (or blue) lines are equivalent (for K v , K x , K z , K xx , K xz , K yy , and K zz ) or minus signs (for K y , K xy , and K yz ).

Journal of Geophysical Research: Solid Earth
As for K y , K xy , and K yz , since they are odd functions of (λ′ − λ), we should add a symbolic function relative to equation (22). Taking K y as an example, we have Using this novel equivalence strategy, we just need to calculate the kernel elements in the first longitudinal column, and then the rest of the kernel elements can be obtained by the equivalence relations (equation (22) or (23)), which yield a great improvement in computational efficiency. In the meantime, the memory requirement for storing these kernel vectors is reduced to 1/N′ λ of the original one. For more general cases, the source region may consist of several layers, and then the kernel matrix for multilayer tesseroids can be calculated by repeating the above one-layer process. Moreover, the equivalence of the kernel matrix presented in this section can be equally applied to the Taylor series expansion method (Heck & Seitz, 2007).
The equivalence of the kernel matrix can be incorporated in the adaptive discretization algorithm (Uieda et al., 2016) to increase accuracy of 3-D GLQ. Using the adaptive discretization algorithm, any tesseroid is subdivided into smaller ones if the observation point is close to the tesseroid until all the newly meshed tesseroids meet the given distance-size ratio (D) (Uieda et al., 2016). Because of the combination of the kernel matrix equivalence, the adaptive discretization strategy needs to be applied only to the tesseroids in the first longitudinal column. After the adaptive discretization, the kernel elements produced by the original tesseroids are calculated by superposition directly, with no use of the equivalence of kernel matrix strategy in this column. Thus, the kernel elements for the first column can be evaluated with high accuracy by the adaptive discretization algorithm, and for the rest of the kernel elements, the gravity effect can be obtained with the same accuracy using the kernel matrix equivalence.
In the above formulations, the spherical tesseroids are used to discretize subsurface mass distributions by applying the equivalence kernel matrix strategy. In most cases, the spherical approximation bounded by geocentric spherical coordinates yields sufficiently accurate results (Heck & Seitz, 2007), and this is also a reasonable approximation for regional gravity applications or for near-sphere planets (e.g., Asgharzadeh et al., 2007, Du et al., 2015, Liang et al., 2014, Wild-Pfeiffer, 2008. However, in terms of the Earth (or other ellipsoid planets), the nonsphericity effect may be significant and nonnegligible. The ellipticity of the Earth can be taken into account in the forward modeling by the ellipsoidal tesseroids with a latitude-dependent radius (Heck & Seitz, 2007). The equivalence of the kernel matrix strategy is also valid when the ellipsoidal tesseroids with a latitude-dependent radius are used in the discretization.

The 3-D Inversion in Spherical Coordinates
In this section, we introduce the application of the developed forward algorithm for 3-D gravity inversion problems. Here the Tikhonov regularization method (Tikhonov & Arsenin, 1977) is used. This approach seeks to construct the simplest model by minimizing a data misfit function and a model objective function weighted by a regularization parameter, where Φ d is the data misfit function, Φ m is the model objective function, and μ is the regularization parameter that determines the trade-off between Φ m and Φ d . The regulation parameter μ can be determined by the L-curve criterion (Calvetti et al., 2000;Hansen, 2001), for example.

Journal of Geophysical Research: Solid Earth
We use a weighted squared L 2 -norm data-misfit function similar to (Li & Oldenburg, 1998) where d obs = (d 1 , …, d N ) T is the observation data vector, d pre is the predicted data vector, W d = diag{1/σ 1 , …, 1/σ N } is a diagonal data weight matrix, and σ i is the uncertainty associated with the ith datum.
To compensate the decay of sensitivity with depth, we use a normalized depth-weighted model objective function (Liang et al., 2014) where α s , α r′ , α φ′ , and α λ′ are the coefficients that affect the relative importance of different components in the objective function, dv = r′ 2 cosφ′dr′dφ′dλ′, ρ ref is the reference model, and w(r´) is the depth weighting function in the radial direction where r 0 is the radius of the observations, and r ′ 1 is the radius of the first layer tesseroid mesh. In the data misfit function (equation (25)), the improved forward algorithm introduced in previous sections is used to calculate the predicted gravity field d pre . By minimizing equation (24), the inversion problem is transformed into a linear system of equations, which is solved by the conjugate gradient method (e.g., Pilkington, 1949, Purucker et al., 2000.

Numerical Examples
In order to test numerical performance of the proposed forward method, we perform synthetic tests on a spherical shell model to examine the computational efficiency and memory requirement. A synthetic regional model with known density distributions is also designed to demonstrate the practical application of the forward method in large-scale 3-D gravity inversion.

Spherical Shell Model
According to the adaptive discretization algorithm used to improve the accuracy of the forward method, the original tesseroid is subdivided into smaller tesseroids as the computation point gets closer to it (Uieda et al., 2016). Therefore, the altitude of the computation surface inversely affects the computation time. On the other hand, the number of tesseroids in the mesh directly influence the computation performance. In this section, we investigate the computational efficiency by comparing our algorithm with the software Tesseroids (Uieda et al., 2016) for different observation heights and tesseroid mesh numbers.
A homogenous spherical shell is employed as the reference model because it has analytical solutions along the polar axis  and can be discretized into tesseroids perfectly. The spherical shell has the density 1,000 kg/m 3 and thickness 100 km ranging from 6,271 to 6,371 km (the Earth's radius). The shell is discretized into regular meshes along the horizontal directions with just one layer in the radial direction.
We first test the computational efficiency for different observation heights. The spherical shell is discretized into 180 × 360 tesseroids with an equal interval of 1°in both the latitudinal and longitudinal directions. The observation surface has the same discretization as the spherical shell. Its elevation ranges from 1 to 250 km above the top surface of the spherical shell. The computation time of g z and g zz for different observation heights is shown in Figures 5a and 5b. For convenience, the absolute computation time is normalized by the slowest calculation time, henceforth referred to as relative computation time.

Journal of Geophysical Research: Solid Earth
As shown in Figures 5a and 5b, the execution time of our optimized method is reduced by approximately 2 orders of magnitude compared to the method of Uieda et al. (2016). The improved performance is achieved due to the use of kernel matrix equivalence. In this case, the increase in computational efficiency is~50 times for g z and~80 times for g zz compared with Uieda et al. (2016). Another remarkable characteristic is that the computation time of g z for both methods does not vary significantly with the observation heights, while the computation time of g zz increases as the observation height decreases. To obtain accuracy with a maximum error of 0.1%, the distance-size ratio D of 8 is used for gravity gradients and 1.5 for the gravitational acceleration (Uieda et al., 2016). Therefore, the original model is subdivided into more tesseroids for g zz , especially when the observation surface is relatively low.
We also investigated the computation time and memory requirement for g zz in cases of different model size. The spherical shell was divided into small tesseroids by a set of intervals ranging from 0.25°to 20°. The observation surface is placed at 10-km height with the same discretization as the shell. Figures 5c and 5d show the relative computation time and memory usage for computation of g zz with different mesh intervals (i.e., the size of the discretization). Table 1 presents the statistics of the absolute computation time and memory occupation.
The results show that the computation time of both methods increases exponentially with the decrease of the discretizing interval ( Figure 5c and Table 1). However, for our method, the increase in computation time is more gentle compared with the method of Uieda et al. (2016), and the time cost is reasonable for the cases with the interval less than 1°. With the increase of the mesh numbers, the difference in computation time between two methods become more significant. As shown in Figure 5c, the difference is about 1 order of magnitude for meshes with an interval greater than 5°and 2 orders of magnitude for meshes with an interval less than 3°. As for large-scale 3-D gravity inversion problems, the discretization in depth normally implies more than one layer, and the inversion needs dozens or even hundreds of forward iterations; the improved performance is more evident and significant, which will be demonstrated in the next section.

Journal of Geophysical Research: Solid Earth
Another advantage of the new method is the low memory occupation as shown in Figure 5d and Table 1. As mentioned above, it is not necessary to store the kernel matrix in a forward application. However, storing the dense Jacobian matrix is probably inevitable in an inverse method. With the equivalence of kernel matrix, the proposed algorithm reduces the memory requirement by N′ λ times when compared to the storage of the full kernel matrix, where N′ λ is the number of model elements in the longitudinal direction.
For the computation accuracy, Figure 6 shows the maximum relative errors of g z varying with the height elevation of the observation surface using the traditional 3-D GLQ method, Uieda et al. (2016) method, and our method on the basis of the same spherical shell model as shown in Figure 5a. The relative errors of all three methods decrease with the increase of the observation height. Owing to the use of the adaptive discretization (Uieda et al., 2016), the error for the proposed algorithm is exactly the same as that one of Uieda et al. (2016). The errors of these two methods are reduced by 2 orders of magnitude compared to the traditional method when the observation surface closes to the source region. This test demonstrates that the accuracy of the forward modeling is not affected by the use of the equivalence of kernel matrix.

Synthetic Mascon Model
In this section, we design a synthetic lunar mascon model with known density distributions to verify applicability of the new method for a large-scale 3-D inversion. The model consists of two blocks with the same density contrast of 600 kg/m 3 (Figure 7a). The model ranges from −35°to 35°longitude, 15°to 50°north latitude and 0 to 100 km in depth. This model region is discretized into a mesh of 280 × 140 × 20 tesseroids with a spacing of 0.25°, 0.25°and 5 km in the longitudinal, latitudinal, and radial directions, respectively. Li and Oldenburg (1996) suggested upward continuation of the observation data in the inversion to a constant height approximately equal to the width of the surface cells in the model. Considering the cell size of the source model, the gravity anomalies used in the inversion are calculated at the height of 10 km above the reference radius of 1,738 km (i.e., the lunar radius) by using the proposed forward algorithm. Gaussian random noise with zero mean and standard deviation of 1% of the maximum amplitude of the synthetic data is added to the gravity anomalies to simulate the observation errors (Figure 7b).  (2016) is not necessary to store the kernel matrix in forward modeling. The memory requirement is to store the kernel matrix (i.e., Jacobian matrix) for the applications in gravity inversion. For example, the total kernel elements for cases with the mesh interval of 0.25°are 1.075 × 10 12 (i.e., 720 × 1,440 × 1 × 720 × 1,440), and the corresponding memory requirement is about 8,599.6 GB in double-precision floating point numbers (i.e., 1.075 × 10 12 × 8). b The computation time is estimated according to the computation time of our method (11.4 hr) multiplying by the average increasing times (71.57) for the mesh intervals of 0.5°-3°; s = second; hr = hours; d = day. This model is the same as in Figure 5. GLQ = Gauss-Legendre quadrature.
We perform the inversion of the synthetic data using the inversion method described in section 2.3. The predicted data from the inversion model and the data residuals are shown in Figures 7c and 7d. The inversion is performed using parameters of μ = 5.0 × 10 2 obtained from the L-curve shown in Figure 8, α s = 1.0 × 10 −9 , α r′ = 1.0 × 10 −7 , α φ′ = α λ′ = 1.47 × 10 −8 , and ρ ref = 0 (i.e., no reference model is used). As shown in Figure 7d, the inversion residuals are small. The inverted density distributions are shown in Figure 9. The estimated density distribution is able to recover the right locations of the two anomalous bodies in the actual model (Figure 9), which demonstrates the validity and capability of the inversion algorithm. Note that the recovered density of the anomalies (~300 kg/m 3 ) is significantly lower than the true one (600 kg/m 3 ) due to vertical smearing to a greater depth. This is because the gravity data only do not resolve the radial density distribution, which entirely depends on the applied radial weighting function (Liang et al., 2014).
We also tested the inversion using the data at a higher observation surface (50 km above the reference radius of 1,738 km). The recovered densities are similar to the ones in Figure 7, but their amplitudes are smaller than that of those ones obtained from the observation data at the height of 10 km. This is a result of the inherent nonuniqueness of the inversion calculations.
In the inversion tests, the entire computation time used in the inversion on the basis of our forward modeling algorithm is approximately 6.3 hr (in a desktop with i7-4790k CPU, 16 GB memory). It would be~13.1 days for the computation without using the kernel matrix equivalence in the forward modeling (estimated according to Figure 5). Meanwhile, the memory required for storing the kernel matrix is about 0.88 GB in our algorithm, while it would be 245.86 GB without the equivalence of kernel matrix.

Application to the Lunar Mascons
We now apply the proposed method for a 3-D inversion of real lunar gravity data. We select the nearside Imbrium and Serenitatis mascon basins as the study area, which is located from 35°W to 35°E and from 15°N to 50°N as shown in Figure 10a. Because the Moon is characterized by an insignificant ellipticity with a semimajor axis of 1,738.1 km and a semiminor axis of 1,736.0 km, the nonsphericity effect is insignificant. For simplicity, we utilize spherical tesseroids in the source discretization for this lunar mare application.
Here we use the latest lunar gravity field model GL1500E (Konopliv et al., 2014, http://pds-geosciences. wustl.edu/missions/grail/default.htm) and the topography model LRO_LTM05_2050 (Smith et al., 2010, http://pds-geosciences.wustl.edu/missions/lro/lola.htm) to obtain the Bouguer gravity anomalies. The gravity model GL1500E (Konopliv et al., , 2014 is the latest spherical harmonic solution of the lunar gravitational field derived from the GRAIL mission. It extends to degree and order 1,500, and exhibits unprecedented detail of the gravity field. The lunar gravity observations (i.e., free air gravity disturbances) are computed by a spherical harmonic expansion, which is given by (e.g., Ditmar et al., 2003;Wieczorek, 2007) where (r, φ, λ) is the coordinates of observation points; R is the reference radius; GM is the gravitational constant times the mass of the central body; l is the degree, m is the order; C lm and S lm are normalized spherical harmonic coefficients of degree l and order m; and P lm are the fully normalized associated Legendre functions.

Journal of Geophysical Research: Solid Earth
The free air gravity disturbances (Figure 10b) are calculated to degree and order 720 at the elevation of 10 km above the reference radius of 1,738 km. The gravity effect of the topography (i.e., terrain correction, Figure 10c) is calculated using the spectral method (Wieczorek & Phillips, 1998) at 10-km height with the density of 2,560 kg/m 3 (Zuber et al., 2013). It is truncated to the same degree and order of 720 as the free air gravity anomalies. The Bouguer gravity anomalies shown in Figure 10d are obtained by removing the gravity effect of the topography (Figure 10c) from the free air gravity anomalies (Figure 10b). The data spacing of 0.25°is employed in all calculations.
The source region is discretized into 280 × 140 tesseroids in the longitude and latitude directions both with a horizontal interval of 0.25°, which is consistent with the spacing of the observed data ( Figure 10d). Based on the thickness of the lunar crust (Wieczorek et al., 2013), our model domain along the radius direction is designed to occupy a spherical shell extending from the reference sphere of 0 km to a depth of 100 km, which is divided into 20 layers with an equal interval of 5 km. We carry out the 3-D inversion of the observed Bouguer gravity anomalies (Figure 10d) to recover the 3-D density distribution in the model domain using the same regularization parameters as in the synthetic mascon model. The inversion results are displayed in Figures 11 and 12.
The inverted results along four different cross sections ( Figure 11) show that the high-density anomalies are mainly located beneath the centers of the Imbrium and Serenitatis basins with the top boundary consistent with the Moho interface revealed by Wieczorek et al. (2013). The high-density anomalies mainly concentrate at the depths 18-40 km beneath the Imbrium and 13-60 km beneath the Serenitatis, which agree with the previous estimate by Liang et al. (2014). Hikida and Wieczorek (2007) previously indicated that the

Journal of Geophysical Research: Solid Earth
azimuthally averaged crustal thickness of Imbrium and Serenitatis are~20 and 16 km, respectively, which is in a good agreement with our results.
As shown in Figures 11d and 12, the distribution of the high-density anomalies is highly correlated with the high positive Bouguer gravity anomalies (Figure 10d), while has little consistency with the extent of the flat topography of the basins (Figure 10a). Previous studies suggest that the high gravity anomalies in lunar mascon basins are mainly produced by the uplift of high-density mantle materials and basalt filling in basins after planet impacts (Kaula, 1971;Konopliv et al., 2014;Melosh, 1975). The density distribution with depth reveals the upwelling of the dense material, which is consistent with the hypothesis of the mantle uplift as the primary source of the high positive Bouguer gravity anomalies in the Imbrium and Serenitatis basins, as suggested by Liang et al. (2014) and Melosh et al. (2013). The Apennine Mountains and the Caucasus Mountains which lie between the Imbrium and Serenitatis basins exhibit negative density anomalies as shown in Figures 11a and 12.
It is worth noting that the problem of underestimating the amplitude of the density anomaly due to vertical smearing that occurred in the synthetic model in section 3.2 likely exists in this application to the lunar mascons. Inclusion of some a priori constraints such as the Moho interface and/or crustal density variations in the inversion would improve the quality of the recovered density structure, which is a future task. The computation time for this example is approximately the same as for the synthetic model in section 3.2 (~6.3 hr) because the same number of tesseroids and observation points were employed.

Conclusions
This study introduced an optimized forward-modeling method of gravitational fields in spherical coordinates. The novel strategy proposed (kernel matrix equivalence) improves the computation speed and reduces the memory requirement in the forward modeling by approximately 2 orders of magnitude. The kernel matrix equivalence is not only suitable for the 3-D GLQ method, as presented in this study, but also for other numerical methods employing the Newton's integral (Fukushima, 2018;Heck & Seitz, 2007;Wild-Pfeiffer, 2008). In addition, the strategy is easy to combine with adaptive discretization (Uieda et al., 2016) to ensure high accuracies of the forward gravity modeling.
Note that the kernel matrix equivalence strategy works under the condition that the model is divided into a regular tesseroid mesh and that the computation points are on a regular grid of constant height that aligns with the tesseroid mesh. For the observation data in the inversion, the gravity observation data can be easily calculated from the global gravity field models on arbitrary points on or above the surface of the planets using equation (28). In particular, the gravity observation data are distributed at a spherical or ellipsoid surface above the source when a constant radius r or a latitude-dependent radius is used in equation (28). Therefore, it is easy to meet the requirement of the spherical (or elliptical) surface distribution assumption of observational data in the proposed method. Alternatively, the gravity data can be provided for the orbit heights above the planet's surface. These data have been also used for lithospheric modeling (e.g., Bouman et al., 2016). Because these observation data might not be presented on a regular grid at a constant height, the proposed algorithm would not work directly for these observed data. However, the regular gravity grids can be obtained by data processing before the inversion to meet the required discretization criteria (e.g., Bouman et al., 2016). For example, available gravity grids with a constant height at the satellite altitude with respect to the reference ellipsoid or spheroid can be used as the observation data for our algorithm, such as the gravity gradient grids from GOCE (Bouman et al., 2015;Bouman et al., 2016). As pointed out by Bouman et al. (2015Bouman et al. ( , 2016, gravity grids are easier to handle than the original orbit data because the orbit data have complicated error characteristics and are presented in a rotating frame at varying heights. For the gravity grids with a constant height at the satellite altitude with respect to the oblate reference ellipsoid, the kernel matrix equivalence strategy is still valid when the spherical tesseroids with a latitude-dependent radius are used.
We performed a series of synthetic tests on a spherical shell model to evaluate the computational efficiency and memory occupation of the algorithm. The results show that the computation time of the proposed algorithm is reduced by about 2 orders of magnitude, and the memory requirement is reduced by N′ λ times compared with the calculation of a full kernel matrix, where N′ λ is the number of model elements in the longitudinal direction. These significant improvements in computational speed and storage usage provide a solution to the problems of calculating and storing a dense Jacobian matrix without loss of accuracy in 3-D large-scale gravity inversions.
A 3-D regional inversion on a synthetic lunar mascon model was carried out to verify the capability of the algorithm in handling large-scale gravity inversions. The recovered lateral density distributions are qualitatively in a good agreement with the input model. This inversion based on the new algorithm took approximately 6.3 hr computation time and 0.88 GB memory, compared with about a dozen days and 245.86 GB for the inversion using the full kernel matrix, which further demonstrates high efficiency of the proposed forward-modeling method both in computation time and memory usage.
We applied our forward method to a regional 3-D inversion of data from the latest lunar gravity model GL1500E. A 3-D density distribution is obtained beneath the Imbrium and Serenitatis mascon basins, which feature prominent high-density anomalies beneath their centers. The top boundary of the high-density anomalies agrees with the Moho interface estimated in previous studies. It is consistent with the hypothesis that the mantle uplift is the primary source for the high positive Bouguer gravity anomalies in the Imbrium and Serenitatis basins.