Modeling P waves in seismic noise correlations: Application to fault monitoring using train traffic sources

The theory of Green’s function retrieval essentially requires homogeneously distributed noise sources. Even though these conditions are not fulfilled in nature, low-frequency (<1 Hz) surface waves generated by ocean-crust interactions have been used successfully to image the crust with unprecedented spatial resolution. In contrast to low-frequency surface waves, highfrequency (>1 Hz) body waves have a sharper, more localized sensitivity to velocity contrasts and temporal changes at depth. In general, their retrieval using seismic interferometry is challenging, and recent studies focus on powerful, localized noise sources. They have proven to be a promising alternative but break the assumptions of Green’s function retrieval. In this study, we present an approach to model correlations between P waves for these scenarios and analyze their sensitivity to 3D Earth structure. We perform a series of numerical experiments to advance our understanding of these signals and prepare for an application to fault monitoring. In the considered cases, the character of the signals strongly diverges from Green’s function retrieval, and the sensitivity to structure has significant contributions in the source direction. An accurate description of the underlying physics allows us to reproduce observations made in the context of monitoring the San Jacinto Fault in California using train-generated seismic waves. This approach provides new perspectives for detecting and localizing temporal velocity changes previously unnoticed by commonly exploited surface-wave reconstructions.


INTRODUCTION
The retrieval of low-frequency (<1 Hz) surface waves from correlations of ocean-generated microseismic noise has been used extensively to image and monitor the crust (Shapiro et al. 2005;Brenguier et al. 2008). In comparison to surface waves, body waves show less sensitivity to velocity perturbations near the surface in the region between the source and the sensor, and a sharper sensitivity at greater depths. For example, direct/refracted high-frequency (>1 Hz) P waves generated using active sources have been used to image sharp structural boundaries down to Moho depths (Davenport et al. 2017). Nakata et al. (2015) were among the first to report the retrieval of high-frequency direct and refracted P waves from the correlation of cultural noise in the Los Angeles basin. Following that pioneering work, more studies have shown that direct P waves traveling at distances of a few kilometers can emerge from the correlation of powerful, localized noise sources such as surf break (Roux et al. 2005;Nakata et al. 2016;Brenguier et al. 2020) or car/train traffic (Nakata et al. 2011;Brenguier et al. 2019;Dales et al. 2020;Pinzon-Rincon et al. 2021). However, the correlation of seismic waves generated by localized noise sources such as vehicle traffic breaks the assumption of uniformly-distributed noise sources required for the process of Green's function retrieval (Lobkis & Weaver 2001;Wapenaar 2004). This work aims at better understanding the origin of these interferometric body waves in the case of localized sources, including their 3D sensitivity to velocity perturbations at depth. Leaving the realm of Green's function retrieval requires an alternative approach, and we thus follow the approach of correlation modeling. The corresponding field of research has recently left the stage of theoretical considerations and has been applied to both source inversions and tomography (Ermert et al. 2017;Datta et al. 2019;Xu et al. 2020;Sager et al. 2020;Ermert et al. 2021;Igel et al. 2021). We outline the method in section 2 together with modifications required to focus on correlations between P waves. In section 3, we perform a series of numerical experiments to identify and understand the underlying principles. Section 4 aims to compare modeled and real correlation data in the context of the San Jacinto Fault passive seismic monitoring project (Brenguier et al. 2019). The results show that situations which strongly diverge from the requirements for Green's function retrieval can still be useful for monitoring purposes, thus opening up new avenues for the appli-cation of this approach to the detection of tectonic, volcanic and human-induced deformation transients in the shallow crust.

MODELING CORRELATIONS AND COMPUTING SENSITIVITY KERNELS
Correlation functions detect coherently propagating energy. Waveforms that are difficult to interpret directly, e.g. ambient noise recordings or earthquake coda waves, can thus be turned into accessible and valuable sources of information. Correlating all available recordings with a reference trace is a linear operation, and the resulting traces still satisfy the wave equation. Thus, wavefront tracking techniques originally developed for earthquake recordings (Lin et al. 2012) can readily be applied to correlations to extract subsurface information underneath dense arrays (Bowden et al. 2015(Bowden et al. , 2017. If the original wavefield fulfills specific assumptions, correlation functions can be re-interpreted in terms of reconstructed Green's functions (Lobkis & Weaver 2001;Wapenaar 2004;Wapenaar & Fokkema 2006). This idea has been used extensively at various seismological scales, when either coda waves and/or ambient seismic noise records satisfy these conditions. Although this approach has been tremendously successful over the last two decades, recent developments address its shortcomings by interpreting correlations as self-consistent observables (Tromp et al. 2010;Hanasoge 2013;Pha . m et al. 2018;Sager et al. 2018b;Tkalčić et al. 2020). A subset of these studies relies on forward modeling correlation wavefields instead of correlating simulated wavefields. The former has advantages for inverse modeling, and we will detail the approach in the following together with modifications necessary to focus on correlations of body waves, also called body-wave interactions in the following.

Forward problem
Modeling correlations as self-contained observables originated in helioseismology (Duvall Jr et al. 1993;Woodard 1997;Gizon & Birch 2002) and was translated to terrestrial seismology by Tromp et al. (2010). The pivotal trick for the simulation of correlations is to apply the aforementioned linear operator to the source time function used in a numerical solver -a strategy routinely applied for filtering operations. In order to see how this works for correlation functions, it helps to compare the expression for a correlation function to the representation theorem. The latter links the i-component of a wavefield ui(x1) to the Green's function Gi,n(x1, ξ) due to an impulse in the n-direction and a generic forcing term Nn(ξ). In the frequency domain it is given by where Einstein's summation convention applies. The correlation function Cij(x1, x2) between two recordings can be written as Reordering the integrals to Gi,n(x1, ξ 1 ) G * j,m (x2, ξ 2 )N * m (ξ 2 )Nn(ξ 1 )dξ 2 dξ 1 and comparing it to equation (1) reveals that the linear operator is applied to Nn(ξ 1 ) and the whole part in square brackets plays the role of the forcing term. The new source time function excites a special wavefield -the correlation wavefield -and a correlation function is simply a wavefield recording. The source for the correlation wavefield can be simplified under the assumption of spatially uncorrelated noise sources, which is commonly invoked in previous studies (Wapenaar 2004;Wapenaar & Fokkema 2006). Equation (3) then simplifies to where Snm(ξ) denotes the power-spectral density distribution, which describes the excitation of ambient noise in space and frequency. The numerical implementation of equation (4) requires three steps: (i) Use source-receiver reciprocity for the second Green's function and simulate Gm,j(ξ, x2), later referred to as generating wavefield, by injecting an impulse at the reference station x2 in jdirection.
(ii) Evaluate the source for the correlation wavefield, i.e. timereverse the generating wavefield and combine it with Snm(ξ).
(iii) Simulate the correlation wavefield with the result of the previous step as source time function. Bowden et al. (2020) illustrate the recipe in a schematic and contrast it with simulating ambient noise and subsequently correlating the recordings. The implementation is closely related to time reversal experiments (Fink 1992): A wavefield, excited at x2 at time zero, is recorded along a closed surface. The recordings are time-reversed and injected again as source time functions along the surface. The resulting wavefield can be seen as a time-reversed version of the original wavefield, which focuses again at x2 at time zero and propagates further at positive times. The power-spectral density distribution can be used to alter the re-injected recordings, which in the context of noise correlations allows us to capture the nature of ambient noise sources and the imprint of heterogeneous noise source distributions.

Focus on body waves
Although equation (4) is the most generic expression for a correlation function assuming uncorrelated sources, the targeted focus on body-wave reconstructions in correlation functions faces additional challenges. Since ambient noise sources are typically confined to the surface of the Earth or the ocean bottom, surface waves dominate the ambient noise field. Together with the favorable conditions for the excitation of surface waves in correlations (Snieder 2004;Tsai 2009;Sager et al. 2018a), this leads to a systematic underrepresentation of reconstructed body waves (Forghani & Snieder 2010). The same limitations naturally apply to numerical simulations of correlation wavefields, which is straightforward to understand in terms of the simulation recipe. A vector source at the surface leads to stronger surface waves in the generating wavefield compared to body waves. Surface-wave contributions to the correlation source are thus stronger and the excitation of surface waves is again stronger in the correlation wavefield. On top of that, more sources on the surface contribute to the recovery of surface waves compared to body waves. Due to the combination of these effects and the absence of sources at depth, correlations between body waves are weak. In order to enable the investigation of pure body-wave interactions, we only use specific parts of the generating wavefield.  Figure 1. The first step in the simulation of a correlation wavefield is to compute the generating wavefield (top left) with a source at x 2 . In a simple, hypothetical medium two waves are excited: a fast type A and a slow type B. The generating wavefield is saved at all possible source locations (here only at one point -yellow star). The recorded wavefield is time-reversed, multiplied with the power-spectral density distribution and injected again as a source. The resulting correlation wavefield (top right) contains all available wavefield interactions: AA, BB, AB, BA. Muting type B in the generating wavefield (indicated with * ) allows us to focus on the remaining combinations AA and BA. If waves of type A are P waves and type B includes S & surface waves, AA are pure P-wave interactions and AB are interactions between P waves and S & surface waves. We show the correlation function of pure P-wave interactions in the bottom panel. Typically, correlations are interpreted in terms of an inter-station Green's function with the virtual source at x 2 . We thus also show the recording of a P wave of the generating wavefield (its negative time-derivative for an appropriate comparison -see Fichtner & Tsai (2018)). Both signals are recorded at x 1 , normalized and low-pass filtered with a cut-off frequency of 4 Hz. The blue panels window the two main arrivals in the correlation function. The 3D setup for the simulation is shown in figure 2, where we used the 1D-velocity profile A presented in figure S1. The source is located in one of the mirror positions of the receiver pair. The different nature of the correlation and the inter-station Green's function is studied in detail in section 3. the idea, let us assume a simple, hypothetical medium (figure 1) that only allows the propagation of two wave types: a fast wave of type A and a slow wave of type B. For a localized power-spectral density distribution, the full generating wavefield in equation (4) gives rise to all possible wavefield interactions in the simulated correlation wavefield: AA, BB, AB, BA, where the first letter indicates the propagation type, and the second encodes its generation type. Only using specific parts of the generating wavefield, allows us to focus on a subset of all possible wavefield interactions. For instance, if we are only interested in the interaction of type AA, we mute wave B in the generating wavefield and the resulting interactions are limited to AA and BA. The latter can either be muted or ignored given a sufficiently large source-receiver distance.
In this study, we are interested in interactions between P waves. Therefore, S and surface waves take the role of type B in the example above, and we mute the corresponding contributions in the generating wavefield. We ignore mixed interactions (type BA) and only focus on correlations between P waves. An example of a correlation function of type AA is shown in the bottom panel of figure  1. Its nature differs significantly from a type A wave in the interstation Green's function (its negative time-derivative is shown for an appropriate comparison (Fichtner & Tsai 2018)). The generation mechanisms underlying the simulated correlation function and the corresponding sensitivity to structure is studied in section 3.

Sensitivity to structure
For the computation of sensitivity kernels for structure, we follow the general approach outlined in Fichtner (2015). The correlation wavefield is simulated for the reference station at x2, a measurement with the recording at x1 is performed and the corresponding adjoint source time function drives the adjoint simulation. The interaction of the forward and the adjoint wavefield highlight the part of the model that may affect the measurement. In principle, this is the whole procedure for conventional source-receiver full waveform inversion (Tromp et al. 2005); however, for correlation wavefields, it has to be repeated with the reference station and the receiver interchanged. The first adjoint run only provides the sensitivity kernel corresponding to Gi,n(x1, ξ) and only after the second adjoint run can the full sensitivity kernel be assembled. Repeating the procedure is less efficient compared to the approach described by Sager et al. (2018b), but is the preferred strategy here to investigate body-wave interactions since it only requires the discussed changes in the forward problem. The interaction of the appropriate forward and adjoint wavefield types is achieved by their different propagation speeds, at least at distances that allow a clear separation of different wave types. We will thus restrain ourselves from interpreting the kernels in the vicinity of both stations.

UNDERSTANDING SIMPLE P-WAVE INTERACTIONS
In this manuscript, we implement and apply the presented theory in order to understand the process of pure P-wave interactions in correlation functions, focusing on localized source distributions. A plethora of studies (e.g. Halliday & Curtis 2008; Kimman & Trampert 2010) already studied the generation of body waves in noise correlations, however, typically in the context of Green's function retrieval. We explicitly leave this area of validity and investigate what kind of signals are formed in the more general case, how they are formed and what they are sensitive to in terms of structure. The forward and inverse problem for correlation wavefields is implemented based on the spectral-element solver Salvus (Afanasiev et al. 2019). In the following, we only consider correlations Czz between vertical recordings due to vertical noise sources at the surface of the Earth. Throughout this study, we separate the powerspectral density distribution S(x, ω) into a single spectrum s(ω) and a spatial distribution S(x). Using the spectrum in the simulation of the generating wavefield simplifies the preparation of the correlation source to the multiplication of the time-reversed wavefield with S(x). We take first steps in applying correlation modeling to advance our understanding of body-wave interactions. We thus choose to only investigate simple P-wave interactions in 1D-velocity models. For these scenarios, a simplistic time-distance taper is sufficient to mute all contributions slower than the direct P wave. Future studies may require more sophisticated wavefield separation techniques or windowing of specific parts that are of interest. Attenuation and complex geometries can be added thanks to the underlying numerical solver, but we focus on elastic waves in simple Cartesian domains for now.

Point source
For pedagogical reasons, we start the experiments with an extreme case: a small source distribution, close to a point source. Although computing correlation functions for spatially limited sources may not be an obvious choice in an application, the interpretation of seismic waves induced by certain source types can nevertheless benefit from an interferometric approach. Train tremors and long duration signals, for instance, are difficult to use directly for monitoring purposes (Brenguier et al. 2019;Pinzon-Rincon et al. 2021).
Correlations can reduce the complexity (similar to a source deconvolution in exploration seismics, but applied in the far field), since they focus on coherently propagating energy and act as spatial filters.
We define a 3D Cartesian mesh (see figure 2) designed for a maximum frequency of 4 Hz. A Ricker wavelet with a center frequency of 3 Hz is used for the generating wavefield. Except for the free surface, all sides are absorbing to prevent reflections from the boundaries. The receivers are placed on the surface at an interstation distance of 30 km. The spatial weights S(x) are given by a small Gaussian-shaped distribution (σ of 500 m in both x-and ydirection), located in one of the mirror positions of the receiver pair. We start with the 1D-velocity profile A shown in figure S1, and follow the simulation recipe described in section 2. We compare the resulting correlation function to the recording of the P wave in the generating wavefield (bottom panel in figure 1) to contrast our approach to Green's function retrieval. Two arrivals are observed in the correlation: before and around the P-wave arrival in the generating wavefield. The first arrival is considerably stronger.
To investigate how these signals are formed and what they see in terms of structure, we window both arrivals, measure their energy and compute the corresponding structure kernels (see figure 2, top panel). Perturbing the VP-velocity model in locations indicated by the sensitivity kernels changes the windowed arrivals in terms of the performed measurement. Here we use their specific shapes to learn which waves can be involved in the generation of the observed signals. While the first arrival captures the correlation of two direct P waves, the second arrival includes a direct P wave from the source to the nearby station 2 and a PP wave to the distant receiver 1. In addition, we observe oscillatory features at larger depths that can affect the signal in the second time window. In order to understand these contributions, it helps to interpret a structure kernel in terms of potential scatterers that excite secondary wavefields that arrive in the time window of interest. In our case it means that if a scatterer is introduced at one of the indicated locations, a secondary wavefield is excited once the direct P wave from the source reaches its location; the secondary wavefield also travels to station 1 and the accumulated traveltime is similar to the PP-traveltime. The potential scatterers are located in higher-order Fresnel zones of the direct P wave between the source and receiver 1. A cartoon summarizing all the potential contributions we observe in figure 2 is shown in figure 3. Using the adjoint source time function corresponding to a timeshift measurement leads to a different polarity of the contributions to receiver 1 and to receiver 2 (figure 2, bottom panel). Interpreting a correlation as a collection of differential traveltimes ∆t indeed confirms that a specific ∆t can be increased by either arriving earlier at station 1 or later at station 2. This leads to different signs in the velocity changes proposed by the sensitivity kernels.
In addition, we compute the correlation function and a sensitivity kernel (figure S2) for a 1D-velocity model with a constant gradient (profile B shown in figures 4 and S1). According to ray theory, the differential traveltimes of P-P and PP-P for this model are (for the considered station pair) only separated by 0.15 s instead of 0.77 s as in the previous case. This leads to only one apparent main arrival in the correlation function, and the sensitivity kernel is dominated by the interaction of two direct P waves.

Signal generation and cancellation
In order to quantify the effect of different source extents and shapes, we require a better understanding of the signal generation and cancellation process. From the viewpoint of Green's function retrieval, we expect direct P-wave interactions to cancel out for increasing source extents and to be left with only PP-P interactions from a stationary zone around the mirror point for an accurate P-wave reconstruction (Snieder 2004; van Manen et al. 2006). In principle, we could compute finite-frequency source kernels within the framework of correlation modeling (Ermert et al. 2017). It would, however, again not be straightforward to decipher the contributions from different wave types. Bowden et al. (2020) discuss the similarity between correlation modeling and matched field processing (MFP) and reveal that source kernels computed with both methods are equal if a noise source model of zero power is assumed. Although the forward model in MFP can include synthetic waveforms, it is typically based on ray-theoretical (differential) traveltimes. This allows us to study the contributions of different wave  types to a source kernel and the validity is guaranteed due to the discussed connection. We implement a variant of MFP that maps out source locations that contribute to certain time windows in a correlation function, which works as follows: (i) Define a receiver pair and a 2D grid for all possible source locations on the surface of the Earth.
(ii) For each source, trace a P wave or a PP wave to receiver 1 and a P wave to receiver 2.
(iii) Generate corresponding waveforms by introducing Ricker wavelets (here with center frequency f = 1 T = 3 Hz to be close to the simulations) at the computed traveltimes.
(iv) Compute the correlation of both waveforms (or alternatively place a Ricker wavelet at the differential traveltime).
(v) To obtain the contribution of each source to a pre-defined window of length T 2 centered around the differential traveltime of interest, compute the sum of the correlation in the respective window.
(vi) Plot the value at the source location.
We use both 1D-velocity models from before and present four source kernels (figure 4) based on ray-theoretical differential traveltimes. Inspired by Pha . m et al. (2018), we additionally show the difference in take-off angles in figure S3. The comparison of both P-P kernels does not allow a straightforward identification of features common to both velocity models, but reveals that the broad region of constructive contributions (for -5 km x 5 km and y 30 km in the top panel) is due to the velocity model. A closer inspection confirms that the fastest P wave starts to dive below 5 km depth at around 30 km source-receiver distance and the difference in take-off angles is small for larger distances. As expected, the PP-P kernels consistently exhibit constructive contributions around the mirror point of the receiver pair. Plotting both kernels with a different color map (see figure S4) shows a characteristic shape in the form of an X. This shape is a general consequence of the PP-P differential traveltime curve. The stationary point for this interaction is a saddle point, which leads to the observed X-shape instead of an elliptical shape expected for a local maximum/minimum. A more detailed explanation is provided in appendix A.

Large source extent
Both imaging and monitoring applications ultimately rely on spatially well-confined sensitivity distributions related to our measurements. A spatially wide-spread sensitivity might not always be an issue, but in noisy environments, a focused sensitivity definitely helps to make weak signals emerge from random fluctuations. This is particularly true when we have a good a priori on the target we want to study, e.g. in the case of an active fault region. For this purpose, the principle of Green's function retrieval is the ideal framework as the coverage of the region of interest is controlled only by  figure 2. For the first arrival in the correlation function around 4.92 s (left) two direct P waves from the source (green star) to both stations (white squares) are involved. The generation of the expected arrival at the inter-station P-wave arrival time of 5.69 s (right) requires a PP wave to station 1 and a P wave to station 2. An alternative contribution is possible, which replaces the PP wave to station 2: a scatterer (brown star) located in a higher-order Fresnel zone of the direct P wave to station 1 (dashed line) excites a secondary wavefield; the combined traveltime from the source to the scatterer and from the scatterer to station 1 is similar to the PP-traveltime. The interpretation in terms of potential scatterer locations also holds for other interactions.  the array geometry. As shown in section 3.1, however, small source extents leave this realm and exhibit sensitivity towards the source.
In the next set of numerical experiments, we probe the area between both end-members and analyze the cancellation process of the source tail in figure 2. The goal is to develop an intuition for the extent of the source that is necessary to focus the sensitivity between two stations; alternatively, we attempt to understand how much source-side sensitivity can impact our measures depending on the extent of the source. The sizes and shapes of the sources used in the following are based on the PP-P source kernel computed for the 1D-velocity profile A. We only choose sources in the main region of the PP-P source kernel that contribute constructively and define two scenarios based on the dimension in y-direction: (1) 25 < y < 35 km and (2) 10 < y < 60 km (indicated in figure 4). All spatial weights S(x) within the defined regions have the same magnitude. The corresponding correlation functions and sensitivity kernels for traveltime shifts are shown in figure 5. Similar to the point source case, the P-P interaction in scenario 1 is dominant in the correlation function and only starts to disappear in scenario 2. In scenario 1, the PP-P interaction already leads to a good reconstruction of the phase in terms of Green's function retrieval, considerably improved to the point source case (figure 1). However, it only dominates the total contribution with an increased source extent. Although the center part of the X-shaped stationary zone in the first scenario is covered with sources, the tail is still visible. It almost vanishes in the second scenario and the sensitivity focuses between both stations. In addition, the highly oscillatory features at larger depth also start to disappear. Further tests reveal that the additional source region in scenario 2 at y > 35 km are less important for the tail to disappear than the region between 10 and 25 km. The same holds for the suppression of P-P interactions, which is in agreement with the corresponding source kernel, since scenario 2 includes some of its negative contributions.

APPLICATION TO FAULT MONITORING
Detecting aseismic deformation in the vicinity of faults using observations of temporal changes of seismic velocities is a long sought goal in seismology (Scholz et al. 1973). In previous experiments, promising observations were balanced by the difficulty of maintaining stable sources over the long-term that limits the capability of monitoring seismic velocities precisely and continuously over many years (Karageorgi et al. 1992;Niu et al. 2008;Tsuji et al. 2018). Recent studies have shown that train traffic generates strong seismic radiation . Brenguier et al. (2019) and Pinzon-Rincon et al. (2021) have used these train tremors to retrieve stable P waves using interferometry between arrays of sensors across the San Jacinto fault in California. The rational for these studies is to use station pairs in the vicinity of the San Jacinto fault to increase sensitivity to fault-zone processes. Here, we also use large distances (> 10 km) between seismic stations for interferometry in order to retrieve P waves diving down to a few kilometers depth, thus reducing the influence of environmental perturbations occurring at shallow depths. The retrieved P-wave central frequency is 5 Hz, corresponding to a wavelength of about 1 km, which is a good compromise between limiting attenuation and increasing spatial resolution. In this section, we compare modeled and observed P waves from train correlations at sensors located near the San Jacinto fault. Two examples are presented: The first station pair, II.PFO-AZ.FRD represents an ideal scenario in which the train sources cover part of the stationary phase zone; for the second station pair, II.PFO-AZ.CRY this requirement is relaxed (see figure 6-E). For data selection and processing we follow a strategy similar to Pinzon-Rincon et al. (2021). We use station CI.IDO located near the railway to time trains as they are passing by the study area (Brenguier et al. 2019). A 12 min-long time window centered around each detection time is selected. We take the seismograms recorded at the target stations (vertical component only) and perform a crosscoherence measurement in the selected time windows. We perform the analysis over ten years of seismic recordings from 2010 to 2020 and average all the correlation functions obtained. For the synthetic correlation computations, we use the train timings to construct the power-spectral density distribution and slightly modify the 1D-velocity profile A used in the previous sections (slowed down by 2.5%) to improve the general waveform fit. The simulation mesh is designed for a maximum frequency of 6 Hz and both observed and synthetic correlations are filtered between 4 and 6 Hz. Due to the larger distance of AZ.CRY to the train track compared to AZ.FRD (approx. 6 km difference), we only observe the separation of P-P and PP-P interactions for II.PFO-AZ.CRY (figure 6-A,C). We window the arrivals of interest in both cases and compute data-independent sensitivity kernels to get insight into the spatial sensitivity pattern ( figure 6-B,D). Both kernels are dominated by two direct P waves to the respective stations and the PP-P interactions are considerably weaker (only visible with a stronger clip of the color-scale). Since the station pair II.PFO-AZ.CRY is not perfectly aligned with the train, complex patterns emerge in the combined sensitivity kernel. For both cases we see that there is a strong mismatch between our observations and the Green's functions. The latter would only show sensitivity in the area between the stations and not in the region between the source and the stations. Thus, with a single station pair, one cannot unambiguously infer that a transient velocity perturbation must lie between the two stations. More observations with different station pairs is required to locate possible velocity changes.

DISCUSSION
Defining and implementing a consistent forward problem for correlation wavefields allows us to study more than the effect of different source configuration on correlation functions (e.g. Halliday & Curtis 2008; Kimman & Trampert 2010) by including (source and structure) sensitivity kernels in the analysis. Their tangible nature helps us to train our intuition about correlation wavefields and they provide valuable insights. Changes are, however, required to focus on correlations between body waves. Without the presented modifications, the signals are dominated by interactions involving surface waves, which can lead to misinterpretations. For instance, the source kernels for potential S waves in figure 4 by Sager et al. (2018a) do not show the expected stationary zone because the kernel is dominated by interactions between direct S waves and different surface-wave modes. In principle, pure P-wave interactions can also be studied using an acoustic approximation. The difference in the underlying physics is, however, crucial, especially with respect to the targeted application in terms of monitoring small-scale effects (Cance & Capdeville 2015). The current implementation furthermore allows us to study interactions between P and S waves in the future. The trend in ambient noise seismology towards known and localized sources is promising and necessary in order to go beyond conventional techniques, but an interpretation in terms of Green's function retrieval becomes increasingly concerning. For instance, the PP-P interactions reconstruct the phase of the Green's function reasonably well in the first scenario presented in figure 5, but the source tail is nevertheless still significant, which might be underestimated if interpreted with the mindset of the principle of Green's function retrieval. The nature of the signal computed for the 1Dvelocity profile B could lead to the conclusion that it is the reconstructed P wave involving PP-P interactions. The VP-velocity would then mistakenly be made faster in order to get a good fit between the Green's function and the correlation. Future studies should keep in mind that covering a small part of the stationary zone at the surface is not sufficient. The presented results promote the idea that correlations should be treated as self-consistent observables. Localized sources are far from requirements necessary for Green's function retrieval. These situations are, however, extremely useful for monitoring purposes.
Focusing on sources of known origin and extent can reduce sourcestructure trade-offs inherent in noise correlation functions (Fichtner 2015;Sager et al. 2018b). However, the interpretation of correlation observations in this case requires a different mindset and correlation modeling is a promising approach.
Modeling body-wave interactions in correlation functions is a relatively expensive and tedious endeavor. If one were only interested in correlation waveforms, database approaches would be a good alternative ) and the required modifications can be included easily. Our current framework provides insights into the reconstruction of body waves in correlation functions and is employed to study specific observations in the context of monitoring the San Jacinto Fault. In general, combining observed data and simulations has several potential benefits, e.g. facilitating/guiding interpretations, testing of different hypotheses that might explain the observations and initiating/accelerating algorithmic developments. The presented developments are therefore worthwhile contributions that deserve further research efforts. A promising direction for further application is to extend the proposed framework to correlation-based deep Earth seismology, either using ocean microseism (Boué et al. 2014;Retailleau et al. 2020) or late earthquake coda waves (Pha . m et al. 2018;Tkalčić et al. 2020;. For microseism sources observed at teleseismic distances, we expect the problem to be similar to the case presented above using train traffic, i.e. with a source extension smaller than the classically considered stationary phase area and relatively simple raypaths (P, PP, PKP-type waves). For earthquake coda interferometry, the body-wave interactions of interest are considerably more complex, but computing the sensitivity of such interferences could be critical to improve our insight about the deepest structures of the Earth and other planets . We also emphasize here the connection between this study and the more classical two-station methods (de Vos et al. 2013), developed for surfacewave applications. For this study we chose to only investigate scenarios in the context of ambient noise with sources at the surface of the Earth and have ignored horizontal force directions and moment tensor sources. Sources at depth are known to play a crucial role in the exact reconstruction of body waves (Forghani & Snieder 2010;Kimman & Trampert 2010), but are beyond the scope of this study.
Regarding the theoretical foundation of this study, there are two points to keep in mind. Firstly, the theory presented in section 2 is only valid for linear processing. In order to make sure that processed correlations satisfy the wave equation, the applied processing should be kept linear (if possible) (Bowden et al. 2015) or can be linearized following the approach introduced by Fichtner et al. (2020). Secondly, we acknowledge that moving trains are correlated sources and thus do not fulfill the assumption invoked to arrive at equation (4). Solving equation (3) instead is costly, but might be feasible, since trains are constrained to specific locations in space. Setting up the power-spectral density distribution can then benefit from progress made in train traffic modeling . Ayala-Garcia et al. (2021) discuss the effects of correlated sources and present a new stacking strategy to mitigate them. Addressing the problem with a different processing scheme is compelling and will be tested in the future.

CONCLUSIONS
The presented approach for modeling P waves in noise correlations is capable of reproducing observations made in the context of monitoring the San Jacinto Fault using train induced seismic waves. The reconstructed signals strongly diverge from inter-station Green's functions and thus require an alternative interpretation. In most of the considered cases the sensitivity to 3D Earth structure is dominated by interactions between direct P waves to the respective stations and exhibits significant contributions in the source direction beyond the station-station region. Instead of trying to fulfill the requirements necessary for Green's function retrieval we advise practitioners to treat correlations as self-contained observables and apply the modeling techniques developed here to guide the analysis.

ACKNOWLEDGMENTS
We are grateful to Yehuda Ben-Zion  ) and for valuable input. Computing resources were provided by the Institute of Geophysics at LMU Munich (Oeser et al. 2006) and the Leibniz Supercomputing Centre (LRZ, project no. pr63qo on SuperMUC-NG).

DATA AVAILABILITY
Data used in this manuscript are publicly available at IRIS. The facilities of IRIS Data Services, and specifically the IRIS Data Management Center, were used for access to waveforms, related metadata, and/or derived products used in this study. IRIS Data Services are funded through the Seismological Facilities for the Advancement of Geoscience and EarthScope (SAGE) Proposal of the National Science Foundation under Cooperative Agreement EAR-1261681.

APPENDIX A: GENERALITY OF X-SHAPED SOURCE KERNEL FOR PP-P INTERACTIONS
The X-shaped source kernel for PP-P interactions can be understood as being a general consequence of the PP-P travel time curve.
For any 1D Earth model, we may define τ (θ) to be the travel time between any 2 points on the Earth's surface separated by distance θ (e.g. either angular distance on the sphere, or linear distance on the plane). We then consider any branch of the travel time curve where τ (θ) is monotonically increasing, dτ /dθ > 0, and with a monotonically decreasing derivative, d 2 τ /dθ 2 < 0. On the Earth, travel time curves for most 1D Earth models (e.g., ak135, iasp91) satisfy these 2 assumptions for direct P and S body wave arrivals at distances of less than 120 degrees. For example, it can be shown that when velocities increase with depth, then both criteria are generally satisfied; moreover, travel times on a sphere satisfy the assumption even for a homogeneous velocity model. To understand the source kernel shape for PP-P, we initially assume the source (S) and 2 stations (A and B) lie along the same great circle (colinear) and solve for the stationary phase criterion. Assuming A-S distance of θ and A-B distance of θAB, then the PP-P differential travel time is ∆τ = 2τ (θ/2) − τ (θ − θAB). Taking a derivative with respect to θ and setting to zero gives the stationary phase point d∆τ dθ = dτ (θ/2) dθ − dτ (θ−θ AB ) dθ = 0. Assuming θ > θAB, the monotonically decreasing derivative implies that the arguments of the two terms must be equal, i.e. θ = 2θAB. Unsurprisingly, the PP-P stationary point is at exactly twice the stationstation distance. Further calculating the second derivative gives d 2 (∆τ )/dθ 2 = − 1 2 d 2 τ (θAB)/dθ 2 > 0, implying that θ = 2θAB is a local minimum and the only extrema in the θ > θAB range.
Moving the source S off of the great circle path by an amount φ, then the first term of ∆τ is unaffected and the second term is strictly larger for any non-zero φ. Thus, at the stationary phase point, d∆τ /dφ = 0 and d 2 ∆τ /dφ 2 < 0, i.e. the stationary phase point is a local maximum with respect to φ. Since the point is a minimum with respect to θ and a maximum with respect to φ, θ = 2θAB is a saddle point (and the only extremum in the θ > θAB range). This saddle point behavior implies that the source kernel is X-shaped rather than the elliptical shape that one would expect for a local maximum or local minimum type of stationary phase point. Correlation function and sensitivity kernel for the 1D-velocity profile B with a constant gradient (figure S1). The correlation function (red line) is windowed (blue box) and the corresponding adjoint source time function for time shift measurements drives the adjoint simulation. The sensitivity to VP is dominated by two individual direct P-waves, from the source (green circle) to each receiver (white squares). In addition, we show the (negative time derivative) of the Green's function (red dashed line). figure 3 in the main manuscript. In addition to the velocity models (left) and the source kernels (second and fourth column), we show the differences in take-off angles (third and fifth column). Figure S4. The PP-P interactions shown in figure 3 of the main manuscript and in figure S2 are plotted with a different color-map to highlight the X-shape nature of the stationary zone. While the velocity gradient changes at 5 km depths for the left figure, it is constant for the right panel.