Assimilation of GNSS Reflectometry Delay-Doppler Maps into Global Ocean Surface Wind Analyses

Direct remote sensing observations (e.g. radar backscatter, radiometer brightness temperature, or radio occultation bending angle) are often more effective for use in data assimilation (DA) than the corresponding geophysical retrievals (e.g. ocean surface winds, soil moisture, or atmospheric water vapor). In the particular case of Global Navigation Satellite System Reflectometry (GNSS-R), the lower-level delay-Doppler map (DDM) observable shows a complicated relationship to the ocean surface wind field. Prior studies have demonstrated DA using GNSS-R wind retrievals produced from DDMs. The complexity of the DDM dependence on winds, however, suggests that the alternative approach of directly ingesting DDM observables into DA systems, without performing a wind retrieval, may be beneficial. We demonstrate assimilation of DDM observables from the NASA Cyclone Global Navigation Satellite System (CYGNSS) mission into global ocean surface wind analyses using a two-dimensional variational analysis method. Bias correction and quality control methods are described. 
Several models for the required observation error covariance matrix are developed and evaluated, concluding that a diagonal matrix scaled with DDM magnitude performs as well as a fully populated matrix empirically tuned to a large ensemble of CYGNSS observation data. 10-meter surface winds from the European Centre for Medium-Range Weather Forecasts (ECMWF) operational forecast are used as the background. Collocated scatterometer (ASCAT, OSCAT) winds are considered the truth for comparison. Results using one month (June 2017) of data show a reduction in the root-mean-square error (RMSE) from 1.17 to 1.07 m/s and bias from -0.14 to -0.08 m/s for the wind speed at the specular point. Within a 150-km-wide swath along the specular point track, the RMSE was reduced from 1.20 to 1.13 m/s. Wind speed results from DA show smaller RMSE and bias than other CYGNSS wind products available at this time.

2) As a lower-level measurement, the full DDM contains more information on the ocean reflections over a larger region of the 54 glistening surface than a wind speed retrieval estimated from only a few samples around the specular point.  (ECMWF) model. DA performance is assessed by comparison with collocated scatterometer winds. We will show two benefits 70 of DDM assimilation: 1) A positive impact on the NWP analyses over a swath at least 150-km wide; 2) Lower error than wind 71 retrievals (e.g., CYGNSS Level 2 products) when interpolated to CYGNSS specular points. 72 The outline of the paper is as follows. The DDM measurement is introduced in section 2. The DA method is presented in The GNSS-R DDM is formed by first cross-correlating the reflected signal with a model of the transmitted signal over a range of 79 delays, τ, and Doppler frequencies, f , producing a complex function, X (t , τ, f ). The power of this complex voltage signal is 80 then incoherently averaged to reduce the speckle noise. Samples at (τ, f ) known not to contain signal (shorter delay than that 81 through the specular point) are used to estimate the noise floor, Y n , which is is subtracted from the average, giving 82 need to be discarded. An empirical method is applied to select informative DDM samples. Only samples with power magnitude 92 larger than 10% of the peak DDM power are selected for use in DA. The informative samples of the DDM in Figure 1(a) are 93 shown in Figure 1(b). All K of the informative samples of the DDM at one time, t , are grouped into a vector 94 which will be used as the observation in DA. The VAM finds the optimal field of wind vectors, x, that minimizes a cost function composed of three terms: J b , representing the difference between the wind field and the background, J o , representing the difference between the wind field and the observation, 6 HUANG ET AL. explained in Hoffman et al. (2003). Rather than assimilating all DDMs in one cost function, the DDMs on each CYGNSS 114 specular point track are assimilated sequentially to reduce the computation and memory cost. One DDM will be assimilated at a 115 time and the analysis wind field will be updated after processing each DDM until all observations within the DA cycle have been 116 assimilated. The observation error covariance matrix, R, presents the errors and correlations of all DDM samples at the same 117 time. Characterization of the matrix R will be given in section 4.
where t m is the time of the m-th DDM; K m is the number of informative samples of the m-th DDM; h i (x, t m ) is the power of the 157 i-th modeled DDM sample at time t m , computed from the background wind field using the forward operator. 158 When assimilating DDMs on the track, each modeled DDM from the forward model is multipled by the scaling term Φ, 159 such that the cost function (5) becomes 3.4 | Quality control 161 The following quality control (QC) tests are applied to filter CYGNSS Level 1 DDM observations before DA.

162
1) The netCDF variable "quality_flags" values in the CYGNSS L1 data are required to be zero. This discards cases in which 163 the observation is over or close to land, the spacecraft has attitude rotation larger than 1 • , the transmitter power has a high 164 uncertainty or there are some calibration issues. 165 2) All data with signal-to-noise ratio (SNR) less than 3 dB are discarded. Small SNR indicates high noise power, making it 166 difficult to extract informative DDM samples.

167
3) All data with incidence angle larger than 60 • are discarded. DDMs observed under large incidence angle can have a 168 glistening zone larger than 120 km × 120 km, which cannot be modeled accurately by the forward operator. The QC tests are summarized in Table 1 The observation Y is a vector assumed to follow a Gaussian distribution by the central limit theorem as it is an average value  We have observed that ρ has a dependence on wind speed. An empirical parametric model for the dependence of the correlation 239 coefficient on wind speed is assumed to take the form of where u is the background wind speed at the specular point. beyond its optimal value, the RMSE increases dramatically due to overfitting. Therefore, if the optimal λ ddm cannot be precisely 358 decided in an experiment, it is generally preferable to use a smaller one.   3) The reconditioning method used to decrease the large condition number of the non-diagonal error covariance matrix could 378 add extra noise to the DA process, counteracting the benefit of additional information contained in the off-diagonal elements. 379 It is valuable to note in Figure 5, that the RMSE for R-model increases more slowly than the RMSE for R-scale when 380 λ ddm increases beyond its optimal value. This means that results from using R-model would be less sensitive to the choice of 381 λ ddm . One possible reason for this effect could be that the performance of DDM assimilation is mainly dependent on the error 382 variances of DDM bins near the specular point and the weight λ ddm . So if λ ddm is selected to accurately correct the observation 383 error covariance, the result is not sensitive to the method computing the covariance matrix. Whereas, if λ ddm is not optimal, 384 more accurately estimated covariances of DDM bins away from the specular point (from R-model) could mitigate the effect of 385 sub-optimal weighting. 386 Our conclusion is that the three DDM error covariance matrices should give similar results when the optimal λ ddm is selected. 387 For the remainder of this study, the covariance matrix R-scale with its optimal weight will be applied, due to its simplicity. 388 The non-diagonal covariance matrix R-model accounting for error correlations in the DDM could be valuable if DDMs are 389 assimilated into DA systems at mesoscale or smaller spatial scales, e.g., a regional weather forecast model.   and NOAA CYGNSS wind product apply a track-wise correction on the retrieved wind speeds using referenced NWP models. 431 Wind speeds in the three products retrieved from the same CYGNSS Level 1 product for the one month of data in this study are 432 compared to collocated SCAT winds. Note that all the three products apply some additional QCs and the NOAA CYGNSS wind 433 product implements 25-km gridding along the track. Therefore, there are fewer collocated wind speeds from these three products 434 (especially in the case of the NOAA product) than the number of CYGNSS Level 1 observations used in the DDM assimilation. 435 RMSE and bias of all four products are compared in Table 5. The wind speeds from ECMWF-CY-DDM are shown to have   436 smaller RMSE and bias than any of the other CYGNSS products. Another advantage of the DDM assimilation is that a wind 437 direction is assigned to each specular point, which might be beneficial to DA systems. of the results, has been presented. A track-wise bias correction scheme was found to be necessary. The best results were obtained 455 using a simple diagonal observation covariance matrix combined with optimal selection of the cost function weights. However, 456 we did find a lower sensitivity to the observation weight when a non-diagonal covariance matrix was used. Our explanation for 457 this effect is that the observation weight can counteract an inaccurate covariance matrix and the small-scale information in the 458 error correlations is smoothed out by the constraint terms in the VAM. For some applications, such as regional forecast models, a 459 full observation covariance matrix accounting for correlation between delay-Doppler coordinates may be beneficial. 460 We demonstrated our approach on one month ( forecast is the subject of a future study. We found that improvement was mostly limited to wind speeds below 15 m/s, however, 470 probably due to the decrease in the sensitivity of DDM observations to higher winds. 471 Wind vectors interpolated to the CYGNSS specular points can also be treated as a Level 2 product. These were compared to 472 wind speeds from other CYGNSS wind products (Level 2, Level 2 CDR and NOAA). The RMSE and bias of ECMWF-CY-DDM 473 wind speeds were found to be lower than those of these other three products, as compared to scatterometer data.   • The range of wind speeds for each batch is less than 10% of the average wind speed for the batch. This is to confirm that the 688 wind speed almost remains the same during the time of a batch, in case there is a high variational wind condition. 689 In contrast to the QC approach defined in section 3.4 for DA, we did not set requirements on the relative power difference or 690 correlation coefficient. A total of 119193 DDM batches in June 2017 passed these QC tests. 691 The contribution of thermal noise was assumed constant in time and independent of the delay-Doppler coordinate. An 692 average of the sample variances for the first two rows (assumed not to contain and reflected signal) was used to compute a value 693 ofσ 2 n = 9.576 × 10 −38 W 2 . 694 The sample variance for the i-th delay-Doppler coordinate of the DDM,σ 2 i was computed for each batch as well. The 695 thermal noise contribution was then subtracted to produce an estimate of the speckle contribution to the standard deviation, 696σ i ,s = σ 2 i −σ 2 n .
(16) Figure 9 shows scatterplots for the speckle noise contribution,σ i ,s vs. the DDM magnitude from all batches for two different