NIRVP: a robust structural proxy for sun-induced chlorophyll fluorescence and photosynthesis across scales

Max Planck Institute for Biogeochemistry, Hans Knoll Straße, 10, D-07745 Jena, Germany International Institute for Earth System Sciences, Nanjing University, Nanjing, Jiangsu 210023, China. Jiangsu Provincial Key Laboratory of Geographic Information Technology, Key Laboratory for Land Satellite Remote Sensing Applications of Ministry of Natural Resources, School of Geography and Ocean Science, Nanjing University, Nanjing, Jiangsu 210023, China.

clumping 18,19,[22][23][24] . For ΦF, most work has been conducted at the leaf-level. Although leaf-level ΦF responds to environmental stress 2,25,26 , ΦF typically varies relatively little compared to the photochemical quantum yield of photosystem II (ΦPSII) both at diurnal and seasonal time scales 27,28 . Evergreen needleleaf forests are the exception to this pattern, with considerable seasonal ΦF variations due to physiological downregulation of photosynthesis during extended cold periods in winter 29,30 .
Based on current knowledge, canopy-level SIF variations are expected to be driven predominantly by APAR and fesc rather than ΦF, at least in the absence of strong environmental stress and ecosystems other than evergreen needleleaf forests. At sub-daily time scales, PAR is the dominant driver of SIF given the typically much smaller diurnal variations in the fraction of absorbed PAR (fPAR), fesc and ΦF 22,27,31 . At seasonal time scales, APAR and fesc appear to show stronger variations compared to ΦF 22,23 . If indeed APAR x fesc is strongly correlated to SIF in most cases (see Eqns. 1-4 in the Methods section), it could be used as a structural proxy for SIF as recent studies have derived approaches to estimate fPAR x fesc from widely available reflectance observations 19,32 . In particular APAR x fesc can be approximated by NIRVP 19,22 , which is defined as the product of the near infrared reflectance of vegetation (NIRV) 33 and incoming PAR (see Methods).
Despite recent findings on canopy structure effects and fesc that seem to support the dominance of APAR x fesc in variations of SIF 22,32,33 , a comprehensive evaluation to what degree NIRVP and other related structural SIF proxies explain the variations of canopy SIF has not been reported. In particular, SIF is commonly compared to vegetation indices such as the enhanced vegetation index (EVI) 34 without considering PAR 33,35,36 , which does not permit a meaningful direct comparison in cases where spatial and/or temporal variations of PAR are important. Furthermore, previous studies have not evaluated the spatial and temporal patterns of SIF and NIRVP separately, but have instead focused either on long time and large spatial scales 33 or examined only a few crop sites 22,32 .
Since one of the major goals of SIF research is to improve remote sensing based GPP estimation 1,7 , it is important to also compare the performance of SIF and NIRVP for this purpose. Several recent studies demonstrated the promise of using NIRV and NIRVP to estimate GPP across multiple spatial scales 33,37-39 . However, direct comparisons of SIF and NIRVP for GPP estimation using observations from the same sensors (or at least from the same viewing geometries) have been limited to a small set of crop sites 22,40,41 .
When using observations from different satellites, differences in viewing geometry, overpass time and other factors related to data acquisition can affect results in addition to inherent differences between SIF and NIRVP. Therefore, there is the need to directly compare SIF and NIRVP regarding GPP estimation across different ecosystems at the site-level and, using the same satellite sensor for both variables, at the global scale.
In this study, we conducted i) a comprehensive evaluation of the relationship between NIRVP and far-red SIF from plot to global scales and ii) a detailed direct comparison of far-red SIF and NIRVP for GPP estimation with a strong focus on the global scale. We used a unique collection of observations that covers the majority of plant functional types and a wide range of temporal and spatial scales. In particular, this data collection includes observations from tower-based, airborne and satellite instruments with temporal resolutions ranging from half-hourly to monthly and spatial resolutions between 1 m and 50 km.

Results
Tower-based measurements. At the site-level, we found very strong linear correlations between SIF and NIRVP at the seasonal time scale (Fig. 1). The squared Pearson correlation (R 2 ) between SIF and NIRVP for individual sites ranged from 0.73 to 0.94 for half-hourly and 0.79 to 0.96 for daily data (Figs. S1a).
NIRVP performed slightly better than or similar to FCVIP and EVI2P (see Methods) for individual sites except for the two corn sites, where either FCVIP or EVI2P showed slightly stronger correlations (Fig. S2).
NIRVP more clearly outperformed NDVIP, APAR and NIRV alone (Fig. S2). When combining the observations from all sites, the SIF-NIRVP correlation was high (R 2 = 0.86 and 0.90 for half-hourly and daily data, respectively; Fig. 1b) and considerably stronger than the APAR-NIRVP correlation (R 2 = 0.74 for both 30 min and daily data; Fig. 1c). The slopes of the SIF-NIRVP relationship differed somewhat between sites (Fig.   S1a). Regarding the correlations of SIF and NIRVP to GPP, two previous studies 22,40 already reported higher or similar correlation of NIRVP for the rice, wheat, soybean and the two corn datasets we used. In addition, we found that NIRVP outperformed SIF also for the grassland site (R 2 = 0.60 vs. 0.49 for half-hourly data;   Airborne measurements. At the landscape scale, the SIF-NIRVP relationship was predominantly linear for airborne SIF retrievals at 1 m resolution in a crop scene (R 2 = 0.86) and NIRVP captured SIF variations within and between larger crop fields well (Fig. 2a,c). Furthermore, the SIF-NIRVP relationship was strong and linear (R 2 = 0.89) when selecting a subset of the flight line covering mostly small-scale plots where phenotyping experiments are performed and therefore larger variations in leaf physiology are expected (Figs. 2b,c). We observed a slight tendency towards saturation of NIRVP at high SIF values (Fig.   2a,c). NIRVP had comparable correlation to SIF as FCVI, while EVI2 showed a slightly weaker correlation and NDVI showed strong saturation and only moderate correlation (R 2 = 0.63; Fig. S2a,b).  Satellite observations. At the global scale, we found strong temporal correlations between SIF and NIRVP (Fig. 3). The temporal SIF-NIRVP correlation of 8-day composite data at 0.05 degree spatial resolution was very strong (R 2 >0.80) over large areas, especially in the northern hemisphere and for deciduous forests and crops (Figs. 3a). The temporal regression slopes showed spatial variations (Fig. S3a).
When spatially averaging global data for each plant functional type (PFT), SIF and NIRVP showed high temporal correlations, particularly for deciduous forests, evergreen needleleaf forests, crops and shrubland (R 2 ≥ 0.9, Fig. 3b). For evergreen broadleaf forests, the correlation was lower (R 2 = 0.62), but this was mostly due to the smaller seasonal variations of SIF which were well captured by NIRVP both globally and in the Amazon (Fig. S4a).
Apart from the temporal correlations, we found strong spatial correlations between SIF and NIRVP at the global scale (R 2 = 0.78, Fig. 4a,c). In particular, NIRVP captured spatial SIF variations very well in North America where SIF shows very high values in the US Corn Belt (R 2 = 0.84, Fig. 4a,c) and the part of Eurasia that shows a band of high SIF values (R 2 = 0.77; Fig. 4b,c). The spatial and spatio-temporal SIF-NIRVP correlations were high throughout the growing season in Europe, the US Corn Belt and globally (mostly R 2 around 0.8 at 0.05 degree resolution; Figs. 5, S4), while monthly spatial correlations were somewhat lower for East Asia after the onset of the Monsoon (Fig. S6c, e). The spatial correlation for July and the spatio-temporal correlation of SIF vs. NIRVP over the growing season showed indications of a weak nonlinearity towards high values at the global scale (Figs., 4c, 5c). Spatial regression slopes differed between PFTs and partly also showed considerable seasonal variations (Fig. S5, S6). NIRVP had stronger spatial and temporal correlations with SIF than other variables that have previously been reported to be good SIF proxies such as APAR 14,42 , EVI2P, and FCVIP 32 (Fig. S7). Both for TROPOMI SIF retrievals and the machine learning product CSIF 43 , NIRVP had the highest spatial and temporal correlations to SIF followed by EVI2P, FCVIP, APAR and NDVIP (Fig. S7). The performance rankings were consistent for spatial and temporal correlations, but performance differences between structural SIF proxies were larger for the spatial correlation. APAR and NDVIP had comparable temporal correlations but showed differences for the spatial correlation.

Relationships to GPP.
As SIF is commonly used as a proxy for GPP at large scales, we also evaluated the relationships of TROPOMI-based SIF and NIRVP to FLUXCOM GPP 44,45 at the global scale (see Methods). We found that NIRVP was highly correlated to GPP and clearly outperformed SIF for both spatial, temporal and spatio-temporal correlations (Fig. 6). Temporal aggregation led to larger increases in the correlation to GPP for SIF than for NIRVP (Fig. 6). This was apparent for both spatial and temporal correlations but was more pronounced in the spatial case. The pattern of increasing correlation between SIF and GPP closely mirrors that of the corresponding SIF vs. NIRVP relationship, especially in the temporal domain (Fig. 6c). The correlation and slopes for GPP relationships across PFTs were very similar for SIF and NIRVP (Figs. S3b,c; S8a ) but the variability in slope was considerably larger for SIF than for NIRVP (Fig. S8b).
NIRVP showed the strongest correlation to GPP among structural SIF proxies and the patterns of correlation strength to GPP were similar to those for SIF (Fig. S7).

Discussion
Overall, our analysis shows that NIRVP is a robust structural proxy of far-red SIF across ecosystems, spatial and temporal scales and instrument platforms (Figs. [1][2][3][4][5]. In particular, the SIF-NIRVP relationship holds at very high spatial and temporal resolutions (Figs. 1 and 2), across different crop management regimes (e.g. irrigated vs. rainfed), at a Mediterranean grassland site that experiences very dry conditions ( Fig. 1), and for a series of phenotyping plots where physiological variations are expected to be important ( Fig. 2b,c). Our results indicate that the physiological component in SIF, i.e. the ΦF term, varies little compared to the canopy structure and radiation component of SIF, i.e. APAR x fesc, and that this structural component is well captured by NIRVP (see Methods). More specifically, ΦF appears to not only vary considerably less than APAR, which has been previously established 14,21 , but ΦF seems to also vary considerably less than fesc (Figs. 1, S1) 22,23 . Therefore, ignoring variations in fesc 46 can lead to a misinterpretation of discrepancies between APAR and SIF 22 , as those discrepancies might be wrongly attributed to ΦF. Results from a process-based model (Fig. S9) further confirmed our observation-based findings and demonstrates that our interpretation of SIF being dominated by canopy structure and radiation is consistent with the current theoretical understanding of SIF 27,28 in addition to being supported by empirical evidence from previous studies 22,23,32,33 . The predominance of structure and radiation in explaining spatio-temporal variations of SIF implies that many research applications that use SIF directly, i.e. without extracting or considerably enhancing its small physiological component, could instead use NIRVP without large reductions in performance, and potentially even some improvements 12 .
We found that NIRVP is also a robust structural proxy for GPP and, somewhat unexpectedly, NIRVP outperformed SIF in estimating global GPP at different spatio-temporal scales (Figs. 6, S1d). From a theoretical standpoint, the shared structure and radiation components of SIF and NIRVP 19 , combined with relatively small variations in ΦF 22 , mean we should expect similar performance of SIF and NIRVP for GPP estimation supposing ideal, non-noisy signals. However, in contrast to NIRVP, which tends to have high signal quality, SIF is known to be affected by considerable retrieval noise 3,4 (Table 1). Several aspects of our results support the interpretation that differences in signal quality might explain the weaker GPP estimation performance of SIF compared to NIRVP. First, the temporal aggregation results (Fig. 6c) suggest that SIF suffers from considerable noise while NIRVP and FLUXCOM GPP both have much higher signal quality. Noise is reduced with temporal aggregation and correlation strength is expected to considerably increase as a consequence, which matches the observed patterns of correlations involving SIF. For the correlation between NIRVP and GPP, however, the increase with temporal aggregation was much less pronounced, especially for the spatial correlation. Second, the close relationship between PFT-level patterns of slopes and correlation in the SIF-GPP and NIRVP-GPP relationships indicate strong similarities between SIF and NIRVP despite the differences in overall correlation (Figs. S3b, S8a). In our findings, only the much larger coefficient of variation in the SIF-GPP slopes compared to the NIRVP-GPP slopes aggregated to PFT level (Fig. S8b) cannot be easily explained by invoking SIF retrieval noise and therefore other potential reasons for these differences should be investigated. Our results on the direct comparison of SIF and NIRVP for GPP estimation confirm and considerably extend previous findings from site-level studies 22,40,41 . Among other things, we confirmed that previous site-level results showing better GPP estimation performance of NIRVP compared to SIF in crops 22,40 also held in a Mediterranean grassland site which experiences drought (Fig. S1d). However, since NIRVP is entirely based on structure, it is expected to not fully capture fast physiological responses of photosynthetic light use efficiency to short-term drought and heat stress where canopy structure is relatively stable. Despite the robust SIF-NIRVP correlation across a wide range of spatial and temporal scales, we found differences between the two variables. Conceptually, such differences could be due to differences in data acquisition and processing of SIF and NIRVP, e.g. due to the use of different sensors, limitations in signal quality of either variable, limitations of NIRVP as an approximation of the structure and radiation component of SIF, or the physiological variations in SIF which are not captured by NIRVP. We tried to minimize differences in data acquisition and processing by using observations from the same sensors and found that this clearly improved the SIF-NIRVP relationship for airborne and satellite data (Fig. S2b, S7b).
Several of our results indicate that the lower signal quality of SIF is an important factor explaining observed differences between SIF and NIRVP. In addition to the GPP estimation results discussed above (esp. Fig. 6c), relevant results regarding the retrieval noise in SIF include higher correlations of NIRVP to enhanced machine learning based SIF products compared to original SIF retrievals (Fig. S7b) and extremely strong and linear SIF-NIRVP relationships based on simulated, noise-free data (Fig. S9). Apart from the SIF retrieval noise, we found a strong effect of clouds on the SIF-NIRVP relationship and despite a rather strict filtering (see Methods) cloud contamination still affected the results, especially for East Asia during the monsoon period that starts in July (Fig. S6a) and the tropics (Fig. S5). A stricter filtering would have limited the data availability for our analysis.
Setting aside discrepancies between SIF and NIRVP related to signal quality or the use of different sensors, we found some indications of a slight saturation of NIRVP at high values of SIF for spatial and spatio-temporal relationships (Figs. 2c, 4c, 5c). However, the apparently similar saturation patterns seem to be due to different factors for the airborne and satellite data. For the airborne data, the saturation of NIRVP appears to be inherited from NDVI (Fig. S2c) S5) but differences in seasonal slope variation for a given PFT might also contribute. The saturation effects for both the airborne and satellite data were clearly reduced when aggregating to coarser spatial scales (Figs. S2d; Fig. S3 center panel vs. Fig. 5c left hand panel) and the spatio-temporal, global-scale saturation was not consistently apparent in all months (Fig. S6d). It should also be mentioned that the high level of spatial aggregation appears to partly explain the difference between low or moderate pixel-level (Fig. S3b) and very high PFT-level SIF-NIRVP correlations (Fig. 3b), especially for evergreen needleleaf forests where stronger differences are expected based on known physiological mechanisms 29,30 . Therefore, detailed analyses at finer spatial scales are needed. Such studies should consider using enhanced SIF products with higher signal quality and spatio-temporal resolution 43,47 or higher quality SIF retrievals that will be available in the future from new satellites 2,48 .
The variation of regression slopes between SIF and NIRVP deserves special consideration as it is theoretically related to variations in ΦF (see Methods, Eqn. 4). We observed considerable spatial and temporal variations of SIF-NIRVP regression slopes at large scales (Figs. S4, S3, S7). When aggregated to the level of PFTs, the regression slopes partly agree with spatial variations in photosynthetic capacity, Vcmax, especially when using a fixed intercept (Fig. S10). This is consistent with the known sensitivity of SIF and ΦF to Vcmax 49-51 and indicates the potential usefulness of combining SIF and NIRVP to extract ΦF. In contrast to the global scale, the variations in regression slopes at the site level (Fig. S1a) could mostly be caused by differences in instruments and SIF retrieval methods (see Table M1 in the Methods section) which are known to considerably affect SIF magnitude 31 . In the spatial domain, similar regression analyses as for the temporal domain could be conducted but we chose to use the simpler approximation by calculating the ratio of SIF/NIRVP instead as regression intercepts tended to vary relatively little and for the airborne data the regression approach is not applicable. At the global scale, we observed considerable temporal variations of the PFT-level SIF/NIRVP ratio (Fig. S4c) that appear to be meaningful in terms of seasonal leaf dynamics of leaf physiology. Interestingly, the temporal variations of SIF/NIRVP seem to peak earlier in the growing season compared to previously reported temporal variations of chlorophyll content 52 which is also known to covary with photosynthetic capacity 53 . This apparent discrepancy in seasonal dynamics of different remotely sensing-based proxies of Vcmax will require further evaluation, ideally on the basis of ground observations. At the landscape scale, we found considerable differences in the SIF/NIRVP ratio between crop fields based on the high-resolution airborne data (Fig. S2e) that might reflect differences in leaf physiology. Although our results appear promising regarding ΦF estimation and its potential for vegetation monitoring, further detailed investigation will be necessary to evaluate the quality of ΦF estimates and better understand its variations regarding physiological mechanisms.
We found that, overall, NIRVP showed stronger correlations to SIF compared to the other structural SIF proxies (Figs. S1c, S2a, S7a). In particular, the consistently lower SIF-APAR correlations compared to SIF-NIRVP correlations indicates an important role of fesc. At the site level, we used total APAR rather than the so called green APAR, which is the actually relevant part for SIF. However, this can unlikely explain the large differences in seasonal dynamics of APAR compared to SIF (Fig. S1a) as the differences between green and total APAR are expected to be strongest in the senescence phase of crops 54 . Regarding the better performance of NIRVP compared to FCVIP and EVI2P, we found indications that the latter two are more strongly affected by soil background in very sparse vegetation. This effect could be partly seen from site-level results where the difference to NIRVP was largest for the grassland site (Fig. S1c), which has low LAI for a considerable part of the year, but was most apparent at the global scale where soil background characteristics are most variable (Fig. S7c). Despite its better performance compared to other structural SIF proxies, NIRVP is still affected by soil background signals for very sparse vegetation which, together with lower SIF signal quality, could potentially explain lower SIF-NIRVP temporal correlations in shrublands, savanna and grassland ecosystems (Figs. 2a, S3a). Some straightforward practical strategies to further reduce the soil background signal in NIRVP have already been proposed recently 19,38 . However, although simple and robust, the NDVI-based approach to estimating the near-infrared reflectance of vegetation is limited both in terms of the degree of soil background suppression and the selection of the chlorophyllrelated, 'green' vegetation signal. To overcome these limitations, more refined approaches could be developed that might make use of hyperspectral observations to better separate the soil and (green) vegetation signals. NIRV alone, i.e. without taking PAR into account, showed strong performance at the global scale (Fig. S7b), but more detailed analyses revealed that NIRV considerably overestimates SIF in fall when vegetation is still green but PAR decreases (Fig. S4b). These findings indicate a clear advantage of NIRVP over NIRV even at longer time scales (e.g. weekly to monthly) due to seasonal variations of PAR.
It should be noted, however, that the better performance of NIRVP compared to NIRV for satellite snapshot observations may not be apparent when comparing data from different satellites due to differences in overpass times and, potentially, observation geometry that affects the atmospheric transmission of upwelling light to the sensor (Fig. S7b). Differences between NIRVP and NIRV regarding their correlation to SIF and GPP are expected to become more important when using data from upcoming geostationary satellite missions such as GeoCarb, TEMPO and Sentinel-4 2 . As canopy-level PAR cannot be directly observed from airplanes or satellites, either a simple approach that approximates PAR via the downwelling NIR radiance 22,37,40 , or more complex methods involving atmospheric radiative transfer modelling 38,55 or machine learning 56 can be used. The radiance-based approach has previously been shown to have comparable performance with direct PAR observations at the site level 22 .
All evidence indicates that SIF and NIRVP are complementary measurements. SIF offers two distinct advantages for improved vegetation monitoring. First, the characteristics of its emission and retrieval make SIF insensitive to soil (emission) and less sensitive to thin clouds (retrieval) than the reflectance or radiance measurements involved in NIRVP 57 (Table 1). Second, SIF contains unique physiological information in the form of ΦF 7,29 . ΦF is thought to explain the faster and stronger stress response of SIF compared to structural variables such as APAR, NDVI or EVI (and, in all likelihood, also NIRVP) as observed in extreme events such as drought or heat waves 21,26,35,36,58 . However, these apparent advantages of SIF can be offset by considerable practical limitations in terms of data availability, spatio-temporal resolution and signal quality 1,12,59 (Fig. 6). NIRVP, in contrast, has long term data records 39 with high signal quality and, increasingly, very high spatio-temporal resolution 33,38 (Table 1) and therefore has advantages over SIF with respect to its structure and radiation component. Apart from the individual advantages of SIF and NIRVP, they can be effectively used in combination for at least two purposes. First, the ratio SIF/NIRVP or the corresponding regression slope can be used to estimate the physiological component of SIF, ΦF 22 (see Methods and Figs. S3d, S10). As ΦF estimation amplifies SIF retrieval noise 22 , however, very high quality SIF products should be used to avoid the need for aggregation to coarser scales. Second, since SIF and NIRVP share the same structure and radiation components (APAR x fesc) and NIRVP typically has higher signal quality than SIF, evaluating the SIF-NIRVP relationship can be used to assess the quality of SIF retrievals beyond diurnal variations that are strongly driven by PAR 31 . NIRVP might therefore prove helpful in further improving SIF retrieval methods as they continue to be refined 31,60,61 .
Overall, our study demonstrates the importance of canopy structure and solar radiation for understanding variations of SIF and GPP over a large range of spatio-temporal scales. We therefore expect NIRVP, which can capture most of these variations, to be more widely applied in future research on remote estimation of GPP, crop yield modelling and other related subjects. Furthermore, making more effective use of the physiological information in SIF by extracting it with the help of NIRVP might result in improved GPP estimation and new insights on vegetation dynamics.

Theoretical framework
Canopy-level far-red SIF, which we exclusively consider in this manuscript, can be decomposed into three mechanistic components, namely the absorbed fraction of photosynthetically active radiation (APAR), the canopy escape fraction, and the fluorescence emission yield, ΦF 46 : Our study relies on the previously established result 19 that, except for very low fractional vegetation cover, fesc for far-red SIF, which we exclusively consider in this manuscript, can be well approximated in the following way: where fPAR is the fraction of photosynthetically active radiation absorbed and NIRV is the near-infrared reflectance of vegetation estimated as NDVI x NIR 33 , where NIR stands for near-infrared reflectance and NDVI is the Normalized Difference Vegetation Index 62,63 . When substituting Eqn. 2 into Eqn. 1, using APAR = fPAR x PAR, and using the definition we obtain the approximation which is the basis of our rationale that SIF can be approximated by NIRVP assuming relatively small variability of ΦF compared to the variability of NIRVP. It is clear from Eqn. 4 that ΦF can be estimated as the ratio of SIF/NIRVP or as the slope in the linear regression of SIF vs. NIRVP. Including an intercept term in the regression can account for either imperfect SIF retrieval (i.e. an offset), soil background impacts on NIRVP, or both.
We use the convenient shorthand notation introduced in Eqn. 3 also more generally for other vegetation indices (VI) that we consider potential structural SIF proxies in the way VIP = VI x PAR. Thus, NDVI becomes NDVIP etc. which is also used for EVI2 64 and FCVI 32 .

Data sets and processing
Site data. We used previously published data from a total of six sites located in South Korea (rice) 14 , China (corn 1) 65 , France (wheat) 66 , Spain (natural grassland) 16,67,68 and the United States (corn 2 and soybean) 40 . Only the rice paddy, corn 2 and soybean sites were irrigated/flooded. Tower-mounted spectrometers with sub-nanometer spectral resolution were used for SIF retrieval at all sites. Except for the wheat site that used the TriFLEX instrument 69  resolutions. An overview of all site-level and larger scale SIF datasets including information on the retrieval algorithm used is given in Table M1 below. NIRV 33 was calculated from co-located spectrometers covering the visible-nearinfrared spectral range at a lower spectral resolution as we did not observe disadvantages compared to using the same sensor as for SIF retrieval and the latter also partly did not cover the red band. NIR and red reflectance bands were calculated as averages of 600-650 nm and 800-850 nm, respectively, and used to calculate NIRV, EVI2 64 and approximate FCVI 32 as the simple difference vegetation index NIR -red. Except for the rice paddy site which had a hemispheric viewing geometry for the upwelling radiation measurements, all other sites had a narrow angle field of view at nadir. PAR data was acquired with quantum sensors. NIRVP was calculated as NIRV × PAR. APAR was measured with either quantum (wheat, corn, soy) or LED (rice) sensors above and below the canopy at all sites except the grassland site.
For the rice site, fPAR had to be gap-filled for part of the green-up part of the growing season using a radiative transfer model 14,71 . A Hampel outlier filtering (window length of 12 days, threshold parameter equals 3) was conducted for the grassland data to filter out strong outliers in the SIF time series. For all sites, data between 8 am and 4 pm local time were selected. More details on methods and instrumentation can be found in the references given in Table M1.
To investigate if part of our results on the strong SIF-NIRVP relationship are consistent with the current theoretical understanding of SIF, we used simulations with the process-based model SCOPE 72,73 for the rice paddy site. To obtain a realistic scenario, the simulations were based on in-situ observations of meteorological conditions and relevant vegetation parameters in the rice paddy site. More details can be found in a previous publication from which the simulation outputs were reused here 17 . Airborne SIF data. We used data from the high performance airborne imaging spectrometer HyPlant FLUO, which was specifically designed to be used for SIF retrieval 74,75 and has been used in the preparation for the upcoming FLEX satellite mission 48 . The 2018 dataset we used (Table M1) is based on an improved processing chain, which results in high quality SIF retrievals 60,75 . The data was acquired on June 29, 2018 at 12:30 MEST at 680 m above ground level in at the agricultural research station Campus Klein-Altendorf in western Germany (50°37′N, 6°59′E). More details can be found in the relevant references 60,75 . We exclusively used original 1 m spatial resolution data in all analyses except Fig. S2d. NIRVP was estimated from FLUO at-sensor radiance data to ensure minimal differences compared to SIF retrievals in terms of sensor and processing aspects. The approach of using NIR radiance as proxy for the product of NIR reflectance times PAR was previously introduced as NIRVR 22 and was found to show good performance in terms of correlation to SIF and GPP at the site level 22 Satellite SIF data. We primarily used satellite SIF retrievals from the TROPOMI instrument on Sentinel-5P 76 . The instantaneous data in 2018 was gridded at 0.05 degree spatial resolution. In an approach that is conceptually similar to the one used for the airborne data (i.e. NIRVR 22 ), NIRVP was estimated by multiplying the TROPOMI NIR radiance at 759 nm with MODIS NDVI, which was aggregated to 0.05 degree resolution. MODIS rather than TROPOMI NDVI was used as no surface reflectance product is currently available for TROPOMI and the red band would have to be atmospherically corrected. We used the TROPOMI NIR radiance at 759 nm, as it was provided in the original SIF data product and no atmospheric correction is necessary due to its location in an atmospheric window 76,77 . This latter aspect ensures a direct comparability with the corresponding SIF retrievals as atmospheric correction for NIR at other wavelengths could introduce biases or artefacts leading to discrepancies between SIF and NIRVP. As NIR radiance is very sensitive to clouds, we applied a cloud filtering. For this, we used the VIIRS-based cloud product 78 with a threshold of 0.35 for the cloud fraction, which can reduce both direct cloud effects and indirect effects on the validity of using downwelling NIR radiance as a proxy for PAR. In addition to the cloud filtering, data with SIF signal uncertainty (1-σ retrieval error) larger than 0.55 mW m -2 sr -1 nm -1 as well as SIF values > 4 or < -2 mW m -2 sr -1 nm -1 were excluded from the analysis. MODIS NDVI was based on daily red and NIR nadir-adjusted surface reflectance products (MCD43D62, MCD43D63). SIF and NIRVP relationships were evaluated for instantaneous data, 8-day, 16-day and monthly composites. For evaluating results per PFT, we used the MODIS MCD12C1 land cover product. For the comparison of different variables to NIRVP in terms of correlation to SIF, we relied on the same MODIS products as used for NDVI and multiplied the vegetation indices by BESS PAR 55 , which is also based on MODIS products. We approximated FCVI 32 by using the red band rather than the average of visible reflectance which is not available from MODIS. EVI2 was chosen rather than EVI as the blue band is sensitive to atmospheric correction errors and relying only on red and NIR bands permits a more direct comparison among the three indices (NIRV, FCVI and EVI2) in terms of the equation used. Previous satellite-based results showing better performance of NIRV compared to EVI 79 suggest that EVIP would not outperform NIRVP. For NIRVP, units of radiance were converted to PAR units in the same way as for the airborne data (see above).
To evaluate the effect of PAR on the SIF-NIRVP relationship, we compared NIRVP to NIRV. Since no surface reflectance product is available for TROPOMI, we normalized at sensor NIR radiance by the cosine of the solar zenith angle following previous studies 80 .
As additional test, we compared MODIS NIRVP with CSIF 43 , which is a machine learning product based on OCO-2 SIF retrievals, MODIS reflectance and fPAR products. Since the original version of the product did not cover 2018, we used the recently updated version 2 of the product. The CSIF data was available at 0.05 degree spatial and 4-day temporal resolution.
For all satellite analyses, the fixed value 0.1 was substracted from NDVI to partially account for soil background 33 . Negative values of NIRVP and NIRV were excluded as they are typically caused by negative NDVI values related to snow.

Global GPP.
To evaluate the SIF-GPP relationship at the global scale, we used the ensemble RS+Meteo FLUXCOM GPP product 44,45 , which uses machine learning algorithms to up-scale eddy covariance tower observations to the globe.
Analyses. For the slope analyses in SIF-NIRVP, and SIF-GPP as well as NIRVP-GPP relationships, we conducted linear regression with either variable or constant intercept as there were artefacts for evergreen broadleaf forest due to the distribution of the data (only high values). The constant intercept was determined by averaging the median intercept in each PFT. For the case of temporal regression, evergreen broadleaf forest was excluded from the intercept calculation.
We relied on squared Pearson correlation as the main performance metric as it is equivalent to the coefficient of determination of linear regression with an intercept and a single explanatory variable.