On the impact of urbanisation on CO2 emissions

We use a globally consistent, time-resolved data set of CO2 emission proxies to quantify urban CO2 emissions in 91 cities. We decompose emission trends into contributions from changes in urban extent, population density and per capita emission. We find that urban CO2 emissions are increasing everywhere but that the dominant contributors differ according to development level. A cluster analysis of these factors shows that developing countries are dominated by cities with rapid increases in area and per capita CO2 emissions. Cities in the developed world, by contrast, show slow growth in area and per capita CO2 emissions. China is an important intermediate case, with rapid urban area growth combined with slower per capita CO2 emissions growth. In many developed countries, urban per capita emissions are lower than the national average, suggesting that urbanisation may reduce overall emissions. However, trends in per capita urban emissions are higher than their national equivalent almost everywhere, suggesting that urbanisation will become a more serious problem in the future. An important exception is China, whose per capita urban emissions are growing more slowly than the national value. We also see a negative correlation between trends in population density and per capita CO2 emissions, highlighting a strong role for densification as a tool to reduce CO2 emissions.


INTRODUCTION
Cities are responsible for close to 70% of global CO2 emissions associated with energy consumption [1]. In North America, the proportion reaches 80%, depending upon the definition of emissions scope and urban boundary [2,3]. Furthermore, cities could add over 2 billion people this century, with global urban area tripling by 2030 [4,5]. While concern mounts over the potential lock-in of high-emitting infrastructure, many cities have also taken leadership on greenhouse gas mitigation, pledging ambitious reduction targets [6,7]. Quantifying trends in urban CO2 emissions is critical to understanding near-term urban emission trajectories. Identifying major contributions to these trends will help expose the factors driving emissions over the longer term. These drivers are points of mitigation leverage, a goal of several urban alliances such as the Global Covenant of Mayors or the C40 Cities Alliance [8].
Given the rate of urbanisation, it is important to establish whether, on average, urbanisation contributes to increased national/global CO2 emissions. This has been a source of considerable debate, with the consensus that developing cities are generally wealthier and more energy-intensive than the rural areas from which they draw their population. Thus, urban dwellers consume more energy than their rural counterparts, such that urbanisation per se drives increased CO2 emissions [9]. A countervailing view is that, beyond some stage in its development, economic growth in cities comes from low-emissions service industries, so that urbanisation will be a decreasing or even negative contributor to national/global CO2 emission trends [10]. This is a version of the Environmental Kuznets Curve (EKC) [11]. The more general statement of the EKC, that economic growth will first worsen but then improve environmental outcomes, has been both theoretically and empirically controversial [12].
To elucidate the role of urbanisation in emissions trends we would ideally like a quantitative analysis of the factors driving emissions. We could then ask whether such factors acted differently in urban and non-urban settings and during different stages of urban development. Such data are not available globally or for a long enough period for our needs. We can, however, decompose urban emissions into underlying factors and study the trajectories of different cities through the space defined by these factors. The approach is motivated by Raupach et al. [13], who performed a similar analysis for national emissions. They decomposed emissions according to the Kaya identity as a product of population, per capita GDP and the carbon intensity of the economy. They were thus able to distinguish pathways of emissions growth undertaken by regions or countries. Our decomposition must account for changes in city area and cannot include economic data since we lack this at the needed resolution. We can track the effects of changing population density and per capita emissions in the hope of revealing the variety of development pathways.
Most previous studies of trends in urban emissions have been limited in either space or time. Such studies are simplest within one country, where definitions of urban boundaries and emissions are more likely to be homogeneous. Examples include Malaysia [14], Turkey [15], the U.S. [16,17], Japan [18] and the U.K. [19]. Efforts at data harmonization, either by the researcher or by regional agencies, have allowed multi-country studies, e.g., of the African region [20,21], developed countries [22], developing countries [23,24] and Europe [25]. Some recent studies have been able to extend the domain of such studies to the globe. Wu et al. [26] used measurements from the Orbiting Carbon Observatory (OCO-2) to quantify emissions from 20 cities. While this platform removes some of the restrictions of self-reported or proxy emissions data, it limits the spatiotemporal extent for inferences and enforces a meteorological definition of city boundaries. Crippa et al. [27] used the Emissions Database for Global Atmospheric Research (EDGAR) [28] and an urban classification based on the Global Human Settlement Layer (GHSL) [29] to derive longer-term trends in urban emissions. They found that CO2 emissions had grown rapidly for large cities in emerging areas but not in high-income countries. They also showed considerable variability in per capita emissions but noted that developed countries appear to have decoupled economic growth from emissions, at least in large cities. One limitation of this study is the spatial resolution of EDGAR (0.1°) and the temporal resolution of the underlying population database (roughly five years). Here we extend the study of Crippa et al. [27] by using a higher spatial resolution (30 arc seconds) and a higher native temporal resolution (1 year). We wish to understand the contributions that bulk urban characteristics (area, population density and per capita emissions) make to trends in emissions.
To probe this question, we must separate the contributions of population growth and per capita CO2 emissions trends from total CO2 emissions growth. Such analyses require a time-series of the urban extent and CO2 emissions with global coverage and enough duration to establish trends. No direct data set allows this. In particular, integrated measures of urban emissions or energy consumption such as fuel sales or electricity flows cannot disentangle the contributions to change. There are now reliable, remotely sensed proxies of urban energy consumption or CO2 emissions which meet these criteria. By combining these with measures of urban extent and some underlying emissions contributors we can generate a global picture of the interaction of urbanisation and CO2 emissions.
Previous studies have examined the relationships between emergent and intrinsic properties of cities (such as emissions and size). Gudipudi et al. [30] used the traditional Kaya identity [31] to examine the relationship between emissions and underlying drivers. Ribeiro et al. [17] used production functions to relate parameters such as population, area and emissions. Both these studies (as with most others) use static data sets. As one byproduct of our approach we will test whether a fit to cross-sectional data (static in time) plus an index of urban development suffices to explain emission trends. Our target is the trend in emissions. Bettencourt et al. [32] noted that scaling properties derived from temporal and spatial (cross-sectional in their terminology) analyses were not equivalent.
Our study establishes these trends directly for a group of cities and examines drivers of these trends. For this we require a data set with global coverage and a considerable time-span. Such data sets have not, to our knowledge, been available before, at least using consistent definitions.
The other reason to monitor urban emissions directly is more practical. The United Nations Framework Convention on Climate Change [33] suggests monitoring the spatiotemporal variations of GHG emissions to inform international climate change policy [13,34]. Atmospheric concentration measurements combined with on-ground information are emerging as the means to best achieve the combination of accurate emissions tracking and detailed source characterization of emissions in urban areas [35][36][37].
The structure of this paper is as follows. The "Methods" section describes the analytical method, a modified form of the Kaya identity, and the data sources used in our analysis. The "Results" section describes the resulting trends in these emission contributors and summarises our major findings, a cluster analysis to place the trends in a regional context, and the placement of urban emissions within the national context to demonstrate the potential impact cities may have on future national trends. The "Discussion" section comments on the implications and caveats of the results.

RESULTS
Figure 1 shows the emission trends for the 91 cities in our analysis (trend values of each city are provided in supplementary data as Table 1). Overall, we see rapid increases in urban CO2 emissions averaging 4.7%/yr. These averages conceal considerable variability across cities, with emission trends ranging from −2.8%/yr (Madrid, Spain) to 11.0%/yr (Xi'an, China). On average, the dominant contributor to CO2 emissions growth is the change in urban area (3.5%/yr). The change in population density contributes −0.6%/yr, indicating that cities continue to sprawl as they grow. The trend in per capita CO2 emissions makes a positive contribution (averaging 2.2%/yr).

Trends
There are also significant relationships among the trends. Table 2 shows the correlations among the contributors across the 91 cities. It is important to stress that these are not temporal correlations but represent the relationship between trends in the three contributor variables across the 91-city sample.
The correlation of the area trend with the other two contributors is to be expected: cities that grow fast in areal extent see declines in population density and increases in per capita CO2 emissions. More surprising is the correlation between the two intensive contributors, population density and per capita CO2 emissions. The relationship suggests that cities experiencing declines in population density have increasing per capita CO2 emissions, providing direct evidence of the impact of changing urban form on CO2 emissions.
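As an illustration of what a cross-city (rather than temporal) correlation means, the following minimal sketch computes a Pearson coefficient between two lists of per-city trends. The trend values are hypothetical and chosen only for illustration; they are not taken from Table 2 or the supplementary data, and the paper does not state which correlation estimator was used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical trends in %/yr, one entry per city (illustrative values only).
density_trend = [-2.0, -1.0, 0.0, 1.0, 2.0]
per_capita_trend = [4.0, 3.5, 2.0, 1.0, -0.5]

# Each city contributes one point; time does not enter the calculation.
r = pearson(density_trend, per_capita_trend)
print(f"cross-city correlation: {r:.3f}")
```

A negative value of `r` here would correspond to the pattern described above: cities whose density falls tend to be those whose per capita emissions rise.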

Cluster analysis
While there is considerable spatial variability in trends and their contributors, some significant patterns emerge among classes of cities. We investigate these by performing a cluster analysis using the three contributors in Eq. 3.
Cluster analysis is a method for objectively identifying groupings in multi-dimensional data. Here we use a centroid-based technique: if each point is described by N parameters then these define its coordinate in an N-dimensional space. Clusters are defined so that the distance from every point in a cluster to its centroid (defined as the average of all the coordinates in the cluster) is less than that to the centroid of any other cluster. The number of clusters is set by the user and is generally chosen by considering the change in some metric of the analysis as a function of the chosen number of clusters. Here we use the Calinski-Harabasz score [38], which is roughly the ratio of the dispersion between clusters to the dispersion among members within a cluster. One seeks the maximum amount of information available before moving from delineating truly isolated clusters to partitioning randomly distributed points within clusters. For our analysis we use the KMeans function from the Python scikit-learn package [39]. Figure 2 shows the Calinski-Harabasz metric as a function of the number of clusters. Optimal choices for the number of clusters occur at inflections in this curve, with the segments between demonstrating partition of randomly distributed points. For our case the optimal choice is four.
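The procedure can be sketched as follows. The paper uses scikit-learn's KMeans; the block below is a dependency-free stand-in (Lloyd's algorithm with a deterministic farthest-point initialisation, which is an assumption of this sketch) together with a direct implementation of the Calinski-Harabasz score. The data are synthetic, and the sketch simply picks the score maximum, a common convention, whereas the paper reads inflections of the curve in Figure 2.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Centroid (coordinate-wise mean) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; seeds are chosen deterministically (farthest point)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = mean_point(members)
    return labels, centroids

def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion, each per degree of freedom."""
    n = len(points)
    ids = sorted(set(labels))
    k = len(ids)
    overall = mean_point(points)
    between = within = 0.0
    for j in ids:
        members = [p for p, lab in zip(points, labels) if lab == j]
        c = mean_point(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Synthetic data: three tight groups of four points each (not the paper's data).
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
offsets = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
points = [(cx + dx, cy + dy) for cx, cy in centres for dx, dy in offsets]

scores = {k: calinski_harabasz(points, kmeans(points, k)[0]) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In the paper's setting each point would be a city described by its three trend contributors, so the same functions would operate on 3-tuples.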
Henceforth we focus on our choice of four clusters. Figure 3 shows the cluster assignment for each city while Table 3 shows the cluster characteristics.
The clusters can be classified according to their overall emission trends. This yields two high-growth clusters, one intermediate and one low-growth cluster.
Cluster 1 shows a moderate positive area trend with a larger positive per capita emissions trend. It is dominated by cities in the developing world, mostly the Asian subcontinent. Cluster 2 exchanges these contributions with the area trend contributing more than the per capita emission trend. It is dominated by Chinese cities. Clusters 3 and 4 show similar area trends but are differentiated by their per capita emissions trends. Cluster 3, with a positive per capita emissions trend, contains mostly cities throughout the developing world in addition to the two largest cities in Australia, Sydney and Melbourne. Cluster 4, with the lowest positive emissions trend, is the only cluster to show a negative per capita emissions trend. It consists almost entirely of cities in the developed world. The presence of Baghdad in this cluster suggests the role of conflict in reducing per capita emissions.

National emissions impact of urbanisation
We define the rate of urbanisation as the trend in the proportion of the population living in cities. Let us define the national population as $P$, the urban fraction of the population as $c$, the urban per capita emissions as $u$ and the non-urban per capita emissions as $n$. We can write the national emissions $F$ as

$$F = P\left[c\,u + (1 - c)\,n\right]$$

The role of urbanisation in trends in national emissions is the contribution of $dc/dt$ to $dF/dt$, i.e., $P(u - n)$. We also know that the national per capita emissions $e$ are given by

$$e = c\,u + (1 - c)\,n$$

Some manipulation yields the coefficient of $dc/dt$ in $dF/dt$ as $P\,(u - e)/(1 - c)$. Thus, urbanisation contributes positively to the trend in national emissions if urban per capita emissions are higher than the national average and vice versa. Changes in the role of urbanisation hence depend on $\frac{d}{dt}(u - e)$, the trend in urban versus national per capita emissions. We can calculate the difference in per capita emissions for a reference year and the trend in this difference over our data set. We calculate the emissions for a reference year by fitting a linear regression to the per capita emissions and calculating the value of the resulting fit at the reference year (in this case 2010.5, representing the 2010 average). Figure 4 shows both these values for 39 countries containing cities in our data set. We average reference per capita emissions and trends for multiple cities in one country. The averaging is population-weighted. Egypt had too few points in its national emissions to allow calculation of trends and it is therefore excluded.

Figure 4 (panel a) shows that the current role of urbanisation is mixed. There is a tendency for developed countries to have urban per capita emissions lower than national emissions. This is by no means universal and several developing countries show the same behaviour. The case for trends in per capita emissions (panel b) is less equivocal. Here most countries show more rapid growth in urban than national emissions, though again there are exceptions.
There is little difference between developed and developing countries in this regard. One important exception is China, whose 2010 urban per capita emissions are larger than the national value but with much slower growth. The results themselves do not shed light on whether this striking anomaly is a result of particularly emissions-efficient growth in China's cities or the result of an explicit policy to move high-emissions industries away from cities, trading off urban and non-urban CO2 emissions [40]. Cheng et al. [41] came to a similar conclusion using a traditional version of the Kaya identity and a cross-section of Chinese cities for the period 2000-2016.
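The algebra behind the urbanisation coefficient used in this section can be checked numerically. The sketch below, with hypothetical values for the population, urban fraction and per capita emissions, confirms that $P(u - n)$ and $P(u - e)/(1 - c)$ agree once $e = cu + (1 - c)n$ is substituted. The function names are ours and are not from the paper.

```python
def national_per_capita(c, u, n):
    """National per capita emissions e = c*u + (1 - c)*n."""
    return c * u + (1.0 - c) * n

def urbanisation_coefficient(P, c, u, e):
    """Coefficient of dc/dt in dF/dt, written as P*(u - e)/(1 - c)."""
    if not 0.0 <= c < 1.0:
        raise ValueError("urban fraction c must lie in [0, 1)")
    return P * (u - e) / (1.0 - c)

# Hypothetical country: 50 million people, 60% urban,
# urban and non-urban per capita emissions of 2.0 and 1.0 (arbitrary units).
P, c, u, n = 50e6, 0.6, 2.0, 1.0

e = national_per_capita(c, u, n)
coef = urbanisation_coefficient(P, c, u, e)

# The same coefficient written directly in terms of u and n.
direct = P * (u - n)
print(e, coef, direct)
```

Because $1 - c$ is positive, the sign of the coefficient is the sign of $u - e$, which is why the comparison of urban and national per capita emissions is sufficient to classify each country.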

DISCUSSION
There are several caveats to the analysis presented above. First, the spatial structure of CO2 emissions calculated here is deduced from the distribution of satellite-derived nighttime lights starting in the early 1990s [42][43][44]. These are used to downscale country-level fossil fuel emissions provided by various national and international agencies [45][46][47]. When considering urban CO2 emissions, the important quantity is the proportion of nighttime lights irradiance arising from the city compared to the country as a whole. There are two potential problems with this proxy. First, trends in the proportion of urban to country nighttime lights that arise from the different penetration rates of lighting technologies in urban and rural areas will contaminate our results. We expect these to introduce noise rather than bias since the take-up of these technologies is highly variable. This problem is most acute for the reference emissions for 2010.
The other problem is that the time-averaged spatial distribution of emissions might not be perfectly represented by nighttime lights. If this misrepresentation occurs, the temporal variation driven by urban growth will be incorrect. This problem highlights the need for better spatial proxies for individual emission sectors, but at the moment these are not available. Recent advances in the EDGAR emissions product [28] may provide a path forward but will require extreme care in how proxies are used to downscale national statistics. Analysis by Doll et al. [42] suggests that nighttime lights are a reasonable proxy for temporal snapshots of spatial emissions, but there are few independent data sets to assess trends. There is also some ambiguity in the quantity represented by the ODIAC results displayed in Fig. 4. The intensity of nighttime lights is a mixed indicator of emissions (scope 1) and energy consumption (scope 2). The comparisons with bottom-up inventories carried out by Asefi-Najafabady et al. [45] suggest this is not a major limitation for this application. Furthermore, the role of urban development in overall emissions is also a mixed scope 1 and scope 2 problem, so it is likely the nighttime lights distribution captures the relevant dynamics.

Fig. 3 Cluster assignments for the 91 cities in the study. The clusters are grouped into four and their characteristics are described in Table 3.
The results of this study have implications both for CO2 emissions projections and preferred modes of urban development. First, we note the range of trends in per capita emissions. The per capita CO2 emissions trend is the largest single contributor to the urban CO2 emissions trend, explaining, alone, 75% of the variance. Reducing this trend in the developing world seems, from our analysis, to be a general and powerful mitigation pathway. Cluster 2 (largely China) offers evidence that this is possible. In developed countries, our analysis suggests that the evolution of urban density is an opportunity for mitigation. Figure 4 does not show clear differences between developed and developing countries in the current role of urbanisation. The comparison of trends does not suggest any simple relationship as in the EKC. If there are discernible contributors to per capita CO2 emissions trends, these might be valuable points of policy intervention to limit emissions growth. Possible candidates include urban form [48] and economic specialisations.
The correlations among contributors to urban emissions also contain pointers for policy intervention. Table 1 shows that, far from densifying, rapidly growing cities are generally thinning. This is associated with a growth in per capita CO2 emissions. There is also a direct correlation between trends in population density and per capita emissions. Cities choosing a development path with greater population density are also minimising their growth in per capita CO2 emissions. The historical link between urban planning and carbon efficiency should motivate city managers to strengthen policies on urban density. Figure 4 also carries an important lesson for studies of urban development and emissions. Panel a describes a snapshot in time. The snapshot suggests that per capita emissions decrease relative to national totals as cities develop. The trend analysis (panel b) suggests this is not the case. This highlights again the importance of data sets that can probe the temporal and spatial aspects of urban development and the risks of using a static view to predict evolution. The complex relationship between size and emissions trend is supported by Crippa et al. [27], who showed that different urban categories had different trends.
This work is an opening exploration of a potentially rich data set. While we have captured most of the world's largest cities, some are missing due to incomplete Landsat data or the impossibility of determining an urban boundary in a large agglomeration such as the U.S. East Coast. Nonetheless, we should broaden the coverage of the data, in particular to include the mass of smaller cities which are also changing rapidly.
We stress that this analysis is descriptive rather than causal. Trends in multiple contributors may have common underlying drivers. Also, we do not consider energy flows (either embodied or direct) between urban and non-urban regions. An important future task is to investigate the underlying drivers of the relationships exhibited here. For example, how important is the trend in per capita GDP as an explanatory variable, and can we learn anything about the carbon efficiency of the economies of different cities? This requires considerable care since many of the data sets attempting to spatially allocate economic activity also use nighttime lights as proxies, confounding the required independence of the explanatory variable.
We analysed trends of CO2 emissions for 91 of the world's largest cities using algorithmically generated urban boundaries overlaid on gridded fossil fuel CO2 emissions data. The average growth rate of 4.4%/yr reflects rising per capita CO2 emissions globally and the growth in our chosen cities. With a modified Kaya identity as a framework, we decomposed urban CO2 emissions into three contributing variables: area, population density and per capita CO2 emissions. The trend in area contributes to CO2 emissions growth across almost all cities while the trend in per capita CO2 emissions makes a large contribution in most developing countries. Population density and per capita CO2 emissions trends correlate negatively, suggesting a relationship between changing urban form and per capita CO2 emissions. For our reference year of 2010, the per capita emissions in developed countries are generally lower than the national average, while those in developing countries are generally higher. With the strong exception of China, emissions trends are generally larger for our chosen cities than the national averages, suggesting that urbanisation will play an increasing role in driving national emissions and highlighting the importance of mitigation policies for cities.

Datasets and exceptions
Our task is to calculate trends in fossil fuel emissions for major cities, decompose these into their dominant contributors and investigate possible patterns in these contributors. Our chosen contributors are area, population density and per capita emissions. Our analysis requires data for urban extent, fossil fuel CO2 emissions and population. Urban extent is generated from the Built-up, Nighttime lights and Travel Time for Urban Size (BUNTUS) algorithm of Luqman et al. [49]. This algorithm defines a metric based on land cover classification, nighttime lights intensity and travel time to an urban centroid. All contiguous 30 arc-second pixels scoring above a threshold are included and the algorithm accounts for non-urban islands such as large parks or open space inside cities. We commenced our analysis with 91 cities chosen mainly by population [50]. The 91 cities span 39 countries, with nineteen in China, twelve in the United States and nine in India. All other countries represented have three or fewer cities. Our study period covers 2000-2018, the longest period for which all our required data sets exist. Some large cities are excluded from the study. There are two reasons for this. The first is that the necessary imagery may not exist across enough of our study period. The usual gap is the Landsat imagery necessary to characterise the urban boundary. The second is that some cities exist as parts of such large agglomerations that their boundaries cannot be defined by physical data. The clearest example is New York City, which forms part of a larger agglomeration on the East Coast of the U.S. The excluded cities are listed in Table 1. There is a problem of land cover classification for some cities. As noted by Luqman et al. [49], a common trajectory for growing cities is that two regions defined as urban by BUNTUS fuse into a single region. This changes the area, and consequently the total emissions, suddenly, complicating trend analysis. Where this occurs we carry out the analysis for the whole period and include any city defined within the largest boundary of our chosen city (usually 2018). For example, Beijing commences with 13 cities in 1998 and finishes with one in 2018. The name we assign to the final city is its name in 2018.
Gridded population estimates come from the LandScan product [51]. LandScan is a global population database depicting an ambient (24-hour average) population distribution. The LandScan methodology disaggregates subnational census information through a suite of dynamically adaptable algorithms using spatial data, imagery-derived spatial products, and manual corrections. LandScan exploits spatial data and imagery analysis technologies in a multi-variable asymmetric modeling approach [52]. LandScan data represents an average, or ambient, population that integrates diurnal movements and collective travel habits into a single measure [52]. This is different from purely residential population maps but is better suited for comparison with emissions, which include both residential and non-residential activities of the target population.
CO2 emissions are taken from the Open-Data Inventory for Anthropogenic Carbon dioxide (ODIAC) [53]. ODIAC is a global high resolution (1 km × 1 km) fossil fuel CO2 emission data product [53,54]. ODIAC is based on spatial disaggregation of CO2 emission estimates made by the Carbon Dioxide Information Analysis Center (CDIAC) [55]. CDIAC emissions are estimated by fuel type (solid, gas, and liquid fuels, bunker fuel, and gas flares) plus cement production, rather than by the emission sector that is often used for national inventory compilation [56]. The ODIAC spatial disaggregation is done in two steps. First, emissions from point sources (mainly power plants) are estimated and mapped using the power plant emission estimates and geolocation taken from a global power plant database. The rest of the emissions (country total minus point source emissions), which we refer to as non-point source emissions, are distributed using the spatial distribution of satellite-observed nighttime lights (NTL) intensities [53,54]. Non-point source emissions are disaggregated to a 1 km × 1 km spatial resolution using Defense Meteorological Satellite Program (DMSP) calibrated radiance and Visible Infrared Imaging Radiometer Suite (VIIRS) NTL datasets, with mitigated saturation effect, developed by the National Oceanic and Atmospheric Administration's (NOAA) Earth Observation Group [57]. The calibrated radiance NTL data is a merged product of the regular DMSP NTL product and benefits from reduced gain observations [58]. Oda et al. [53] show an improved spatial emissions distribution over the original publication by Oda and Maksyutov [54] due to the use of the calibrated radiance data. We calculate emissions for each city in each year by summing all emissions from ODIAC for that year which lie within the polygon defined by BUNTUS. All the data values are provided as Supplementary Table 2.
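The final summation step can be sketched as below. This is a minimal illustration on a toy grid, not the processing chain used for ODIAC and BUNTUS: the grid, the square boundary and the rule that a cell is counted when its centre falls inside the polygon are all assumptions of the sketch, since the text does not say how cells straddling the boundary are treated.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through (x, y)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def city_emissions(grid, x0, y0, cell, polygon):
    """Sum the emissions of all cells whose centres lie inside the polygon.

    grid[row][col] is the emission of the cell whose lower-left corner is
    (x0 + col * cell, y0 + row * cell).
    """
    total = 0.0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            cx = x0 + (c + 0.5) * cell
            cy = y0 + (r + 0.5) * cell
            if point_in_polygon(cx, cy, polygon):
                total += value
    return total

# Toy 3 x 3 emission grid with unit cells (hypothetical values).
grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]

# Hypothetical urban boundary covering the lower-left 2 x 2 block of cells.
boundary = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

total = city_emissions(grid, 0.0, 0.0, 1.0, boundary)
print(total)
```

Repeating the sum with each year's grid and each year's boundary would give the per-city emission time-series to which the trend analysis is applied.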

Kaya Identity
We proceed by analogy with economics, which frequently decomposes Gross Domestic Product (GDP) as a product of three terms

$$\mathrm{GDP} = \text{population} \times \text{participation} \times \text{productivity} \qquad (4)$$

Raupach et al. [13] used a decomposition for national emissions. We write urban CO2 emissions using a modified Kaya identity [59] as a product of urban area, population density and per capita CO2 emissions,

$$E = A\,p\,e \qquad (5)$$

where $E$ is the total CO2 emissions, $A$ the urban area, $p$ the population density (persons per unit area) and $e$ the per capita CO2 emissions (tons carbon per person). We use upper case for extensive and lower case for intensive variables. Following Raupach et al. [13] we use a logarithmic transformation to decompose the proportional trend in CO2 emissions as

$$\delta E = \delta A + \delta p + \delta e$$

where $\delta$ represents a proportional trend defined by

$$\delta x = \frac{1}{\bar{x}}\,\frac{dx}{dt}$$

and is usually expressed as a percentage per year. For a quantity $x$ we calculate $\delta x$ as follows:
1. We start with a time-series $x(t)$, which is often sparse since some years lack urban boundary data (see Luqman et al. [49] for an explanation).
2. Fit a linear regression $L(t) = a + bt$ to $x(t)$, so that the slope $b$ estimates $dx/dt$.
3. Calculate $\bar{x}$ as $L(t_{\mathrm{ref}})$, where $t_{\mathrm{ref}}$ is the midpoint of our study period. $\bar{x}$ is hence an estimate of the average assuming linearity with time.
4. We repeat this procedure for $E$, $A$, $p$ and $e$.
We stress that while expressions like Eq. 5 are mathematical identities, they are not statements of causality but may elucidate underlying causes. We apply the modified Kaya identity to our 91 cities.
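The trend calculation described in this section can be sketched as follows. The series is hypothetical and deliberately sparse, the reference time of 2009 is taken as the midpoint of 2000-2018, and reading the proportional trend as the fitted slope divided by the fitted value at the reference time is our interpretation of the steps above rather than code from the study.

```python
def linear_fit(ts, xs):
    """Ordinary least-squares fit x = a + b*t; returns (a, b)."""
    n = len(ts)
    t_mean = sum(ts) / n
    x_mean = sum(xs) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in zip(ts, xs))
    b = sxy / sxx
    a = x_mean - b * t_mean
    return a, b

def proportional_trend(ts, xs, t_ref):
    """Slope of the linear fit divided by the fitted value at t_ref."""
    a, b = linear_fit(ts, xs)
    x_bar = a + b * t_ref  # estimate of the period average under linearity
    return b / x_bar

# Hypothetical urban-area series with missing years (arbitrary units).
years = [2000, 2003, 2008, 2012, 2018]
area = [100.0 + 2.0 * (t - 2000) for t in years]

delta_area = proportional_trend(years, area, t_ref=2009)
print(f"proportional trend: {100 * delta_area:.2f} %/yr")
```

Applying `proportional_trend` separately to $E$, $A$, $p$ and $e$ gives the four terms of the decomposition. Because each factor is fitted independently with a straight line, the sum of the three contributor trends should be expected to match the emissions trend only approximately.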

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
We used the data from the following sources in our analysis. The city boundary data (BUNTUS) is available at http://thebuntus.com/paper_page.html. The gridded population data (LandScan) is available at ref. 51 and the gridded fossil fuel CO2 emission dataset (ODIAC) is available at ref. 53.

Received: 10 February 2022; Accepted: 31 January 2023