Analysis of the potential of NLP techniques to identify climate change themes in Canadian social media textual content

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Negar Shabanpour, Stephane Roche, Sehl Mellouli

Abstract

Social media discussions about climate change offer valuable insights into how the public views climate issues and how willing people are to take personal climate action. Understanding individual climate actions is crucial because households are responsible for more than 70% of global carbon emissions, and lifestyle changes alone can reduce carbon emissions by approximately 15%. This research examines 17,926 textual entries posted to two Canadian subreddits (r/ClimateCrisisCanada, r/AskACanadian) between January 2023 and December 2024 to identify the individual activities and behaviours associated with greenhouse gas (GHG) emissions that are discussed in Canadian climate discourse, addressing a gap in the understanding of authentic climate behaviour priorities. We employed BERTopic (Bidirectional Encoder Representations from Transformers for topic modelling) in three configurations: baseline class-based Term Frequency-Inverse Document Frequency (c-TF-IDF), Maximal Marginal Relevance (MMR) diversity optimization, and KeyBERT semantic enhancement. This multi-configuration approach was essential because individual climate behaviours are discussed using diverse terminology and in overlapping contexts, so different analytical lenses are needed to capture the full spectrum of behavioural discussions without missing nuanced distinctions. Thematic analysis revealed that Canadians predominantly discuss four key behavioural domains in climate discourse: energy management (the largest category, with 3-6 topics depending on model configuration), transportation choices (consistently 2 topics across all models), dietary decisions (2-3 topics), and consumption patterns (1 topic focused primarily on waste management and recycling). Energy discussions proved most diverse, encompassing residential heating solutions, renewable energy adoption, and nuclear power preferences, whilst transportation remained consistent across models, with distinct themes of sustainable mobility options and vehicle technology transitions. The comparative modelling approach demonstrated that different BERTopic configurations capture complementary aspects of climate behaviour discourse, with KeyBERT's semantic enhancement providing the most detailed categorization of technical solutions. These findings provide empirical evidence for understanding Canadian individual climate action priorities, offering essential insights for targeted climate policy and communication strategies.
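
For readers unfamiliar with the first two configurations named in the abstract, the following is a minimal, dependency-free sketch of the ideas behind c-TF-IDF term weighting and MMR keyword re-ranking. It is not the authors' pipeline and does not call the BERTopic library: the function names (c_tf_idf, mmr, cosine), the toy documents, and the two-dimensional "embeddings" are all hypothetical and chosen only for illustration. The weighting formula assumed here is term frequency within a topic multiplied by log(1 + A / f), where A is the average number of words per topic and f is the term's frequency across all topics.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Weight terms per topic as tf(term, topic) * log(1 + A / f(term))."""
    # Treat all documents assigned to one topic as a single bag of words.
    counts = {
        topic: Counter(word for doc in docs for word in doc.lower().split())
        for topic, docs in docs_by_topic.items()
    }
    # f(term): frequency of each term across all topics.
    corpus_freq = Counter()
    for bag in counts.values():
        corpus_freq.update(bag)
    # A: average number of words per topic.
    avg_words = sum(corpus_freq.values()) / len(counts)
    weights = {}
    for topic, bag in counts.items():
        total = sum(bag.values())
        weights[topic] = {
            word: (freq / total) * math.log(1 + avg_words / corpus_freq[word])
            for word, freq in bag.items()
        }
    return weights


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mmr(topic_vec, candidates, top_n, diversity):
    """Pick top_n keywords, trading relevance against redundancy."""
    relevance = {w: cosine(v, topic_vec) for w, v in candidates.items()}
    # Start with the keyword most similar to the topic itself.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(top_n, len(candidates)):
        def score(word):
            # Redundancy: similarity to the closest already-selected keyword.
            redundancy = max(
                cosine(candidates[word], candidates[s]) for s in selected
            )
            return (1 - diversity) * relevance[word] - diversity * redundancy
        remaining = [w for w in candidates if w not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Hypothetical toy data (not drawn from the study's Reddit corpus).
TOY_DOCS = {
    "energy": ["heat pump heating climate", "solar heat"],
    "transport": ["electric vehicle transit climate", "bike transit"],
}
# Hypothetical 2-D vectors standing in for real sentence embeddings.
TOPIC_VEC = (1.0, 0.0)
TOY_VECTORS = {
    "heat": (1.0, 0.1),
    "heating": (1.0, 0.12),
    "solar": (0.6, 0.8),
}

weights = c_tf_idf(TOY_DOCS)
plain_keywords = mmr(TOPIC_VEC, TOY_VECTORS, top_n=2, diversity=0.0)
diverse_keywords = mmr(TOPIC_VEC, TOY_VECTORS, top_n=2, diversity=0.7)
```

In this toy setting, a word shared by both topics ("climate") receives a lower weight than a word unique to one topic, and raising the diversity parameter makes the re-ranking prefer a keyword that differs from those already chosen over a near-synonym. KeyBERT-style semantic enhancement is omitted because it depends on a trained embedding model.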

DOI

https://doi.org/10.31223/X52F6S

Subjects

Environmental Studies

Keywords

climate behavior, social media analysis, topic modeling, BERTopic, individual climate action, Canadian climate discourse

Dates

Published: 2026-06-17 23:55

Last Updated: 2026-06-17 23:55

License

CC BY Attribution 4.0 International

Additional Metadata

Conflict of interest statement:
The authors declare no competing interests.
