Forecasting Marine Heatwaves using Machine Learning

Severe warm-water episodes have occurred frequently in recent years against a background trend of global ocean warming. Sea surface temperature (SST) anomalies affect the integrity of marine ecosystems, which are an important part of the Earth’s climate system. The impact of Marine Heatwaves (MHWs) on aquatic life has grown steadily in recent years, damaging aquatic ecosystems and causing enormous loss of marine life. As a result, the study of Marine Heatwaves has become a fast-growing field of inquiry. Operational forecasting and early warning systems that can predict such events can support proactive planning and better mitigation strategies. In this study, the potential of two machine learning models, Random Forest and N-BEATS, was evaluated for predicting sea surface temperature on a seasonal scale using the NOAA OISST v2.1 dataset. The predicted sea surface temperature data were then used to forecast the occurrence of Marine Heatwaves up to a year in advance. The proposed models were tested on four historical Marine Heatwave events around the world. The results showed that the models were able to capture the onset, trend, and extent of the extreme events accurately.


Introduction
The rising effects of anthropogenic climate change are increasing the likelihood and intensity of short-term oceanic warming events, also known as Marine Heatwaves (MHWs). These high-temperature extreme events have large-scale impacts on natural ecosystems and subsequent socioeconomic consequences. For example, the 2014-2016 MHW event in the Northeast Pacific Ocean caused a mass mortality event in which 62 thousand sea birds (Uria aalge) were found dead (Piatt et al., 2020). Similarly, the 2014-2019 MHW event in the Gulf of Alaska contributed to an economic loss of US$ 103 million every year to the fishery industry (Barbeaux et al., 2020). A repertoire of climate-resilient approaches, including improved marine heatwave forecasting, proactive resource management, and enhanced resilience, is urgently needed.
Only a limited number of studies have evaluated the potential of sea surface temperature forecasts to predict MHWs. The study by Jacox et al. (2019) is one of the earliest attempts. It evaluated the occurrence of four MHW events between 2014 and 2016 in the California Current System (CCS) using eight different coupled climate models. All the trained models were able to capture the rising temperature anomalies beginning in late 2013, and they correctly predicted warmer than average summer temperatures in 2014. During the second anomaly event, beginning in late 2014, none of the models, including those trained 2 months before the event, were able to predict the rising temperatures caused by wind stress anomalies. The authors noted that the major hurdle in using climate models for MHW prediction is their coarse resolution, which leaves them unable to model fine-scale ocean processes and makes it difficult to predict MHW events in smaller regions like the CCS.
One study investigated the potential of the Australian Community Climate and Earth System Simulator Seasonal version 1 (ACCESS-S1) ocean-atmosphere model to predict the 2020 marine heatwaves in the Great Barrier Reef on a sub-seasonal scale. The model was driven with the NOAA's daily and monthly mean interpolated outgoing longwave radiation and wind velocity, air-sea heat fluxes from the ERA5 dataset. The model correctly predicted the onset of MHWs a week in advance but was not able to capture the end of the MHW. More recent work also explored the use of ACCESS-S2 for developing monthly MHW forecasts, which achieved a hit rate of up to 40 percent when forecasting 4 months ahead. The authors emphasized the importance of developing more accurate seasonal forecasts.
Numerous studies in the past decade have tried to predict SST, using approaches ranging from physical equation-based ocean-climate models to recent deep learning architectures such as CNNs (Saxena, 2021) and Convolutional Long Short-Term Memory (Conv-LSTM) networks. Xiao et al. (2019) showed that machine learning models such as a Long Short-Term Memory (LSTM) deep neural network and an AdaBoost ensemble are effective in predicting short- and mid-term daily SST in the range of 1 to 10 days. Wolff et al. (2020) reviewed the potential of Generalised Additive Models (GAMs), Random Forest (RF), XGBoost, Multi-layer Perceptron (MLP) and LSTM networks to predict SST for 562 days. Their results showed that LSTM performed poorly compared to the other models, as it failed to capture the high-frequency variations in the input dataset. RF performed best among the individual models; however, the ensemble average of all the models showed even higher accuracy.
As the previous literature demonstrates, forecasting SST accurately at a region-specific scale is the key to accurate and reliable MHW forecasts.
In this work, two data-driven models, Random Forest and N-BEATS, are used to forecast the occurrence of MHW events on a seasonal scale, i.e., as monthly forecasts. The proposed models are then tested on four historical MHW events across the world, and their performance is assessed using the Hit Rate and the overall model accuracy.

Data and Study Areas
In our experiments, we used SST data from the Optimum Interpolation Sea Surface Temperature (OISST) v2.1 dataset (Huang et al., 2021). OISST is an analysis dataset constructed by combining observations from different platforms (satellites, ships, buoys and Argo floats) on a 0.25 degree global grid. A spatially complete SST map is produced by interpolating to fill in the observations missing from the dataset. OISST v2.1 data are available from the NOAA National Centers for Environmental Information for the period starting September 1, 1981 and are updated daily. We used the monthly average SST values from the OISST dataset.
Four historical MHW events were selected as study areas. During the Gulf of Alaska and Bering Sea event, the sea surface temperature increased by 1-2 degrees through September 2016 (Walsh et al., 2018). In the Northeast Pacific, the temperature anomalies reached up to 1.76°C throughout 2014 and 2015. On the West Coast of Australia, the SST anomalies reached as high as 5.1°C during the Ningaloo Niño event in 2011 (Benthuysen et al., 2014), and during the East China Seas MHW event in 2016, the sea surface anomalies rose by 2°C (Tan and Cai, 2018).

Methods
To classify MHWs, we used the definition presented in Scannell et al. (2020), under which an MHW is an event in which SST exceeds the monthly climatological 90th percentile for at least one month, with the percentile computed from monthly data from January 1986 to December 2020. To detect MHWs in the OISST dataset, we used the Python package Ocetrac (Scannell et al., 2021).
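The threshold rule in this definition can be illustrated with a minimal sketch in plain Python. It is not the Ocetrac API and not the code used in the study; the function names, the linear-interpolation percentile, and the toy values are assumptions made for illustration.

```python
from collections import defaultdict


def percentile(values, q):
    """Linearly interpolated q-th percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def monthly_thresholds(sst, months, q=90.0):
    """Map each calendar month (1-12) to the q-th percentile of its SST values."""
    by_month = defaultdict(list)
    for value, month in zip(sst, months):
        by_month[month].append(value)
    return {month: percentile(values, q) for month, values in by_month.items()}


def flag_mhw(sst, months, thresholds):
    """Flag each monthly SST value that exceeds its calendar-month threshold."""
    return [value > thresholds[month] for value, month in zip(sst, months)]


# Toy baseline (hypothetical values): eleven Januaries at one grid point.
baseline = [10.0 + i for i in range(11)]
thresholds = monthly_thresholds(baseline, [1] * 11)
flags = flag_mhw([19.0, 20.0], [1, 1], thresholds)
```

In practice the same rule would be applied independently at every grid point, with the baseline covering the full climatological period.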
The Darts package for Python (Herzen et al., 2021) was used to implement the time-series forecasting models described below. The training data for a study area consisted of all data up to the start of the MHW event, and the test data was the year in which the MHW event occurred. To predict the value at a given time step, we used the previous 180 months of data as lags.
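As a rough illustration of this setup, the sketch below builds lagged training samples and splits a monthly series around an event in plain Python. It does not reproduce what Darts does internally; the helper names, the `event_start` index, and the 12-month test window are assumptions for illustration.

```python
def make_lagged_samples(series, n_lags):
    """Pair each target value with the n_lags values that immediately precede it."""
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(series[t - n_lags:t])
        targets.append(series[t])
    return features, targets


def split_before_event(series, event_start, horizon=12):
    """Train on everything before event_start; test on the next `horizon` months."""
    return series[:event_start], series[event_start:event_start + horizon]


# Toy monthly series (hypothetical); the study uses n_lags = 180.
series = [float(i) for i in range(30)]
train, test = split_before_event(series, event_start=18)
X, y = make_lagged_samples(train, n_lags=3)
```

With 180 lags, each training sample spans 15 years of monthly history, so the series must be considerably longer than 180 months before the event for any samples to exist.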

Random Forest
Random Forest (RF) is a tree-based machine learning algorithm consisting of many individual decision trees that operate as an ensemble. In classification, each tree gives a class prediction and the class with the most votes becomes the model's prediction; in regression, as in the SST forecasts here, the predictions of the individual trees are averaged. RF works well because a large number of relatively uncorrelated trees operating as an ensemble will outperform any of the individual trees (Breiman, 2001). To keep the individual trees from being too correlated, RF uses two main techniques. The first is Bagging (Bootstrap Aggregation): because decision trees are extremely sensitive to the data they are trained on, each tree is trained on a random sample of the training data drawn with replacement, which yields different trees. The second is Feature Randomness: whereas an ordinary decision tree considers every feature at each node split and picks the one that produces the most separation between the observations in the left and right sub-trees, each tree in an RF can choose only from a random subset of features. In oceanography, RF has been shown to model time series data accurately (Wolff et al., 2020; Liu et al., 2015). In our experiments, we used 100 estimators and set the lags to 180.
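The two decorrelation techniques can be sketched with a deliberately small ensemble of one-split regression trees (stumps) in plain Python. This is a toy illustration under stated assumptions, not the Darts or scikit-learn implementation: the function names are hypothetical, and because each stump has a single split, the random feature subset is drawn once per tree.

```python
import random


def fit_stump(X, y, feature_ids):
    """Fit a one-split regression tree using only the candidate features given."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            left = [t for row, t in zip(X, y) if row[f] <= threshold]
            right = [t for row, t in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    if best is None:  # no valid split found: predict the sample mean everywhere
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_forest(X, y, n_trees=100, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bagging: draw a bootstrap sample (with replacement) for this tree.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        # Feature randomness: restrict the split search to a random subset.
        feats = rng.sample(range(len(X[0])), n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Regression forest: average the predictions of all trees."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)
```

In the forecasting setting, each row of `X` would hold the lagged SST values and `y` the value to be predicted.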

N-BEATS
Neural Basis Expansion Analysis for Interpretable Time Series, or N-BEATS, is a recent deep neural architecture for time series forecasting. Backward and forward residual links, together with a very deep stack of fully connected layers, form the foundation of the architecture. If the target output size is L, the input size of the time series is an integer multiple n of L. The n * L dimensional input is processed by stacks of basic Blocks, such as a trend stack and a seasonality stack. Every basic Block starts with a four-layer fully connected stack, which then splits into two branches, each of which connects to a further set of fully connected layers. Each Stack is organized in a Double (Backcast and Forecast) Residual Stacking topology. Every subsequent Block receives as its input an n * L dimensional vector obtained by subtracting, element-wise, the previous Block's Backcast output from that Block's input. A Stack's output consists of an n * L dimensional Stack Backcast Output, which is fed into the next Stack, and an L dimensional Stack Forecast Output, which is summed element-wise with the forecast outputs of the other Stacks to generate the final L-dimensional forecast vector. To summarise, the overall design consists of two stacks, with the trend stack followed by the seasonality stack (Oreshkin et al., 2020), combined with a double residual stacking topology built on the forecast-backcast principle. In comparisons of univariate time-series forecasting methods, for example in financial data forecasting, N-BEATS has been found to be one of the best performing models (Karanikola et al., 2022). In our model, we used 8 layers, 12 stacks, 180 lags and trained up to 100 epochs.
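The data flow of the double residual stacking can be sketched in a few lines of plain Python. The sketch only shows how backcasts and forecasts are combined; `toy_block` is a hypothetical stand-in for a trained Block, and its fixed rule and the horizon of L = 1 are assumptions for illustration.

```python
def run_stack(blocks, x):
    """Apply doubly residual stacking to the input window x.

    Each block maps its input to (backcast, forecast). The backcast is
    subtracted element-wise from the block input to form the next block's
    input, and the block forecasts are summed element-wise.
    """
    residual = list(x)
    total_forecast = None
    for block in blocks:
        backcast, forecast = block(residual)
        residual = [r - b for r, b in zip(residual, backcast)]
        if total_forecast is None:
            total_forecast = list(forecast)
        else:
            total_forecast = [a + f for a, f in zip(total_forecast, forecast)]
    return residual, total_forecast


def toy_block(x):
    """Hypothetical block: explains half of its input, forecasts one step."""
    backcast = [0.5 * v for v in x]
    forecast = [sum(backcast) / len(backcast)]
    return backcast, forecast


residual, forecast = run_stack([toy_block, toy_block], [4.0, 8.0])
```

Each block therefore only has to model what the earlier blocks left unexplained, while the final forecast accumulates every block's contribution.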

Sea Surface Temperature Forecast Evaluation
To evaluate the SST predicted by the different models, two well-known error metrics, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), were used, defined as

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| F_i - O_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( F_i - O_i \right)^2}$$

where $F_i$ is the forecasted SST value, $O_i$ is the observed SST value, and $n$ is the number of coordinate points. The results are described in Table 1.
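Both metrics are straightforward to compute; the following minimal sketch uses plain Python, with function names and toy values chosen for illustration rather than taken from the study's code.

```python
import math


def mae(forecast, observed):
    """Mean Absolute Error between forecasted and observed SST values."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def rmse(forecast, observed):
    """Root Mean Square Error between forecasted and observed SST values."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))


# Toy values (hypothetical) at two coordinate points.
forecast = [20.0, 21.0]
observed = [19.0, 23.0]
errors = (mae(forecast, observed), rmse(forecast, observed))
```

Because RMSE squares the errors before averaging, it penalises a few large misses more heavily than MAE does, which is why the two are usually reported together.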

Marine Heatwave Forecast Evaluation
To assess the accuracy and effectiveness of the MHW forecasts we used two metrics defined by :

Hit Rate
The hit rate is the ratio of correctly predicted MHW events to the total number of observed events, that is, hits / (hits + misses). A 'hit' means that the occurrence of an MHW event has been correctly captured by the model, i.e., a true-positive heatwave prediction, and a 'miss' refers to a false-negative heatwave prediction. The hits and misses are first calculated over the coordinate points for each month of the 12-month prediction period, and the hit rate is then averaged over the entire year for each study area. The hit rates for the study areas are reported in Table 2.
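A minimal sketch of this calculation in plain Python is given here. The function names are hypothetical, and skipping months in which no MHW was observed (where the ratio is undefined) is an assumption, since the text does not say how such months were handled.

```python
def hit_rate(predicted, observed):
    """Hits / (hits + misses) over the grid points of one month.

    Returns None when no MHW was observed, since the ratio is undefined.
    """
    hits = sum(1 for p, o in zip(predicted, observed) if o and p)
    misses = sum(1 for p, o in zip(predicted, observed) if o and not p)
    if hits + misses == 0:
        return None
    return hits / (hits + misses)


def yearly_hit_rate(predicted_by_month, observed_by_month):
    """Average the monthly hit rates, skipping months with no observed MHW."""
    rates = [hit_rate(p, o) for p, o in zip(predicted_by_month, observed_by_month)]
    rates = [r for r in rates if r is not None]
    return sum(rates) / len(rates) if rates else None
```

Each inner list holds one boolean per coordinate point, `True` where an MHW is predicted or observed in that month.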

Accuracy
Accuracy is the number of correct predictions made by the model divided by the total number of coordinate points in a study area. The accuracy is calculated over the coordinate points of each study area for each month of the 12-month prediction period and then averaged over the entire year. The accuracy values for the study areas are reported in Table 2.
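Unlike the hit rate, accuracy also rewards correctly predicted non-events. A minimal sketch in plain Python, with hypothetical function names, might look as follows; it assumes that a prediction counts as correct when the predicted MHW / no-MHW label matches the observed one.

```python
def accuracy(predicted, observed):
    """Share of grid points whose MHW / no-MHW label is predicted correctly."""
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return correct / len(observed)


def yearly_accuracy(predicted_by_month, observed_by_month):
    """Average the monthly accuracies over the prediction period."""
    scores = [accuracy(p, o) for p, o in zip(predicted_by_month, observed_by_month)]
    return sum(scores) / len(scores)
```

When every month covers the same set of coordinate points, averaging the monthly accuracies gives the same result as pooling all point-month predictions.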

Discussion
RF and N-BEATS performed similarly in predicting SST for all four study areas. Neither consistently outperformed the other: each predicted better than the other in some study areas and worse in others. Both models were able to capture the trend, development, and extent of the MHW events in all the study areas. In the Northeast Pacific (NEP) region, the SST forecasts had the highest RMSE and MAE values, yet the MHW forecast still achieved an accuracy above 90 percent. This could be explained by the fact that the NEP region has had consistently high temperatures since 2014 (Benthuysen et al., 2014), which lowered the threshold for MHW detection. In the East China Sea, both models achieved an accuracy of about 85% and the same hit rate of 86.5%. In the AL region, RF performed better in SST forecasting (Fig. 2 & 3) and the accuracy of both models was about 88% (Fig. 4, 5 & 6). The results demonstrate that machine learning methods can forecast SST and MHW events accurately at a seasonal scale with low computational cost, and can be adapted to other geographical locations with varying conditions.

Conclusion
This study demonstrated the potential of machine learning methods, particularly N-BEATS and RF, for predicting MHW events on a seasonal scale. The monthly average SST values from OISST v2.1 were used to forecast the occurrence of extreme events up to a year in advance, and the proposed models were tested across four different geographies around the world. Results showed that both models were able to forecast the onset, extent and decline of the MHWs accurately, with no significant difference in accuracy between them. Future improvements could include using methods like Graph Neural Networks (GNNs) to better model the spatio-temporal relations between geographical points.
Funding Statement: This work received no specific grant from any funding agency, commercial or not-for-profit sectors.
Competing Interests: The authors declare no competing interests.
Data Availability Statement: All of the data and code used in this study are publicly available and can be found here: https://doi.org/10.5281/zenodo.5920401
Ethical Standards: The research meets all ethical guidelines, including adherence to the legal requirements of the study country.