Urban Running Activity Detected Using a Seismic Sensor during COVID-19 Pandemic

Human foot traffic in urban environments provides essential information for city planners to manage urban resources and for urban residents to plan their activities. Compared to camera or mobile-based solutions, seismic sensors detect human footstep signals with fewer privacy concerns. However, seismic sensors often record signals generated from multiple sources, particularly in an urban outdoor environment. We compare the spectra of natural and urban events commonly observed in a park in Singapore. For each three-second segment of seismic data, we define hierarchical screening criteria to identify footsteps based on the spectrum of the signal and its envelope. We derive the cadence of each runner by detecting the primary frequency of the footstep signals. The resulting algorithm achieves higher accuracy and higher temporal resolution for weak and overlapping signals compared to existing methods. Runner statistics based on 4-month-long seismic data show that urban running activities have clear daily and weekly cycles. Lockdown measures to mitigate the COVID-19 pandemic promoted running activities, particularly over the weekends. Cadence statistics show that morning runners on average perform better than evening runners.


Introduction
Human footsteps have been widely studied for the rich information they carry, not only about the individuals, but also about the surrounding environment and their interactions. Automatic footstep detection has been applied for indoor applications in fall detection [1], patient monitoring [2], individual physical and emotional health evaluation [3], indoor surveillance [4], customer behavior analysis [2], as well as providing occupancy, crowd dynamics, and human flow information [5].
Footstep events are easily detected from preprocessed time-domain seismic vibration data recorded indoors, owing to their periodic and spiky features. Pan et al. [2,6] extracted the step events (SEs) using an anomaly detection method in which an SE is detected if the sum of the squared values within a small window is more than three standard deviations above the mean of the modeled Gaussian noise. Tang et al. [5] separated the footstep signals from the background noise using an energy-threshold-based method in which footstep signals are extracted if the maximum energy among the sensor array is larger than a dynamically updated threshold based on the distribution of Gaussian noise. Li et al. [3] adopted the Short Time Average over Long Time Average (STA/LTA) method to detect the footstep signals. Clemente et al. [7] applied the S/L-Kurt Algorithm [8] to extract the footstep signals as well as fall-down vibrations.
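The STA/LTA idea mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in [3]: the function name `sta_lta`, the window lengths, and the synthetic trace are hypothetical choices made for the example.

```python
def sta_lta(samples, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, per sample.

    Both averages use the windows ending at the current sample.
    Samples with fewer than n_lta values of history keep a ratio of 0.0.
    Assumes n_sta <= n_lta.
    """
    # Prefix sums of the squared signal give O(1) window averages.
    prefix = [0.0]
    for x in samples:
        prefix.append(prefix[-1] + x * x)

    ratios = [0.0] * len(samples)
    for end in range(n_lta, len(samples) + 1):
        sta = (prefix[end] - prefix[end - n_sta]) / n_sta
        lta = (prefix[end] - prefix[end - n_lta]) / n_lta
        if lta > 0:
            ratios[end - 1] = sta / lta
    return ratios


# Hypothetical trace: weak alternating noise with one impulsive "step".
trace = [0.01 * (-1) ** k for k in range(200)]
trace[150] = 1.0
ratios = sta_lta(trace, n_sta=5, n_lta=100)
```

A detection would be declared wherever the ratio exceeds a chosen trigger level. The ratio stays near 1 for stationary noise and rises sharply at the impulse, which is also why such detectors struggle when the footstep amplitude is comparable to the ambient noise.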
For outdoor footstep applications such as outdoor surveillance [9][10][11][12][13][14], automatic footstep detection methods for seismic data recorded in a relatively quiet outdoor environment are similar to those used indoors. The impulsive and periodic footstep features can also be extracted using time-domain methods. Kurtosis [15] is a statistical measure based on the amplitude distribution of the vibration signals within a short time window. Because of their spiky signatures, footstep signals yield higher kurtosis than motor vehicles and background noise, and can be detected on that basis.
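To make the kurtosis criterion concrete, the sketch below computes the (non-excess) kurtosis of a window as the fourth central moment divided by the squared variance. The function name and the two synthetic windows are assumptions for illustration; the cited works may use a different estimator or window length.

```python
import math


def kurtosis(window):
    """Fourth central moment divided by the squared variance."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((x - mean) ** 2 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    return m4 / (m2 * m2) if m2 > 0 else 0.0


# Hypothetical windows: a single impulse versus one period of a sine wave.
spiky = [0.0] * 99 + [1.0]
smooth = [math.sin(2 * math.pi * k / 100) for k in range(100)]

k_spiky = kurtosis(spiky)
k_smooth = kurtosis(smooth)
```

The impulsive window produces a kurtosis far above that of the smooth oscillation (which is 1.5 for a pure sine), so a threshold on this value separates spiky events from smoother vibrations.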
Lacombe et al. [16] and Anghelescu et al. [13] detected the footstep signals by first calculating the kurtosis from the seismic data and then calculating the cadences to further determine whether the signals are footstep signals. Koc et al. [12] proposed a slow and quick adaptive thresholds algorithm to identify footsteps and vehicles. Time-domain footstep detection methods are easy to implement and efficient. However, they rely heavily on footstep amplitudes being large compared to the ambient noise, and they often miss or misidentify footsteps when the signal-to-noise ratio (SNR) of the footstep signals is low. Time-frequency methods, on the other hand, not only capture the nonstationary characteristics of the signals, but also partially remove the noise in the data. Xing et al. [17] proposed an SVD-based adaptive threshold denoising algorithm in the wavelet domain to remove the noise in the seismic data, and then detected the footstep signals based on the wavelet energy. However, they did not discuss whether the method can distinguish footstep signals from high-energy vibrations generated by motor vehicles in an urban outdoor environment.
Houston et al. [9] proposed a spectrum analysis method to detect the footstep signals from the spectrogram of the band-pass filtered and down-sampled envelope of seismic signals. Footsteps are then identified by verifying the value and SNR of the primary frequency as well as of the second or third harmonic. The basic principle of the spectrum analysis method [9] is illustrated in Figure 1. Figure 1a shows 1 minute of raw data, Figure 1b is the filtered data obtained using a 40-100 Hz band-pass filter, and Figure 1c is the envelope of Figure 1b after a 10 Hz low-pass filter. Figure 1d, 1e, and 1f show the normalized spectra obtained from the footstep signals, background noise, and signals generated by motor vehicles, respectively. Besides the zero-Hz peak, the spectrum of the footstep signals contains three clear spikes, which are harmonics. In Figure 1e and 1f, by contrast, the spectra are relatively flat and the energy is low beyond zero Hz. Using these different characteristics, Houston et al. [9] identify strong footstep signals whose primary amplitude is above 11 dB SNR and whose harmonic is above 7 dB in seismic data recorded 40 meters away from the road.
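The two decibel thresholds quoted above can be read as a simple pair of checks on the spectral peaks. The sketch below assumes that SNR is the ratio of a peak amplitude to a noise-level amplitude expressed as 20 log10; this definition, the function names, and the example values are assumptions, and [9] may define SNR differently (for a power ratio the factor would be 10).

```python
import math


def amplitude_db(peak, noise_level):
    """Amplitude ratio expressed in decibels (assumed SNR definition)."""
    return 20.0 * math.log10(peak / noise_level)


def passes_snr_checks(primary, harmonic, noise_level,
                      primary_db=11.0, harmonic_db=7.0):
    """True if both the primary and the harmonic exceed their thresholds."""
    return (amplitude_db(primary, noise_level) > primary_db
            and amplitude_db(harmonic, noise_level) > harmonic_db)


# Hypothetical peak amplitudes relative to a unit noise level.
strong = passes_snr_checks(primary=4.0, harmonic=2.5, noise_level=1.0)
weak = passes_snr_checks(primary=3.0, harmonic=2.5, noise_level=1.0)
```

Under this reading, a primary peak must be roughly 3.5 times the noise amplitude to pass, which illustrates why weak or overlapping footsteps are rejected by fixed SNR thresholds.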
In complex urban environments, however, footstep signals can be very noisy due to the high-level ambient noise coming from various kinds of natural and anthropogenic activities. The seismometer we use to record data in the West Coast Park in Singapore is about 5 meters from the footway in the park and about 7 meters from Harbour Drive (Figure 2). A large volume of motor vehicles enters and leaves the 24-hour Pasir Panjang port terminal [18]. As a result, strong noise generated by the motor vehicles was continuously recorded in the data. This strong near-field traffic noise may bury the footstep signals (Figure 3a). The West Coast Highway lies about 1.8 km northeast of the sensor, and the motor vehicles running on it also contribute to the low-frequency part of the data. In addition, frequent thunderstorms in Singapore result in high-energy spiky signals in the recorded data (Figure 3d). Thus, the algorithm proposed by Houston et al. [9] is insufficient to separate the footstep signals from other overlapping signals. Following the same workflow as in Figure 1, Figure 3c shows an example where weak footsteps cannot be identified, while Figure 3f shows an example where thunderstorms may be misidentified. To overcome these challenges, we adapt the spectrum analysis method proposed by Houston et al. [9] to achieve the following aims:
1. To distinguish thunderstorm signals from footstep signals.
2. To identify footstep signals with low SNR.
3. To provide automatic and stream processing for long-term monitoring of runner activities in urban environments.
The organization of this article is as follows. In Section II, we discuss the modified runner identification algorithm in detail. In Section III, we provide hourly runner activity monitoring results obtained from 4-month-long seismic monitoring data. In Section IV, we discuss the performance of the modified algorithm before we summarize our conclusions in Section V.
The runner detection algorithm is summarized in Algorithm 1. In this section, we give a detailed description of data preprocessing, detection, and counting, and discuss parameter selection in each step.

Data preprocessing
We load seismic data in segments of length $t_a$, band-pass filter them between 40 and 100 Hz, and calculate the normalized spectrum of the filtered data in 3-second non-overlapping windows. On these spectra, we label the thunderstorm signals based on the energy recorded in the 80 to 100 Hz frequency band. Figure 4a, 4b, and 4c show the normalized spectra obtained from band-pass filtered footstep signals, near-field traffic noise generated by motor vehicles, and spiky signals generated by thunderstorms, respectively. The spectrum generated by the thunderstorm signals (Figure 4c) has much higher energy at higher frequencies than the other two. In our field data application, if the mean of the normalized amplitudes from 80 Hz to 100 Hz is larger than 20% of the maximum amplitude, the signal is identified as generated by thunderstorms. We then calculate the envelope of the 40-100 Hz band-limited data and the corresponding normalized spectrum.

Figure 1: Basic principle of footstep detection using the spectrum analysis method proposed by Houston et al. [9]: (a) 1-minute raw data; (b) filtered data after applying a 40-100 Hz band-pass filter; (c) envelope of the band-pass filtered data displayed in (b); (d), (e), and (f) normalized spectra obtained from the footstep signal (red dashed box (I) in (c)), background noise (red dashed box (II) in (c)), and traffic noise generated by motor vehicles (red dashed box (III) in (c)), respectively.

Algorithm 1 Automatic runner and cadence identification
Input:
    raw data: d(t), with length t_a
    parameters: t_w, f_min, f_max, hf_l, hf_h, a_mn, f_n, pf_min, pf_max, ε, a_p
Output:
    runner numbers: r_num
    cadence of each runner: C, with length r_num
Initialization:
Preprocessing:
    Calculate the spectrum of each column in d_bp(t_w, N) and then normalize with maximum amplitude
Identification:
    while i < N do
        if A(i) < a_mn then
            Compute pf, sf using the ith trace of D_e
            if (pf_min < pf < pf_max) & (2 - ε < sf/pf < 2 + ε) & (D_lf(pf) > a_p) then
                label(i) = 1
                c ← pf * 60
                append c to C
                r_num++
                i += n_tw
            else
                i++
            end if
        else
            i++
        end if
    end while
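The windowing and thunderstorm screening steps described in this section can be sketched as follows. This is an illustrative reading rather than the released code: the function names are hypothetical, and mapping the Algorithm 1 parameters `hf_l`, `hf_h`, and `a_mn` to 80 Hz, 100 Hz, and 20% is an assumption based on the text. The spectrum is taken as given, so no filtering or Fourier transform is shown.

```python
def split_windows(samples, fs, t_w=3.0):
    """Cut a trace into non-overlapping windows of t_w seconds.

    Trailing samples that do not fill a whole window are dropped.
    """
    n = int(round(fs * t_w))
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def is_thunderstorm(freqs, amps, hf_l=80.0, hf_h=100.0, a_mn=0.2):
    """Flag a window whose mean normalized amplitude in [hf_l, hf_h] Hz
    exceeds a_mn (a fraction of the maximum amplitude)."""
    peak = max(amps)
    if peak <= 0:
        return False
    band = [a / peak for f, a in zip(freqs, amps) if hf_l <= f <= hf_h]
    if not band:
        return False
    return sum(band) / len(band) > a_mn


# Hypothetical spectra on a 1 Hz grid between 40 and 100 Hz.
freqs = [float(f) for f in range(40, 101)]
footstep_like = [1.0 if f == 50 else 0.05 for f in freqs]  # energy near 50 Hz
broadband = [1.0 for _ in freqs]                           # flat up to 100 Hz

windows = split_windows(list(range(3500)), fs=500)
```

With a 500 Hz sampling rate each 3-second window holds 1500 samples. The flat, broadband spectrum is flagged, while the spectrum concentrated at lower frequencies passes on to the footstep criteria.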

Footstep detection
The detection criteria are: (a) the normalized spectrum obtained from the 3-second low-pass filtered envelope should have a strong primary peak, with an amplitude larger than a predefined threshold $a_p$; (b) the primary frequency $pf$ should lie in the frequency band $pf_{min}$ to $pf_{max}$; (c) the second harmonic should exist in the normalized amplitude spectrum, and its frequency $sf$ should satisfy the constraint $2 - \varepsilon < sf/pf < 2 + \varepsilon$, where $\varepsilon$ bounds how far $sf/pf$ may deviate from 2 as a result of noise in the data. In the field data example, we assign $a_p = 0.14$. Since the cadence frequency of human footsteps is in the frequency band of 1-4 Hz [19] and the typical walking range is 0.6-2 Hz [20], we set $pf_{min}$ and $pf_{max}$ to 2 Hz and 4 Hz, respectively. Finally, we choose $\varepsilon = 0.2$, which allows $sf/pf$ to deviate from 2 by up to 0.2 due to noise from other sources in the data.
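A minimal sketch of criteria (a) to (c) applied to one envelope window is given below. Several choices here are assumptions and not taken from the text: the spectrum is computed with a naive discrete Fourier transform up to 10 Hz, the second harmonic `sf` is picked as the strongest bin above `pf_max`, and the function names and the synthetic envelope are hypothetical.

```python
import cmath
import math


def envelope_spectrum(envelope, fs, f_top=10.0):
    """Amplitude spectrum from 0 Hz to f_top, normalized by its maximum."""
    n = len(envelope)
    duration = n / fs
    n_bins = int(f_top * duration) + 1
    freqs = [k / duration for k in range(n_bins)]
    amps = []
    for k in range(n_bins):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(envelope))
        amps.append(abs(acc))
    peak = max(amps) or 1.0  # avoid dividing by zero for an all-zero window
    return freqs, [a / peak for a in amps]


def detect_footstep(freqs, amps, pf_min=2.0, pf_max=4.0, eps=0.2, a_p=0.14):
    """Apply criteria (a)-(c); return (detected, primary frequency)."""
    primary_band = [(a, f) for f, a in zip(freqs, amps) if pf_min < f < pf_max]
    harmonic_band = [(a, f) for f, a in zip(freqs, amps) if f > pf_max]
    if not primary_band or not harmonic_band:
        return False, None
    primary_amp, pf = max(primary_band)   # strongest bin in the cadence band
    _, sf = max(harmonic_band)            # assumed pick for the 2nd harmonic
    detected = primary_amp > a_p and 2 - eps < sf / pf < 2 + eps
    return detected, pf


# Synthetic 3 s envelope at 500 Hz: a constant level plus a step frequency
# of 8/3 Hz and its second harmonic.
fs = 500
step_hz = 8 / 3
envelope = [1 + 0.6 * math.cos(2 * math.pi * step_hz * i / fs)
            + 0.3 * math.cos(2 * math.pi * 2 * step_hz * i / fs)
            for i in range(3 * fs)]
freqs, amps = envelope_spectrum(envelope, fs)
detected, pf = detect_footstep(freqs, amps)
```

Because the spectrum is normalized by its maximum, which for a non-negative envelope is the zero-Hz peak, the threshold $a_p$ acts on the primary amplitude relative to that peak in this sketch.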

Runner counting
We use a single-digit binary label to denote the footstep detection in each 3-second window. Our observation shows that runners on average generate detectable seismic signals that last for 12 seconds. Therefore, when scanning through the 3-second windows, for each new footstep detection we assume that the subsequent 9 seconds of data record signals from the same runner. When a single runner is identified, the peak frequency $pf$ (in Hz) of his/her footstep signals is recorded to calculate the cadence $c$ in steps per minute as $c = 60 \times pf$. Figure 5 illustrates how the algorithm detects the footstep signals and counts the runners and their cadences. Figure 5a shows two challenging cases in which the SNR is extremely low. The first case involves overlapping footstep and motor vehicle signals with SNR = -0.29 dB, while the second case highlights a soft step with SNR = 0.52 dB. Figure 5b is the envelope of Figure 5a after a 10 Hz low-pass filter. Figure 5c and 5d are the normalized spectra obtained from the signals marked by the two red dashed boxes, respectively. The strong primary and second harmonic (marked by the red ellipses) identify these segments as footstep signals. Thus, the corresponding segments are labeled 1 (Figure 5e). Figure 5f and 5g show the corresponding runner counting vector and the cadence vector, respectively. In this 1-min recording, the algorithm detects two runners in total.
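The counting rule can be sketched as a single pass over the per-window detections, mirroring the loop in Algorithm 1. The skip of four windows is an assumption derived from the 12-second duration and the 3-second window stated above; the function name and input layout are hypothetical, and the thunderstorm screen is assumed to have been applied beforehand.

```python
def count_runners(window_pf, n_skip=4):
    """Count runners and collect their cadences.

    window_pf: one entry per 3 s window, either None (no footstep detected)
    or the primary frequency in Hz of the detected footsteps.
    n_skip: windows attributed to one runner (assumed 12 s / 3 s = 4).
    """
    cadences = []
    i = 0
    while i < len(window_pf):
        pf = window_pf[i]
        if pf is not None:
            cadences.append(60.0 * pf)  # steps per minute
            i += n_skip                 # remaining windows belong to this runner
        else:
            i += 1
    return cadences


# Hypothetical 1-minute record (20 windows) containing two runners.
window_pf = [None] * 20
window_pf[2:5] = [2.5, 2.5, 2.5]
window_pf[12:14] = [3.0, 3.0]
cadences = count_runners(window_pf)
```

Repeated detections inside the 12-second span are attributed to the same runner, so the example yields two cadences even though five windows carry a detection. Two runners passing within 12 seconds of each other would be merged by this rule.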

Long-term runner activity monitoring
On April 7, 2020, the first day of the circuit breaker (CB) lockdown in Singapore, we deployed a seismic sensor to record the footstep signals generated by runners in the West Coast Park. The seismic sensor (SmartSolo IGU-16HR 3C with 5 Hz corner frequency) was replaced for recharging each month. The sampling rate of the recorded seismic data is 500 Hz. We process all the data recorded from April 7 to August 30 and count the runners, together with their corresponding cadences, in each hour of each day. Figure 6a and 6b show the average runner numbers in each hour from Monday to Sunday over the weeks during and after CB, respectively. There were two peaks of runner activity, one in the morning from 7:00 am to 8:00 am, and the other in the evening from 6:00 pm to 7:00 pm. Runner activities were relatively stable in the morning during and after CB, whereas runner activities in the evening decreased significantly as soon as CB ended, especially on weekends. Runner activities maintained a stable weekly cycle during and after CB, with consistently more runners over the weekends than during the work days. Nonetheless, the runner count shows that the preferred workout time over the weekend changed after CB. During CB, the largest number of runners worked out on Sunday evenings. After CB, the peak running hours shifted to Saturday evenings. This shift may reflect the psychological stress that a normal working week places on urban residents.
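The hourly averages per weekday can be obtained from the timestamps of the counted runners with a small aggregation. The sketch below is one possible way to do it: the function name is hypothetical, the timestamps are made up for illustration, and averaging over an explicit list of monitored dates (so that days without detections still count) is an assumption about how the averages are defined.

```python
from collections import Counter
from datetime import date, datetime


def mean_runners_by_weekday_hour(detection_times, monitored_dates):
    """Average runner count for each (weekday, hour) pair.

    detection_times: one datetime per counted runner.
    monitored_dates: every date on which the sensor was recording.
    Weekday 0 is Monday, following datetime.weekday().
    """
    totals = Counter((t.weekday(), t.hour) for t in detection_times)
    days = Counter(d.weekday() for d in set(monitored_dates))
    return {key: n / days[key[0]]
            for key, n in totals.items() if days[key[0]] > 0}


# Made-up example: two Mondays, four runners between 7:00 and 8:00.
monitored = [date(2020, 4, 13), date(2020, 4, 20)]
detections = [datetime(2020, 4, 13, 7, 5), datetime(2020, 4, 13, 7, 20),
              datetime(2020, 4, 13, 7, 50), datetime(2020, 4, 20, 7, 30)]
hourly_mean = mean_runners_by_weekday_hour(detections, monitored)
```

Splitting `monitored_dates` and `detection_times` at the end of CB and calling the function once per period would give the two panels compared in the text.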
We collect the cadence data of the runners both for the morning runners (from 6:00 am to 12:00 pm) and the evening runners (from 4:00 pm to 10:00 pm), and show the corresponding cadence statistics in Figure 7a and 7b, respectively. The most common cadences are 160 steps/min and 180 steps/min. We assume that runners paced at 180 steps/min and above are fast, professional-equivalent runners [21,22] and compare the percentage of fast runners in the morning and in the evening (Figure 8). This comparison suggests that people running on weekday mornings on average perform better than those running in the evening. While the evening runners with high cadences stayed relatively stable during the 4-month period, morning runners with high cadences fell by about 5%.
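The morning and evening comparison amounts to computing, for each time band, the share of runners at or above 180 steps/min. A minimal sketch follows; the function name, the half-open hour intervals, and the records are assumptions for illustration only and are not data from the study.

```python
def fast_runner_share(records, start_hour, end_hour, fast_cadence=180.0):
    """Share of runners at or above fast_cadence within [start_hour, end_hour).

    records: iterable of (hour_of_day, cadence_in_steps_per_min).
    """
    selected = [c for hour, c in records if start_hour <= hour < end_hour]
    if not selected:
        return 0.0
    return sum(1 for c in selected if c >= fast_cadence) / len(selected)


# Made-up records, for illustration only.
records = [(7, 180.0), (7, 160.0), (8, 200.0), (9, 160.0),
           (18, 160.0), (19, 180.0), (20, 160.0), (21, 140.0)]
morning_share = fast_runner_share(records, 6, 12)
evening_share = fast_runner_share(records, 16, 22)
```

Tracking these two shares month by month would reproduce the kind of trend described above for high-cadence runners.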

Discussion
The adapted algorithm performs well in distinguishing footstep signals from noise generated by other sources. The open-source code we provide is efficient: it takes less than 2 minutes to read one day of seismic data from a Network Attached Storage (NAS) device and process it. Thus, we can stream the processing if the data are acquired in real time, and the algorithm has the potential to achieve real-time processing and information gathering. In addition, all the parameters in the code can be tuned according to the data.
However, when counting the runners using the adapted algorithm, a good choice of the duration of the footstep signals generated by one runner is important. We can obtain this information by observing the duration of the footstep signals in the filtered data. This parameter may vary between datasets, since the distance between the sensor and the road may be different.
We use 3-second windows to detect the footstep signals, since we find empirically that this is the minimum window size that preserves the harmonics in the spectrum of footstep signals. The temporal resolution of the algorithm is hence 3 seconds, which is sufficient for runner detection in the park, but might not be sufficient for occupancy detection in crowded areas such as malls and schools.
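The window length also bounds the frequency resolution, and hence the cadence resolution. As a worked estimate, assuming each window's spectrum comes from a discrete Fourier transform without zero-padding or peak interpolation (an assumption, since the text does not state how the peak frequency is refined), a window of length $T_w = 3$ s gives:

```latex
\Delta f = \frac{1}{T_w} = \frac{1}{3\,\mathrm{s}} \approx 0.33\ \mathrm{Hz},
\qquad
\Delta c = 60\,\Delta f = 20\ \mathrm{steps/min}
```

Under that assumption, cadence estimates would fall on a 20 steps/min grid (for example 140, 160, 180, and 200 steps/min), so a shorter window would coarsen the cadence estimate further while a longer one would reduce the temporal resolution.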

Conclusions
In this paper, we adapt the footstep detection algorithm proposed by Houston et al. [9] and apply it to seismic data recorded in an urban outdoor environment in Singapore during the COVID-19 pandemic. The adapted algorithm performs well in distinguishing the footstep signals from the signals generated by motor vehicles, frequent thunderstorms in Singapore, and the background noise. We demonstrate that the adapted algorithm identifies runners from overlapping and weak footstep signals and achieves a temporal resolution of 3 seconds. The highly efficient algorithm has the potential to process the data in real time and is suitable for long-term runner activity monitoring. Runner and cadence data in Singapore show that the COVID-19 pandemic encouraged more people to work out and that runners statistically perform better in the morning than in the evening. These results provide valuable information for residents who intend to work out for better health and performance.