Classification of Multi-Temporal Images using Machine Learning

Remote sensing has seen extensive research on image-based techniques, among which object-based classification has given the best results. This paper presents a new approach that combines OBIA (Object-Based Image Analysis) with supervised classification. With this approach, we aim to perform both classification and change detection over time. The data used in this study are high-resolution (3.0 m) multispectral 4-band images from 2017 and 2019, provided by the PlanetScope satellite, covering the region of Chandigarh, India. The data were pre-processed through a pipeline of steps and segmented with a multi-resolution segmentation algorithm, and the 7 classes were then classified through supervised learning using 3 algorithms: Maximum Likelihood (ML), Support Vector Machine (SVM), and Mahalanobis Distance (MD). Of the three, SVM and ML gave the highest accuracy: an Overall Accuracy of 95.21% with a Kappa Coefficient of 0.9159 for SVM, and an Overall Accuracy of 91.91% with a Kappa Coefficient of 0.8860 for ML. Altogether, this is a highly effective approach for classifying and detecting change in urban, rural, or forest areas compared with using OBIA or a pixel-based approach alone.


Introduction:
There are 3 techniques for image classification in remote sensing (Costa et al., 2017): unsupervised classification (Whyte et al., 2018), supervised classification (Costa et al., 2017; Jovanovi et al., 2010; Wieland et al., 2016), and object-based classification. Multi-temporal imagery combines multiple images of the same location taken at different points in time. The location was selected after comprehensive research so that at least 5-7 classes could be classified, depending on the geo-location. The earlier image is from April 2017 and the later one is from May 2019.
In the traditional approach, land classification (Yu et al., 2016) and change were detected with pixel-based (Whiteside et al., 2011) image analysis, which was later replaced by object-based image analysis because the pixel-based approach was less accurate and more time-consuming. In object-based image analysis (OBIA), pixels are first grouped into objects based on spectral similarity, and each object is treated as a geological unit. OBIA has two main steps: segmentation and classification. The limitation of OBIA is that it mostly needs high-resolution data to give the desired results, and the process is lengthy. Overall, the object-based approach has so far been considered one of the best options for analysing multi-temporal data such as land cover.
Considering all this, a new approach is introduced in which OBIA (Jingping et al., 2016) is used for segmentation and supervised learning is used for classification, instead of using OBIA for both purposes. For this research, the following classes were classified: Road, Urban-Area, Barren-Land, Water, Wood-Land, Grass-Land and Fields. During this process it was discovered that if 2 classes are spectrally very close to each other, like Grass-Land and Fields in this case, it is recommended to make a new class. The kappa coefficient is used to measure the accuracy of the classification done by the supervised learning algorithms. Furthermore, change over time is detected using the thematic change technique. This new method works well with high-resolution as well as medium-resolution images.
This article is a non-peer reviewed preprint uploaded at EarthArXiv, and in preparation for submission to International Journal of Remote Sensing
Study Area, Data-set and Software Used:

Study Area:
This study focuses on Chandigarh city, India, which suits this research well because of its geo-location and well-structured planning. This area was chosen because it combines urban areas, rural areas, water bodies, mountains, and agricultural areas.

Data:
The high-quality data were provided by Planet (Planet Team, 2017) via Planet's Education and Research Program. One image is dated 11 April 2017 and the second 4 May 2019. All the images are of high quality (3 m, 4-band; see Table 1 for more detail), but the images must first be merged and resized to cover the desired location. To resize both images identically, a shapefile containing the coordinates of the location is used, as shown in Figure 1.

Software:
GIS software (Verbeeck et al., 2011) is used for all image pre-processing tasks. In addition, eCognition Developer 9.0, which is developed specifically for OBIA, is used for the OBIA step, and ENVI 5.1 is used for creating the ROIs (regions of interest) that serve as the training data set for the machine learning algorithms and for thematic change.

Methodology
Before going into technical depth, note that the methodology is divided into 3 parts, each with further subparts, as shown in Figure 2. Also note that a 4-band satellite image (blue, green, red, NIR) is required for this approach to reproduce the output of this research.

Pre-processing Image
A pipeline is created for pre-processing: the image tiles are first merged by mosaicking and then resized (clipped) using the shapefile (Whyte et al., 2018; Zhong et al., 2019). To make the shapefile, the coordinates of the desired location are pin-pointed according to user preference, so that both images are resized to the same extent, as shown in Figure 3.
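The resizing step can be illustrated with a minimal sketch, assuming a north-up raster described by its top-left origin and a square pixel size (3.0 m here). The function names, origins, and bounding box are hypothetical stand-ins for the shapefile extent and are not taken from the GIS software used in this study. The sketch also assumes the bounding box lies fully inside each raster.

```python
import math

def clip_window(origin_x, origin_y, pixel_size, bbox):
    """Return (row_start, row_stop, col_start, col_stop) covering bbox.

    origin_x, origin_y: map coordinates of the raster's top-left corner.
    pixel_size: ground size of one pixel in map units (assumed 3.0 m here).
    bbox: (min_x, min_y, max_x, max_y) of the area of interest.
    """
    min_x, min_y, max_x, max_y = bbox
    col_start = math.floor((min_x - origin_x) / pixel_size)
    col_stop = math.ceil((max_x - origin_x) / pixel_size)
    # Rows count downwards from the top edge, so max_y gives the first row.
    row_start = math.floor((origin_y - max_y) / pixel_size)
    row_stop = math.ceil((origin_y - min_y) / pixel_size)
    return row_start, row_stop, col_start, col_stop

def clip(raster, window):
    """Cut a 2D list-of-lists raster down to the given pixel window."""
    r0, r1, c0, c1 = window
    return [row[c0:c1] for row in raster[r0:r1]]

# Hypothetical example: two images with different origins, one shared extent.
BBOX = (30.0, 60.0, 120.0, 240.0)
window_2017 = clip_window(0.0, 300.0, 3.0, BBOX)
window_2019 = clip_window(-30.0, 330.0, 3.0, BBOX)
```

Because both windows are derived from the same bounding box, the two clipped images end up with the same number of rows and columns even though their pixel offsets differ, which is what makes the later pixel-by-pixel comparison possible.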

OBIA
In this research, segmentation (Hay et al., 2005; Whyte et al., 2018) cannot be performed directly, because a couple of tasks must be performed beforehand, such as image layer mixing using histogram equalization with a three-layer mix; otherwise the colours of the image appear dull. Once this step is completed, the next task is to define the ruleset for segmentation by creating a process tree, as shown below.
In the process tree, select the multi-resolution segmentation algorithm. First, set the image object domain to pixel level, as there are no objects yet. Second, in the segmentation settings, set the image layer weights for Blue:Green:NIR:Red to 1:1:2:1. The last hyperparameter is the composition of the homogeneity criterion (Castillejo-gonzález et al., 2014; Whiteside et al., 2011), where shape is set to 0.2 and compactness (Verbeeck et al., 2011) to 0.8. Finally, export the image with all 4 bands as a .tif file. The process tree is now ready to run and produces a segmented image, converted from pixels to objects, as shown in Figure 4.
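To make the role of these settings concrete, the sketch below follows the commonly cited Baatz and Schäpe formulation of the multi-resolution merge cost. It is an assumption that this formulation matches the software's internal behaviour, since the source does not state it. The function names and pixel values are hypothetical; only the band weights (1:1:2:1), shape (0.2), and compactness (0.8) come from the text.

```python
from statistics import pstdev

# Settings taken from the text; everything else here is illustrative.
BAND_WEIGHTS = {"blue": 1.0, "green": 1.0, "nir": 2.0, "red": 1.0}
SHAPE = 0.2
COMPACTNESS = 0.8

def colour_change(obj1, obj2, band_weights=BAND_WEIGHTS):
    """Weighted increase in spectral heterogeneity if two objects merge.

    obj1, obj2: dicts mapping band name -> list of pixel values.
    """
    total = 0.0
    for band, weight in band_weights.items():
        v1, v2 = obj1[band], obj2[band]
        merged = v1 + v2
        total += weight * (
            len(merged) * pstdev(merged)
            - (len(v1) * pstdev(v1) + len(v2) * pstdev(v2))
        )
    return total

def fusion_cost(d_colour, d_compact, d_smooth,
                shape=SHAPE, compactness=COMPACTNESS):
    """Combine colour and shape heterogeneity changes into one merge cost."""
    # Compactness and smoothness share the shape term (0.8 / 0.2 here).
    d_shape = compactness * d_compact + (1.0 - compactness) * d_smooth
    # Colour gets the remaining weight (1 - shape = 0.8 here).
    return (1.0 - shape) * d_colour + shape * d_shape

# Hypothetical objects that differ only in the NIR band.
obj_a = {"blue": [5, 5], "green": [5, 5], "nir": [0, 0], "red": [5, 5]}
obj_b = {"blue": [5, 5], "green": [5, 5], "nir": [2, 2], "red": [5, 5]}
d_colour = colour_change(obj_a, obj_b)
```

With a shape value of 0.2, spectral (colour) heterogeneity carries 80% of the merge cost, and doubling the NIR weight makes differences in that band count twice as much as the same difference in a visible band.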

Classification
In earlier studies, classification was done within OBIA; here, machine learning (Chen et al., 2018; Wieland et al., 2016) is incorporated to handle this part. The segmented image is taken as input and loaded into ENVI as a raster file. Next, create the ROI (region of interest) file, which serves as the training data for the machine learning algorithms and contains the 7 specified classes, as shown in Table 2. While specifying classes (Gu et al., 2015; Yu et al., 2016), use polygon as the ROI type when choosing pixels from the image to get better training data. Once the ROIs are done, the next part is classification. ENVI is used to run the supervised algorithm on both images, which takes over the classification that OBIA performed in previous studies. The input file and the ROI file must be passed together when choosing any supervised algorithm. Then select all the classes to be classified, leave the hyperparameters at their defaults, and start the classification. The computation takes from 10 minutes to an hour depending on the size of the image and the machine used. During this research, it was tested on i5- and i7-powered machines with 8 GB of RAM.
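The idea of training a supervised classifier from ROI pixels can be sketched as follows. This is a simplified Gaussian maximum likelihood classifier that assumes the 4 bands are independent (a diagonal covariance), whereas the standard formulation uses a full covariance matrix per class; it is not the ENVI implementation. The function names and reflectance values are hypothetical.

```python
import math
from statistics import mean, pvariance

def train(roi_samples):
    """Estimate per-class band means and variances from ROI pixels.

    roi_samples: dict class_name -> list of (blue, green, red, nir) tuples.
    """
    model = {}
    for name, pixels in roi_samples.items():
        bands = list(zip(*pixels))
        means = [mean(b) for b in bands]
        # A small floor avoids division by zero for perfectly uniform samples.
        variances = [max(pvariance(b), 1e-6) for b in bands]
        model[name] = (means, variances)
    return model

def log_likelihood(pixel, means, variances):
    """Log of the Gaussian density, summed over independent bands."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(pixel, means, variances)
    )

def classify(pixel, model):
    """Assign the class whose Gaussian gives the pixel the highest likelihood."""
    return max(model, key=lambda name: log_likelihood(pixel, *model[name]))

# Hypothetical training pixels (made-up reflectances) for two classes.
rois = {
    "Water": [(0.06, 0.07, 0.05, 0.02), (0.05, 0.06, 0.04, 0.03),
              (0.07, 0.08, 0.05, 0.02)],
    "Wood land": [(0.03, 0.05, 0.04, 0.40), (0.04, 0.06, 0.03, 0.45),
                  (0.03, 0.05, 0.04, 0.38)],
}
model = train(rois)
label = classify((0.06, 0.06, 0.05, 0.03), model)
```

Drawing the ROIs as polygons, as recommended above, gives each class many pixels, which makes the estimated means and variances more stable than a handful of single-pixel samples would.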

Table 2
Classes used in this paper. Overview of the ROI classes used for the classification, with a brief description of each class.

ROI class | Class description
Wood land | Non-wetland that contains a large presence of medium to large trees
Urban land | Area with artificial surfaces such as houses and buildings
Mountain | Landform that rises above, or is elevated from, the surrounding land
Field | Non-wetland class where farming is present
Grassland | Non-wetland with small grass or bushes
Water | Exposed fresh or saline surface water
Roads | Artificial surfaces for travel, such as roads, foot-paths and highways
Barren land | Non-wetland bare land with no or very little vegetation cover

If visual inspection of the classified image does not show the desired results, it is recommended to start with less training data and then increase it as needed, since this is entirely a trial-and-inspection process. The classified image with the desired output is shown in Figure 5; once it is obtained, move on to the next step.

To extract the data from the classified image, it has to be converted from raster to vector (Jovanovi et al., 2010), which creates a database file (.vcf). This step takes both time and computation power: around 2 to 8 hours depending on the size of the image, since the larger the area, the more time it consumes.

Change detection
Now take the .vcf file, whose headers are class_name, class_id, parts, length, and area. Make a new filtered database file with the top 100 entries from each class to observe the changes that occurred in the area. For this task, an Excel sheet was used to sort and filter the 100 largest entries of each class by length and area. Once this task is finished, there are 2 database files, one for each image. Before proceeding further, another ROI must be created as ground truth (Whyte et al., 2018; Wieland et al., 2016) for use in accuracy assessment. After creating this 2nd ROI, reconcile it with the ROI map on both classified images. Then go to Classification in ENVI and select Post Classification (Hay et al., 2005; Jingping et al., 2016), then Confusion Matrix, and then Using Ground Truth ROIs; the software then calculates the overall accuracy and the kappa coefficient (Verbeeck et al., 2011).
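The filtering step described at the start of this section was done in Excel; an equivalent can be sketched in a few lines, assuming the database rows have been loaded as dictionaries with the headers listed above. The function name and the sample records are hypothetical, and ordering by area first and length second is an assumption about how the two columns were combined.

```python
from collections import defaultdict

def top_n_per_class(records, n=100):
    """Keep the n largest entries of each class, ordered by area then length."""
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec["class_name"]].append(rec)
    return {
        name: sorted(rows, key=lambda r: (r["area"], r["length"]),
                     reverse=True)[:n]
        for name, rows in by_class.items()
    }

# Hypothetical rows using the headers named in the text.
records = [
    {"class_name": "Water", "class_id": 1, "parts": 1, "length": 10.0, "area": 50.0},
    {"class_name": "Water", "class_id": 1, "parts": 1, "length": 30.0, "area": 90.0},
    {"class_name": "Water", "class_id": 1, "parts": 1, "length": 20.0, "area": 70.0},
    {"class_name": "Field", "class_id": 2, "parts": 1, "length": 5.0, "area": 20.0},
]
top = top_n_per_class(records, n=2)
```

Running the same function on the database file of each image gives the 2 filtered files that are compared afterwards.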
In the last step, to observe the change detection (Jovanovi et al., 2010), provide the classified images from both dates and calculate the thematic change between them.
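Conceptually, thematic change is a pixel-by-pixel cross-tabulation of the two classified images. The sketch below illustrates that idea on tiny hypothetical class maps; the function names are illustrative and this is not the ENVI tool itself. It assumes both maps cover the same extent at the same 3.0 m pixel size.

```python
from collections import Counter

def thematic_change(map_t1, map_t2):
    """Count pixels for every (class at time 1, class at time 2) pair."""
    if len(map_t1) != len(map_t2):
        raise ValueError("both classified maps must have the same shape")
    counts = Counter()
    for row1, row2 in zip(map_t1, map_t2):
        if len(row1) != len(row2):
            raise ValueError("both classified maps must have the same shape")
        for before, after in zip(row1, row2):
            counts[(before, after)] += 1
    return counts

def changed_area_m2(counts, pixel_size=3.0):
    """Area of each real class transition, skipping unchanged pixels."""
    return {
        pair: n * pixel_size ** 2
        for pair, n in counts.items()
        if pair[0] != pair[1]
    }

# Hypothetical 2 x 2 class maps for the two dates.
map_2017 = [["Field", "Field"], ["Water", "Grassland"]]
map_2019 = [["Urban land", "Field"], ["Water", "Field"]]
transitions = thematic_change(map_2017, map_2019)
areas = changed_area_m2(transitions)
```

Each off-diagonal pair, such as Field to Urban land, is one type of change, and multiplying its pixel count by the pixel area converts it into ground area.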

Results
The results suggest that this novel approach for classification is promising, first of all because the data are pre-processed by setting certain parameters and weights for the image bands (Jingping et al., 2016) before segmentation. This process gives an overall accuracy above 90% for both images, with a kappa coefficient of 0.915 for SVM (Support Vector Machine) (Ma et al., n.d.; Wieland et al., 2016) on the May image and 0.886 for ML (Maximum Likelihood) on the April image, as shown in Table 3. Both of these algorithms gave the best results in this research. SVM works well for the urban and wood land classes, while ML is good at classifying road and grassland. Put another way, the data are the cake and the machine learning algorithm is the icing on it. This approach takes less time than the general approach in which classification is done in OBIA, and it also gives high accuracy. The change detected over time is shown in Table 4, and the thematic change in Figure 6.
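For reference, overall accuracy and the kappa coefficient are both derived from a confusion matrix of ground-truth versus predicted classes. The sketch below shows the standard calculation; the function name and the 2 x 2 matrix are hypothetical and are not the values behind Table 3.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of pixels of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical confusion matrix for two classes (100 ground-truth pixels).
overall, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa is lower than overall accuracy because it subtracts the agreement that would be expected by chance, which is why the two figures are reported together.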

Discussion
From past studies, it is already known that the object-based approach provides superior results over the pixel-based approach, mainly because it allows textural and spatial features to be incorporated (Gu et al., 2015). The approach developed here not only gives better results than the traditional pixel-based approach but also performs competitively with the object-based approach. Moreover, the multi-resolution algorithm (Figure 4), which plays an important role in the object-based approach, cannot be ignored. We therefore decided to keep the multi-resolution algorithm (Hay et al., 2005; Whyte et al., 2018) because of its distinct benefits, although the software that provides it, eCognition Developer 9.0 (Jingping et al., 2016; Ma et al., n.d.), is quite expensive. Open-source or more economical options exist, but they involve a trade-off, as the algorithm will perform differently and may not give the desired result.
After performing the OBIA segmentation, we decided to replace the traditional classification process with machine learning (Figure 2). The challenge was how to replace it and get the whole workflow working. Accordingly, the rule-set definition used for classification in the object-based approach was replaced by ROIs (Table 2), in which the training data were collected and assigned to the classes of interest. The recommendation here is to start with a small amount of training data and adjust it for the classes that do not perform well. Also remember that classes may conflict when they are classified by the machine learning algorithm. For instance, at first we kept only field as a class and did not include grassland or barren land. As a result, such land was either classified as field or left unclassified, even though it should have been classified as grassland or barren land. To overcome this problem, grassland and barren land were added as classes as well. Once the classification process is over, check visually that the algorithm gives the desired results.
If the algorithm performs well, proceed to the accuracy assessment. If not, increase the training data for the classes that are not performing well. The next question is how the machine learning algorithms used in this research were selected. After comprehensive exploration (Costa et al., 2017; Wieland et al., 2016), we decided to work with Support Vector Machine and Maximum Likelihood (Costa et al., 2017; Yu et al., 2016), and both performed well (Figure 5). It was found that SVM gives good results for classes like Urban-land, Fields, and Water, while Maximum Likelihood gives good results for small classes like road and wood land. The key feature that stands out in this approach is the classification part, in which the model does well with a small amount of training data. As for the data (i.e. image quality), any medium-quality (Mitkari et al., 2017; Yu et al., 2016) to high-quality image will work with this approach. If high-resolution (Hay et al., 2005; Verbeeck et al., 2011) data cannot be obtained, use Sentinel-2 (Whyte et al., 2018) data rather than Landsat-2 (Wieland et al., 2016). This study was done mainly with India in mind, as India is a developing nation, so the Chandigarh area was selected for this research.
As for accuracy, the approach achieved high overall accuracy and kappa coefficient values (Table 3). The study did not stop there: change between the two images was also detected by performing thematic change (Jovanovi et al., 2010) (Table 4 & Figure 6). This approach is new and promising and should help future research. All in all, it opens new avenues to explore beyond using OBIA alone.

Conclusion
This is a novel approach introduced after the object-based and pixel-based approaches. Its main aim is to get more accurate results that benefit classification and the monitoring of change, and thereby help the remote sensing field. The object-based approach already gave great results but is mostly limited to high-resolution data. The approach presented here removes this limitation and saves both resources and time. This study was mostly performed in comparison with the object-based approach only, because the object-based approach is more accurate than the pixel-based approach. Future research can experiment with different ML algorithms and even apply deep-learning algorithms. Some deep-learning research is already underway, but it will take time to achieve the accuracy that this approach and the object-based approach provide.