
Performance Analysis of Cloud-Net with Cross-sensor Training Dataset for Satellite Image-based Cloud Detection

  • Kim, Mi-Jeong (Defense AI Technology Center, Agency for Defense Development) ;
  • Ko, Yun-Ho (Department of Mechatronics Engineering, Chungnam National University)
  • Received : 2022.02.06
  • Accepted : 2022.02.22
  • Published : 2022.02.28

Abstract

Since satellite images generally include clouds in the atmosphere, it is essential to detect or mask clouds before satellite image processing. In previous research, clouds were detected using their physical characteristics. Recently, cloud detection methods using deep learning techniques such as CNN or modified U-Net architectures from the image segmentation field have been studied. Since image segmentation is the process of assigning a label to every pixel in an image, a precise pixel-based dataset is required for cloud detection. Obtaining an accurate training dataset is more important than the network configuration in image segmentation for cloud detection. Existing deep learning techniques used different training datasets, and test datasets were extracted from the intra-dataset, i.e., acquired by the same sensor and procedure as the training dataset. These differences make it difficult to determine which network shows a better overall performance. To verify the effectiveness of a cloud detection network such as Cloud-Net, two networks were trained using the cloud dataset from KOMPSAT-3 images provided by the AIHUB site and the L8-Cloud dataset from Landsat8 images publicly released by a Cloud-Net author. Test data from the intra-dataset of the KOMPSAT-3 cloud dataset were used for validating the networks. The simulation results show that the network trained with the KOMPSAT-3 cloud dataset outperforms the network trained with the L8-Cloud dataset. Because Landsat8 and KOMPSAT-3 satellite images have different GSDs, it is difficult to achieve good results in cross-sensor validation. A network can be superior for the intra-dataset but inferior for cross-sensor data. It is necessary to study techniques that show good results on cross-sensor validation datasets in the future.

1. Introduction

Since satellite images generally include clouds in the atmosphere, it is essential to detect or mask clouds before satellite image processing. Change detection, which identifies changes such as floods, forest fires and changes of military interest, is one example of satellite image processing. However, most satellite images contain randomly distributed clouds, making it difficult to detect changes. Therefore, it is necessary to perform cloud detection before change detection.

Cloud detection techniques can be largely divided into methods using physical properties and methods using only images. Fmask is the most popular cloud detection algorithm using physical properties (Foga, 2017; Qiu, 2019). Fmask is a program that automatically detects and masks clouds, cloud shadows, snow and water. It was developed for Landsat-based and Sentinel-based satellite images. Clouds in Fmask are detected based on characteristics such as “white”, “bright”, “cold” and “high”. Therefore, auxiliary data as well as images from all bands are required. However, with the recent development of deep learning, techniques using only images of specific bands are being studied. Because cloud detection with deep learning determines the presence or absence of clouds for each pixel, techniques from the image segmentation field are being used. Not only the initial CNN algorithm but also various techniques based on the modified U-Net have been applied to cloud detection. However, obtaining an accurate training dataset is more important than the network configuration in image segmentation for cloud detection. In previous research, the QA band of Landsat8 images or publicly released, manually annotated training datasets were used to create ground truth (GT) images (Hughes, 2014; Mohajerani, 2019). Recently, cloud detection GT for KOMPSAT-3 images was released through the AIHUB site.

The KOMPSAT-3 images have a spatial resolution of 2 m, which is higher than the 30 m resolution of the Landsat8 images. The KOMPSAT-3 cloud dataset released through the AIHUB site is more diverse and accurate than other existing datasets. In this paper, the network trained with the existing Landsat8 cloud dataset is compared to the network trained with the KOMPSAT-3 cloud dataset. This makes it possible to understand the performance dependence of the network on the dataset and to identify the need for a general-purpose network that is more robust in cross-sensor validation. Chapter 2 explains general datasets and networks for cloud detection. Chapter 3 describes the dataset, network structure and loss function used in the experiment. Chapter 4 presents the experimental results and research directions based on the simulation results.

2. Related Work

Images acquired by the Landsat8 and Sentinel2 satellites are easily accessible and are widely used in deep learning research on satellite imagery. The commonly used datasets for cloud detection are the L8-Biome, L8-SPARCS and L8-Cloud datasets. The L8-Biome dataset provided by USGS is a labeled dataset of 6000*5400 Landsat8 images with three classes: cloud, thin cloud and clear. The L8-SPARCS dataset, originally created by M. Joseph Hughes (Hughes, 2014), is composed of 1000*1000 sub-scenes of Landsat8 images with five classes: cloud, cloud shadow, snow/ice, water and clear. The L8-Cloud dataset provided by a Cloud-Net author comprises 95 labeled Landsat8 images. The S2-Hollstein dataset divides images acquired by the Sentinel2 satellite into six pixel-level classes: cloud, cirrus, snow, shadow, water and clear. As mentioned in the previous chapter, a cloud dataset of KOMPSAT-3 images with thick cloud, thin cloud, cloud shadow and clear labels was recently released through the AIHUB site.

There are many reference algorithms for cloud detection. Most of them use modified versions of the U-Net architecture, such as RS-Net (Jeppesen, 2019), Seg-Net (Badrinarayanan, 2017) and Cloud-Net (Mohajerani, 2019). Each network used a different dataset to report its results and was validated in a heterogeneous manner. RS-Net used the L8-Biome and L8-SPARCS datasets to train two networks and validate each of them. Seg-Net used mixed data from the L7-Irish and L8-Biome datasets. Cloud-Net used the L8-Cloud dataset, which was manually created by the author based on Fmask. These differences make it difficult to determine which network shows a better overall performance.

3. Experimental Setting

1) Dataset

The L8-Cloud dataset and the KOMPSAT-3 dataset were used to clearly confirm the effectiveness of the network.

In order to use the Cloud-Net network, which recently showed good performance, the L8-Cloud dataset was selected among the datasets based on Landsat8 satellite images. The L8-Cloud dataset used for comparison was downloaded from the Kaggle site. It is composed of 384*384 4-channel (RGB and NIR band) images and GT with two classes: cloud and clear. The total training data were 34,701 patches.

The KOMPSAT-3 cloud dataset is composed of 130 images of about 6000*5000 pixels acquired by the KOMPSAT-3/3A satellites and the corresponding GT. Each image is a 32-bit TIFF image stacking the RGB and NIR bands. The ground truth of each image is provided as a 24-bit PNG image with thick clouds in red, thin clouds in light green, cloud shadows in yellow and clear in black. In order to make the input patches 384*384 pixels, identical to the L8-Cloud dataset's patches, each image was cut without overlap. The total training data were 24,360 patches, of which 20% were used for validation during the training process. For comparison with Cloud-Net (Mohajerani, 2019), the KOMPSAT-3 dataset's GT was processed by labeling only thick clouds as clouds, converting it into two classes as shown in Fig. 1.
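The non-overlapping patch extraction described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name is hypothetical, and how the original pipeline handled border pixels that do not fill a full patch is not specified, so trailing remainders are simply discarded here.

```python
import numpy as np

def cut_into_patches(image, patch_size=384):
    """Cut an image into non-overlapping square patches.

    Rows/columns at the right and bottom edges that do not fill a
    complete patch are discarded in this sketch.
    """
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches

# A 6000x5000 4-channel scene yields floor(6000/384) * floor(5000/384)
# = 15 * 13 = 195 patches.
scene = np.zeros((6000, 5000, 4), dtype=np.float32)
patches = cut_into_patches(scene)
```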


Fig. 1. GT transformation (a) Original GT into four classes (b) Converted GT into two classes.
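The four-class to two-class GT conversion can be sketched as below: only pixels colored as thick cloud are kept as "cloud", and every other class collapses to "clear". The exact RGB values used by the AIHUB annotations are not given in the paper, so the color codes here are assumptions for illustration only.

```python
import numpy as np

# Hypothetical RGB code for the thick-cloud class; the actual value
# used in the AIHUB GT files is not stated in the paper.
THICK_CLOUD = (255, 0, 0)  # red

def to_binary_gt(gt_rgb):
    """Keep only thick clouds as 'cloud' (1); thin clouds, cloud shadows
    and clear pixels all become 'clear' (0)."""
    mask = np.all(gt_rgb == np.array(THICK_CLOUD, dtype=gt_rgb.dtype), axis=-1)
    return mask.astype(np.uint8)

gt = np.zeros((4, 4, 3), dtype=np.uint8)
gt[0, 0] = (255, 0, 0)       # thick cloud pixel -> 1
gt[1, 1] = (144, 238, 144)   # thin cloud pixel -> dropped to 0
binary = to_binary_gt(gt)
```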

Patches were randomly used as input images after applying augmentation such as rotation, flipping and zooming. Since a network converges well with either a low batch size and low learning rate or a high batch size and high learning rate (Kandel, 2020), the batch size was set to 12 and the initial learning rate to 1e-4. The other hyperparameters were kept identical for both datasets. The input image of the network was resized to 192*192.
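The joint augmentation of patch and GT can be sketched as follows. This is only an illustration under assumed parameters: the paper does not specify the rotation angles, flip probabilities or zoom range, so this sketch uses quarter-turn rotations and 50% flips and omits zooming.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(patch, gt):
    """Apply the same random rotation (multiples of 90 degrees) and
    flips to an image patch and its ground-truth mask.
    Zooming, used in the paper, is omitted from this sketch."""
    k = int(rng.integers(0, 4))            # 0-3 quarter turns
    patch, gt = np.rot90(patch, k), np.rot90(gt, k)
    if rng.random() < 0.5:                 # horizontal flip
        patch, gt = patch[:, ::-1], gt[:, ::-1]
    if rng.random() < 0.5:                 # vertical flip
        patch, gt = patch[::-1], gt[::-1]
    return patch, gt

p, g = augment(np.zeros((384, 384, 4)), np.zeros((384, 384)))
```

Applying identical geometric transforms to image and mask is essential in segmentation: any mismatch between the two silently corrupts the pixel-wise labels.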

2) Network

Various networks including DeepLabv3 are used in the image segmentation field. In this experiment, the network used in Cloud-Net (Mohajerani, 2019) was applied as shown in Fig. 2. This network consists of a contracting arm (light green, orange), a bridge (sky blue) and an expanding path (gray) based on a residual U-Net. Unlike a typical U-Net, the skip connection adds all the results of the contracting arm through the yellow enhanced feedback block in Fig. 2. The contracting arm was constructed based on a residual block as shown in Fig. 3. The light green color in Fig. 3 refers to the basic contracting arm, and the improved contracting arm refers to the orange configuration in which two convolution blocks, batch normalization and ReLU activation are added to the contracting arm. The bridge adds dropout to a contracting arm. The total number of parameters in this network is 36,465,793.


Fig. 2. Cloud-Net network structure.


Fig. 3. The structure of contracting arm, improved contracting arm and bridge.

3) Loss

Among the region-based loss functions, the Jaccard loss function, derived from the Jaccard index, is generally used as follows.

\(\operatorname{Jacc}(y, \hat{y})=\frac{TP}{TP+FP+FN}=\frac{|y \cap \hat{y}|}{|y \cup \hat{y}|}=\frac{|y \cap \hat{y}|}{|y|+|\hat{y}|-|y \cap \hat{y}|}\)

\(\text{Jaccard Loss}(y, \hat{y})=1-\frac{\sum y \hat{y}+\alpha}{\sum y+\sum \hat{y}-\sum y \hat{y}+\alpha}\)

In order to prevent gradient explosion, a smoothing factor α was added. In order to handle the case where ∑y becomes 0, a complementary loss function is used as follows.

\(\text{Complementary Jaccard Loss}(y, \hat{y})=1-\frac{\sum(1-y)(1-\hat{y})+\alpha}{\sum(1-y)+\sum(1-\hat{y})-\sum(1-y)(1-\hat{y})+\alpha}, \quad \text{when } \sum y=0\)
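The loss, including the complementary case for cloud-free patches, can be sketched numerically as below. The value of the smoothing factor α is not stated in the paper, so α = 1 here is an assumption.

```python
import numpy as np

ALPHA = 1.0  # smoothing factor; the value used in the paper is not stated

def jaccard_loss(y, y_hat, alpha=ALPHA):
    """Soft Jaccard loss over a binary GT mask y and predicted
    probabilities y_hat. When the patch contains no cloud pixels
    (sum(y) == 0), the complementary form on the inverted masks
    is used instead."""
    y = y.astype(np.float64).ravel()
    y_hat = y_hat.astype(np.float64).ravel()
    if y.sum() == 0:
        y, y_hat = 1.0 - y, 1.0 - y_hat
    inter = (y * y_hat).sum()
    return 1.0 - (inter + alpha) / (y.sum() + y_hat.sum() - inter + alpha)

perfect = jaccard_loss(np.ones((4, 4)), np.ones((4, 4)))   # 0.0
empty = jaccard_loss(np.zeros((4, 4)), np.zeros((4, 4)))   # 0.0 via the complementary form
```

Without the complementary branch, an all-clear patch predicted perfectly would still receive a non-zero gradient driven only by the smoothing term, which is the problem the added loss addresses.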

4. Experimental Results

A DGX A100 with 128 CPU cores and 8 A100 GPUs was used to train the large amount of data. The maximum number of epochs was set to 200. As shown in Fig. 4, the loss values of both datasets converged at about epoch 100, and the convergence loss of the KOMPSAT-3 dataset was about twice that of the L8-Cloud dataset.


Fig. 4. Convergence Loss according to training dataset.

1) Intra-Dataset Validation

In order to verify the network trained with the KOMPSAT-3 dataset, test data obtained by the same sensor and procedure were used. No identical data appeared in both the training set and the test set. The cloud detection results are visually good, as shown in Fig. 5. As shown by the red circle in the right column of Fig. 5, the performance was insufficient in the details due to insufficient training. The average performance indicators for the 13 test images are shown in Table 1.

Table 1. Average Performance Index (KOMPSAT-3 Intra- Dataset)


Fig. 5. Intra-dataset experiment results of network trained by KOMPSAT-3 dataset.

In order to verify the network trained with the L8-Cloud dataset, test data obtained by the same sensor and procedure were used. The cloud detection results are visually good for the intra-dataset, as shown in Fig. 6. The average performance indicators for the 20 test images are shown in Table 2.

Table 2. Average Performance Index (L8-Cloud Intra- Dataset)


Fig. 6. Intra-dataset experiment results of network trained by L8-Cloud dataset.

2) Cross-sensor Validation

Cross-sensor validation was simulated to confirm the performance of the network that showed good performance in Cloud-Net (Mohajerani, 2019). The test data from the intra-dataset of the KOMPSAT-3 dataset were used as input to the network trained with the L8-Cloud dataset. The results are shown in Fig. 7. The performance indexes for the 13 test images are shown in Table 3. Most images showed high precision and specificity but low recall and Jaccard index values. These values indicate that the number of False Positives is small and the number of False Negatives is large: the network cannot detect the cloud class well, but it is highly trustworthy when it does.
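The relationship between these indicators and the confusion-matrix counts can be made concrete with a small sketch. The toy masks below are illustrative only, not data from the paper: an under-detecting network misses most cloud pixels (low recall and Jaccard) while the few pixels it does flag are correct (high precision) and clear pixels are rarely mislabeled (high specificity).

```python
import numpy as np

def metrics(gt, pred):
    """Pixel-wise precision, recall, specificity and Jaccard index
    for binary cloud masks (1 = cloud, 0 = clear)."""
    gt, pred = gt.astype(bool).ravel(), pred.astype(bool).ravel()
    tp = np.sum(gt & pred)     # cloud pixels correctly detected
    fp = np.sum(~gt & pred)    # clear pixels wrongly flagged as cloud
    fn = np.sum(gt & ~pred)    # cloud pixels missed
    tn = np.sum(~gt & ~pred)   # clear pixels correctly left alone
    return {
        "precision":   tp / (tp + fp),
        "recall":      tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "jaccard":     tp / (tp + fp + fn),
    }

# Under-detection: only 1 of 4 cloud pixels is found, no false alarms.
gt   = np.array([1, 1, 1, 1, 0, 0, 0, 0])
pred = np.array([1, 0, 0, 0, 0, 0, 0, 0])
m = metrics(gt, pred)  # precision 1.0, recall 0.25, specificity 1.0, jaccard 0.25
```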


Fig. 7. Cross-sensor validation experiment results of network trained by L8-Cloud dataset.

Table 3. Average Performance Index (Cross-sensor)


5. Conclusion

To verify a network for cloud detection, cross-sensor validation was performed. Cloud-Net, which recently showed good cloud detection performance, was trained with two datasets: the L8-Cloud dataset and the KOMPSAT-3 dataset. Test data from the intra-dataset of the KOMPSAT-3 dataset were used. The simulation results show that the training convergence loss of the L8-Cloud dataset is smaller than that of the KOMPSAT-3 dataset, but Cloud-Net trained with the KOMPSAT-3 dataset outperforms the network trained with the L8-Cloud dataset. Landsat8 and KOMPSAT-3 satellite images have different GSDs, and even in the same band the wavelength ranges differ, making it difficult to achieve good results in cross-sensor validation. Although a network shows superior results for its intra-dataset, it can show poor results for cross-sensor data. It is necessary to study techniques that show good results on cross-sensor validation datasets in the future.

References

  1. Badrinarayanan, V., A. Kendall, and R. Cipolla, 2017. Segnet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12): 2481-2495. https://doi.org/10.1109/TPAMI.2016.2644615
  2. Foga, S., P. L. Scaramuzza, S. Guo, Z. Zhu, R.D. Dilley Jr, T. Beckmann, G.L. Schmidt, J.L. Dwyer, M.J. Hughes, and B. Laue, 2017. Cloud detection algorithm comparison and validation for operational Landsat data products, Remote Sensing of Environment, 194: 379-390. https://doi.org/10.1016/j.rse.2017.03.026
  3. Hughes, M.J. and D.J. Hayes, 2014. Automated detection of cloud and cloud shadow in single-date Landsat imagery using neural networks and spatial post-processing, Remote Sensing, 6(6): 4907-4926. https://doi.org/10.3390/rs6064907
  4. Jeppesen, J.H., R.H. Jacobsen, F. Inceoglu, and T.S. Toftegaard, 2019. A cloud detection algorithm for satellite imagery based on deep learning, Remote Sensing of Environment, 229: 247-259. https://doi.org/10.1016/j.rse.2019.03.039
  5. Kandel, I. and M. Castelli, 2020. The effect of batch size on the generalizability of the convolutional neural networks on a histopathology dataset, ICT Express, 6(4): 312-315. https://doi.org/10.1016/j.icte.2020.04.010
  6. Mohajerani, S. and P. Saeedi, 2019. Cloud-Net: An end-to-end cloud detection algorithm for Landsat 8 imagery, Proc. of 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, Jul. 28-Aug. 2, pp. 1029-1032.
  7. Qiu, S., Z. Zhu, and B. He, 2019. Fmask 4.0: Improved cloud and cloud shadow detection in Landsats 4-8 and Sentinel-2 imagery, Remote Sensing of Environment, 231: 111205. https://doi.org/10.1016/j.rse.2019.05.024