Stacked Autoencoder Based Feature Extraction and Superpixel Generation for Multifrequency PolSAR Image Classification

Gadhiya, Tushar; Tangirala, Sumanth; Roy, Anil K.

doi:10.1007/978-3-030-34872-4_37

Tushar Gadhiya¹⁴,
Sumanth Tangirala¹⁴ &
Anil K. Roy¹⁴

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 11942))

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

1277 Accesses
1 Citations

Abstract

In this paper we are proposing classification algorithm for multifrequency Polarimetric Synthetic Aperture Radar (PolSAR) image. Using PolSAR decomposition algorithms 33 features are extracted from each frequency band of the given image. Then, a two-layer autoencoder is used to reduce the dimensionality of input feature vector while retaining useful features of the input. This reduced dimensional feature vector is then applied to generate superpixels using simple linear iterative clustering (SLIC) algorithm. Next, a robust feature representation is constructed using both pixel as well as superpixel information. Finally, softmax classifier is used to perform classification task. The advantage of using superpixels is that it preserves spatial information between neighboring PolSAR pixels and therefore minimizes the effect of speckle noise during classification. Experiments have been conducted on Flevoland dataset and the proposed method was found to be superior to other methods available in the literature.

You have full access to this open access chapter, Download conference paper PDF

PolSAR image classification based on Laplacian Eigenmaps and superpixels

Article Open access 21 November 2017

Research on hyper-spectral remote sensing image classification by applying stacked de-noising auto-encoders neural network

Article 15 March 2021

Hyperspectral Image Classification Using Denoised Stacked Auto Encoder-Based Restricted Boltzmann Machine Classifier

Keywords

1 Introduction

Synthetic Aperture Radar (SAR) has been popularized in recent years as a technique that captures high resolution microwave images of the earth surface. With the SAR technique, an image can be taken regardless of weather conditions or time of the day unlike optical sensors. The other major reason to use SAR is its operability over multiple frequency bands, viz, from the X band to P band. The penetrability of the L and the P bands allows SAR to capture data from even below ground level. In case of PolSAR it draws information of the target in four polarization states which makes it an information rich technique. These are some of the reasons that establish the superiority of SAR over optical data capturing techniques.

One of the very early approaches of classification of multifrequency PolSAR data was attempted using DNN (Dynamic Neural Network) [1]. Deep learning based multiplayer autoencoder network was also proposed [2]. It uses Kronecker product of eigenvalues of coherency matrix to combine multiple bands information. Recently, an Optimized Wishart Network (OWN) for classification of multifrequency PolSAR data was reported [3].

Superpixel algorithms in conjunction with deep neural networks have gained popularity for capturing spatial information of a PolSAR image. Hou et al. presented [4] a way of using Pauli decomposition of PolSAR image to generate superpixels and autoencoder to extract features from the coherency matrix of each PolSAR pixel. Prediction of the network was then used to run a KNN (k nearest neighbors) algorithm in each superpixel to determine the class of the complete superpixel. Guo et al. introduced a method to apply Fuzzy clustering algorithm over PolSAR images to generate superpixels [5]. This method considered only those pixels which are similar to their neighbors, while pixels which are in all probability badly conditioned were ignored. In another work [6] Cloude decomposition features were used to generate superpixels and CNN (Convolutional Neural Network) to perform the classification. Adaptive nonlocal approach for extracting spatial information was also proposed [7]. It uses stacked sparse autoencoder to extract robust features.

In this paper we propose a classification algorithm for multifrequency PolSAR images. It is organized as follows: Sect. 2 discusses the proposed network architecture; Sect. 3 explains the experiments conducted on the Flevoland dataset; Sect. 4 concludes with our observations and the discussions based on the results.

2 Proposed Methodology

Phase information is very useful for PolSAR image classification. Since we are using a real valued neural network it may not be able to relate amplitude and phase information. It considers the real and imaginary components as separate features. This brings the need of decomposition features to the network instead of the raw features of coherency matrix. We have created 33 dimensional feature vector extracted from one frequency band of a PolSAR image as shown in Table 1. Hence combining information of all three bands we get 99 dimensional feature vector corresponding to each PolSAR pixel.

Freeman decomposition [8] has 3 features which describe the power of single-bounce(odd-bounce), double-bounce and volume scattering of the incident waves. The Huynen decomposition [9] aims to form a single scattering matrix to model the scattering mechanism of the surface. The Cloude decomposition [10] aims to model the surface scattering using the parameters such as Entropy, Anisotropy and other angles which describe the scattering mechanism. The Krogager decomposition [11] aims to factorize the scattering matrix as the combination of a sphere, a di-plane and a helix. The Yamaguchi decomposition [12] adds a Helix scattering component to the Freeman decomposition to model complicated man-made structures.

Table 1. List of extracted features

Full size table

As shown in Fig. 1 our proposed network architecture contains three modules. The first module is a two layer autoencoder network. The purpose of this module is to reduce the dimensionality of the input vector by learning efficient representation of combined PolSAR frequency bands information.

Let $\mathbf {X_i}$ be the input feature vector of $i^{th}$ PolSAR pixel. Let $\mathbf {W_{11}}, \mathbf {W_{12}}, \mathbf {b_{11}}$ and $\mathbf {b_{12}}$ be weights and biases of two encoder layers. A hidden representation of input feature vector $\mathbf {X_i}$ can be calculated as $\mathbf {h_i} = f(\mathbf {W_{12}}^Tf(\mathbf {W_{11}}^T\mathbf {X_i} + \mathbf {b_{11}})+\mathbf {b_{12}})$, where f is a tanh activation function. Let $\mathbf {W_{21}}, \mathbf {W_{22}}, \mathbf {b_{21}}$ and $\mathbf {b_{22}}$ be weights and biases of two decoder layers. A reconstructed input vector can be calculated as $\mathbf {X_i}^{'} = f(\mathbf {W_{22}}^Tf(\mathbf {W_{21}}^T\mathbf {h_i} + \mathbf {b_{21}})+\mathbf {b_{22}})$. We have used mean square error to train the autoencoder. The cost function of first module of the proposed network is given as follows:

$$\begin{aligned} \begin{aligned} J_1&= \beta (\mathbf {\Vert W_{11} \Vert ^2_2 + \Vert W_{12} \Vert ^2_2 + \Vert W_{21} \Vert ^2_2 + \Vert W_{22} \Vert ^2_2}) \\&+\,\alpha \frac{1}{N} \sum _{i=1}^{N} \Vert \mathbf {X_i - X_i}^{'}\Vert ^2 +\gamma \sum _{t=1}^{U_1} KL(\rho \Vert \rho _t). \end{aligned} \end{aligned}$$

(1)

Here $\beta $ is a regularization parameter, $\alpha $ is the learning rate, $\gamma $ is the sparsity parameter, N is total number of training samples, $U_1$ is a size of reduced dimensional feature vector $\mathbf {h_i}, \rho $ is a sparsity parameter, $\rho _t$ is the average activation value of the $t^{th}$ hidden unit and $KL(\rho \Vert \rho _t)$ is a Kullback-Leibler divergence which encourages sparsity in the hidden representation $\mathbf {h_i}$. For our application we have set the value of $U_1=5$. Once the training of the first module is complete we disconnect the decoder layers.

Next, we feed the entire PolSAR image as an input to this network to obtain hidden representations of all pixels of the PolSAR image. We will use this hidden representation of all pixels of the PolSAR image to generate superpixels. This process has two advantages: (i) it contains feature information of all bands and (ii) its dimensionality is substantially reduced in comparison to the input feature vector. To generate superpixels we have used an algorithm similar to simple linear iterative clustering (SLIC) [14]. Instead of giving an RGB image as input, we will give the hidden representation of PolSAR image obtained from using first module of the proposed network as input. Using SLIC we measure the distance between any two pixels which is given by Eq. 2:

$$\begin{aligned} D = \frac{m}{s}D_s + D_h, \end{aligned}$$

(2)

where $D_s = \sqrt{(x_i-x_j)^2+(y_i - y_j)^2}$ and $D_h = ||\mathbf {h_i}-\mathbf {h_j} ||_2$. Here m is a parameter controlling the relative weight between $D_s$ and $D_h$ and s is a size of search space [14]. $(x_i, y_i)$ and $(x_j, y_j)$ are the positions of $i^{th}$ and $j^{th}$ PolSAR pixels on Euclidean plane. This distance measure was finally used for superpixel generation.

The second module of our proposed architecture combines each pixel and corresponding superpixel information to construct robust feature vector. This is done by letting $\mathbf {S_j}$ be the $j^{th}$ superpixel and $\mathbf {h_i} \in \mathbf {S_j}$. Let $\mathbf {c_j}$ be the cluster center of $\mathbf {S_j}$. To extract robust feature using both pixel and superpixel information input for second autoencoder $\mathbf {H_i}$ can be constructed as $\mathbf {H_i} = [\mathbf {h_i}; \mathbf {c_j}]$ [7]. Since the dimensionality of the hidden representation $h_i$ is low, a single layer autoencoder is sufficient for an effective reconstruction. Let $\mathbf {r_i} = f(\mathbf {W_{13}}^T\mathbf {H_i} + \mathbf {b_{13}})$ be the activation value obtained at hidden layer of second autoencoder. Let $\mathbf {H_i^{'}} = f(\mathbf {W_{23}}^T\mathbf {r_i} + \mathbf {b_{23}})$ be the reconstructed input. The cost function for the second autoencoder can now readily be described by:

$$\begin{aligned} \begin{aligned} J_2 = \beta (\Vert \mathbf {W_{13}} \Vert ^2_2 + \Vert \mathbf {W_{23}} \Vert ^2_2) + \lambda \frac{1}{N} \sum _{i=1}^{N} \Vert \mathbf {H_i} - \mathbf {H_i}^{'}\Vert ^2&+ \alpha \sum _{t=1}^{U_2} KL(\rho \Vert \rho _t). \end{aligned} \end{aligned}$$

(3)

Here $U_2$ is the size of feature vector $\mathbf {r_i}$. Once the training of the second module of the autoencoder is complete we again disconnect the decoder layer. Output of the second module of autoencoder contains both pixel and superpixel information. Finally we use softmax classifier to obtain predicted probability distribution.

3 Experiments

With this theoretical model we conduct the following experiments. We start with the details of dataset chosen for our experiments. After that we analyze the performance of the proposed network for different band combinations. Finally we will perform the analysis of performance for different feature decompositions. The complete experiment on the proposed network architecture was implemented using Python 3.6 and it was executed on a 1.60 GHz machine with 8 GB RAM for all the experiments.

Experiments have been conducted on a dataset of Flevoland [15], an agricultural tract in the Netherlands, whose data was captured by the NASA/Jet Propulsion Laboratory on 15 June, 1991. This dataset has often been viewed as the benchmark dataset for PolSAR applications. The intensities after Pauli decomposition of the dataset have been used to form an RGB image as shown in Fig. 2(a–c). The ground truth of the data set shown in Fig. 2(d) identifies a total of 15 classes of land cover.

Table 2. Class-wise accuracies for different band combinations

Full size table

We have evaluated the accuracy of each class with respect to all possible combinations of the data acquired in the three frequency bands. Please note that all the overall accuracy mentioned in the paper is the precision. Figure 3 shows classification maps obtained by the proposed method and Table 2 shows classwise accuracies for all possible band combinations. From Table 2 it can be observed that while the network with just the C band information recognizes Onions and Lucerne with high accuracy, it fails in the case of classes such as Beet or Oats. While in the case of the C band, the network fails to recognize Wheat accurately. It can be noted that the lack of information to recognize Wheat is compensated by the help of the L band. In the case of the P band, the network fails to accurately identify Onions and Peas but correctly identifies the majority of the Rapeseed and Fruit. It can also be observed that the L band performs better individually than the C or the P bands.

A total of six PolSAR decomposition methods have been applied to the PolSAR data and their features have been given as input to the proposed network. The significance of each decomposition technique is evident from the Table 3. It can be seen that the Krogager and Freeman decompositions fail to provide enough information to recognize Beet and Onions. On the other hand Cloude decomposition and Huynen decomposition provide sufficient information for these crops respectively.

Table 3. Class-wise accuracies for different features

Full size table

The proposed method for classification of multifrequency PolSAR image is also compared with other methods available in the literature. Comparison of overall accuracies is reported in Table 4. ANN [2] and OWN [3] have used small subset of Flevoland dataset containing 7 classes. On the other hand Stein-SRC [16] used ground truth with 14 classes. For parity in comparison with our results we have calculated the overall accuracy of the proposed method using ground truth of 7, 14 as well as 15 classes. We observe from Table 4 that the proposed method outperforms all the three methods [2, 3, 16].

Table 4. Classification’s overall accuracy comparison

Full size table

4 Conclusion

In this paper a classification network for multifrequency PolSAR data is proposed. Proposed network involves three modules. First module reduces the dimensionality of the input feature vector. Output of the first module has been used to generate superpixels. The second module constructs robust feature vector using each pixel and its corresponding superpixel information. Finally the last module of the proposed network conducts classification using softmax classifier. It is observed that combining multiple frequency bands information improves overall classification accuracy. We validated our proposed network on Flevoland dataset resulted in 99.59% overall classification accuracy. Experimental result shows that this proposed network outperforms other reported methods available in the literature.

References

Chen, K.S., Huang, W.P., Tsay, D.H., Amar, F.: Classification of multifrequency polarimetric SAR imagery using a dynamic learning neural network. IEEE Trans. Geosci. Remote Sens. 34, 814–820 (1996)
Article Google Scholar
De, S., Ratha, D., Ratha, D., Bhattacharya, A., Chaudhuri, S.: Tensorization of multifrequency PolSAR data for classification using an autoencoder network. IEEE Geosci. Remote Sens. Lett. 15, 542–546 (2018)
Article Google Scholar
Gadhiya, T., Roy, A.K.: Optimized wishart network for an efficient classification of multifrequency PolSAR data. IEEE Geosci. Remote Sens. Lett. 15, 1720–1724 (2018)
Article Google Scholar
Hou, B., Kou, H., Jiao, L.: Classification of polarimetric SAR images using multilayer autoencoders and superpixels. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 9, 3072–3081 (2016)
Article Google Scholar
Guo, Y., Jiao, L., Wang, S., Wang, S., Liu, F., Hua, W.: Fuzzy superpixels for polarimetric SAR images classification. IEEE Trans. Fuzzy Syst. 26, 2846–2860 (2018)
Article Google Scholar
Hou, B., Yang, C., Ren, B., Jiao, L.: Decomposition-feature-iterative-clustering-based superpixel segmentation for PolSAR image classification. IEEE Geosci. Remote Sens. Lett. 15, 1239–1243 (2018)
Article Google Scholar
Hu, Y., Fan, J., Wang, J.: Classification of PolSAR images based on adaptive nonlocal stacked sparse autoencoder. IEEE Geosci. Remote Sens. Lett. 15, 1050–1054 (2018)
Article Google Scholar
Freeman, A., Durden, S.L.: A three-component scattering model for polarimetric SAR data. IEEE Trans. Geosci. Remote Sens. 36, 963–973 (1998)
Article Google Scholar
Huynen, J.R.: Phenomenological theory of radar targets. Electronmagnetic Scattering (1970)
Google Scholar
Cloude, S.R., Pottier, E.: An entropy based classification scheme for land applications of polarimetric SAR. IEEE Trans. Geosci. Remote Sens. 35, 68–78 (1997)
Article Google Scholar
Krogager, E.: New decomposition of the radar target scattering matrix. Electron. Lett. 26, 1525–1527 (1990)
Article Google Scholar
Yamaguchi, Y., Moriyama, T., Ishido, M., Yamada, H.: Four-component scattering model for polarimetric SAR image decomposition. IEEE Trans. Geosci. Remote Sens. 43, 1699–1706 (2005)
Article Google Scholar
Zhou, Y., Wang, H., Xu, F., Jin, Y.: Polarimetric SAR image classification using deep convolutional neural networks. IEEE Geosci. Remote Sens. Lett. 13, 1935–1939 (2016)
Article Google Scholar
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34, 2274–2282 (2012)
Article Google Scholar
Dataset: AIRSAR, NASA 1991. Retrieved from ASF DAAC on 7 December 2018
Google Scholar
Yang, F., Gao, W., Xu, B., Yang, J.: Multi-frequency polarimetric SAR classification based on riemannian manifold and simultaneous sparse representation. Remote Sens. 7(7), 8469–8488 (2015)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, India
Tushar Gadhiya, Sumanth Tangirala & Anil K. Roy

Authors

Tushar Gadhiya
View author publications
You can also search for this author in PubMed Google Scholar
Sumanth Tangirala
View author publications
You can also search for this author in PubMed Google Scholar
Anil K. Roy
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tushar Gadhiya .

Editor information

Editors and Affiliations

Tezpur University, Tezpur, India
Bhabesh Deka
Indian Statistical Institute, Kolkata, India
Pradipta Maji
Indian Statistical Institute, Kolkata, India
Sushmita Mitra
Tezpur University, Tezpur, India
Dhruba Kumar Bhattacharyya
Indian Institute of Technology Guwahati, Guwahati, India
Prabin Kumar Bora
Indian Statistical Institute, Kolkata, India
Sankar Kumar Pal

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Gadhiya, T., Tangirala, S., Roy, A.K. (2019). Stacked Autoencoder Based Feature Extraction and Superpixel Generation for Multifrequency PolSAR Image Classification. In: Deka, B., Maji, P., Mitra, S., Bhattacharyya, D., Bora, P., Pal, S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2019. Lecture Notes in Computer Science(), vol 11942. Springer, Cham. https://doi.org/10.1007/978-3-030-34872-4_37

Download citation

DOI: https://doi.org/10.1007/978-3-030-34872-4_37
Published: 25 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34871-7
Online ISBN: 978-3-030-34872-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The International Association for Pattern Recognition (opens in a new tab)

Stacked Autoencoder Based Feature Extraction and Superpixel Generation for Multifrequency PolSAR Image Classification

Abstract

Similar content being viewed by others

PolSAR image classification based on Laplacian Eigenmaps and superpixels

Research on hyper-spectral remote sensing image classification by applying stacked de-noising auto-encoders neural network

Hyperspectral Image Classification Using Denoised Stacked Auto Encoder-Based Restricted Boltzmann Machine Classifier

Keywords

1 Introduction

2 Proposed Methodology

3 Experiments

4 Conclusion

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Societies and partnerships

Navigation

Stacked Autoencoder Based Feature Extraction and Superpixel Generation for Multifrequency PolSAR Image Classification

Abstract

Similar content being viewed by others

PolSAR image classification based on Laplacian Eigenmaps and superpixels

Research on hyper-spectral remote sensing image classification by applying stacked de-noising auto-encoders neural network

Hyperspectral Image Classification Using Denoised Stacked Auto Encoder-Based Restricted Boltzmann Machine Classifier

Keywords

1 Introduction

2 Proposed Methodology

3 Experiments

4 Conclusion

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Societies and partnerships

Search

Navigation