44:53

ALDL: a novel method for label distribution learning

  • MAINAK BISWAS


Data complexity has increased manyfold in the age of data-driven societies: data sets are now both very large and inherently complex. Single-label classification algorithms, which assign each instance to exactly one discrete class, are losing prominence because data are no longer monolithic. In many machine learning problems an instance may belong to more than one class, and this has created the need for methods that are multi-label in nature. Label distribution learning (LDL) is a recent way of viewing multi-label learning. It quantifies the degree to which each label describes an instance, so that every instance has a label distribution. In this paper, we introduce a new learning method, angular label distribution learning (ALDL). It is based on the angular distribution function, which is derived from the length of the arc connecting two points on a circle. The proposed ALDL is compared, in terms of mean-square error (MSE), with algorithm adaptation of k-NN (AA-kNN), a multilayer perceptron, a Levenberg–Marquardt neural network and a layer-recurrent neural network on LDL datasets. ALDL is observed to give a lower MSE, and on the real-world datasets its improvement over the standard LDL algorithms is highly statistically significant.
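The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the ALDL method or the paper's angular distribution function, neither of which is specified here. It only shows, under stated assumptions, how a label distribution may be represented as description degrees that sum to 1, how MSE between a true and a predicted distribution can be computed, and how the length of an arc between two points on a circle is obtained. The function names, the example values and the choice of the shorter arc are assumptions made for illustration.

```python
import math


def normalize(degrees):
    """Scale non-negative description degrees so that they sum to 1."""
    total = sum(degrees)
    if total <= 0:
        raise ValueError("at least one description degree must be positive")
    return [d / total for d in degrees]


def mean_square_error(true_dist, pred_dist):
    """Average squared difference between two label distributions."""
    if len(true_dist) != len(pred_dist):
        raise ValueError("distributions must cover the same labels")
    squared = [(t - p) ** 2 for t, p in zip(true_dist, pred_dist)]
    return sum(squared) / len(squared)


def arc_length(radius, angle_a, angle_b):
    """Length of the shorter arc between two points on a circle.

    Angles are in radians. Taking the shorter arc is an assumption of
    this sketch, not something stated in the abstract.
    """
    diff = abs(angle_a - angle_b) % (2 * math.pi)
    central_angle = min(diff, 2 * math.pi - diff)
    return radius * central_angle


# Hypothetical instance described by three labels.
true_dist = normalize([2.0, 1.0, 1.0])  # [0.5, 0.25, 0.25]
pred_dist = normalize([1.0, 1.0, 2.0])  # [0.25, 0.25, 0.5]
mse = mean_square_error(true_dist, pred_dist)

# Quarter of a unit circle: the arc length equals the central angle, pi / 2.
quarter_arc = arc_length(1.0, 0.0, math.pi / 2)
```

A lower `mse` means the predicted distribution is closer to the true one, which is the sense in which the abstract reports a decrease in MSE.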


Machine learning; multi-label classification; multi-label learning; label distribution learning



algorithm adaptation k-nearest neighbour
angular distribution function
angular label distribution learning
artificial neural network
label distribution learning
layer-recurrent network
multilayer perceptron
mean-square error



We thank Media Lab Asia, Department of Electronics and Information Technology (DEITY), Ministry of Communications and Information Technology, Government of India, for supporting this work as part of a sponsored project.



Copyright information

© Indian Academy of Sciences 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, National Institute of Technology Goa, Ponda, India
