INSIDER: An Android Application for Automatic Categorization of News Items by Using LDA

  • Pratima Sarkar
  • Natasha Sah
  • Abhinoy Pradhan
Conference paper
Part of the Learning and Analytics in Intelligent Systems book series (LAIS, volume 12)

Abstract

This work proposes INSIDER, a real-time newsfeed system for Android smartphones that automatically categorizes news items into a standard set of categories. The application provides relevant, up-to-date information according to the user's requirements, and gives users a platform to personalize and organize their news feed based on their interests. The main purpose of the system is to make important notices more accessible by classifying incoming messages by category. Categorization is performed with Latent Dirichlet Allocation (LDA), a topic modeling technique. LDA is a generative probabilistic model that operates on discrete data such as text. The work is validated on a data set taken from “newsapi.org”.
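
As an illustration of how LDA can group news items, the following is a minimal sketch in Python using collapsed Gibbs sampling. It is not the authors' implementation, which the abstract does not describe. The function names `tokenize` and `fit_lda`, the hyperparameter values, and the sample headlines are all assumptions made for this example. LDA is unsupervised, so the topic indices it returns would still need to be mapped to named categories.

```python
import random


def tokenize(text):
    """Lowercase a headline and keep alphabetic tokens longer than two characters."""
    return [w for w in text.lower().split() if w.isalpha() and len(w) > 2]


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper for this sketch).

    Returns (theta, top_words): theta[d] is the estimated topic distribution
    of document d, and top_words[t] lists up to five frequent words of topic t.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[t][v])[:5]]
        for t in range(num_topics)
    ]
    return theta, top_words


# Hypothetical headlines, invented for this example (not from newsapi.org).
headlines = [
    "team wins football match after late goal",
    "striker scores goal in football final",
    "coach praises team after match",
    "parliament passes budget after election",
    "minister defends budget in parliament",
    "voters head to polls in election",
]
docs = [tokenize(h) for h in headlines]
theta, top_words = fit_lda(docs, num_topics=2)

for headline, dist in zip(headlines, theta):
    # Assign each headline to its most probable topic.
    category = max(range(len(dist)), key=dist.__getitem__)
    print(category, headline)
```

On a corpus this small the resulting grouping depends on the random seed and the hyperparameters, so the sketch shows the mechanics only and says nothing about the accuracy reported in the paper.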

Keywords

Topic modeling · Latent Dirichlet Allocation · Text categorization

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science and Engineering, Sikkim Manipal University, Tadong, India
