A Classification Model for Modeling Online Articles

  • Rula Alhalaseh
  • Ali Rodan
  • Azmi Alazzam
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1187)


Due to the constant evolution of the web and the viral spread of online news on social media, predicting the popularity of a news article has become a topic of interest to many groups of people, ranging from marketing personnel to politicians. In this paper, we compare four classification algorithms on a dataset of 39,000 news articles taken from the Mashable website. The articles were divided into two classes: popular and not popular. Four machine learning algorithms were used to classify the data: KNN, Naïve Bayes, AdaBoost, and decision tree. Finally, the four classification methods were compared with each other.
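The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic two-feature data in place of the Mashable articles, uses a single train/test split with accuracy as the only measure, and covers only two of the four algorithms (KNN and Gaussian Naïve Bayes), leaving out AdaBoost and the decision tree for brevity. All function names, the feature layout, and the values of `k` and the split sizes are hypothetical.

```python
import math
import random
from collections import Counter


def make_synthetic_articles(n, rng):
    """Return (features, labels); label 1 = popular, 0 = not popular.

    Assumption: two synthetic numeric features per article, not real data.
    """
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 3.0 if label == 1 else 0.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fit_gnb(X, y):
    """Estimate a class prior and per-feature mean/variance for each class."""
    model = {}
    for c in set(y):
        rows = [x for x, t in zip(X, y) if t == c]
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[c] = (len(rows) / len(X), stats)
    return model


def gnb_predict(model, x):
    """Pick the class with the highest Gaussian log-posterior."""
    def log_posterior(c):
        prior, stats = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for v, (mean, var) in zip(x, stats))
    return max(model, key=log_posterior)


def accuracy(predict, X, y):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def compare_classifiers(seed=0):
    """Train both classifiers on one split and report test accuracy."""
    rng = random.Random(seed)
    X, y = make_synthetic_articles(400, rng)
    train_X, test_X = X[:300], X[300:]
    train_y, test_y = y[:300], y[300:]
    model = fit_gnb(train_X, train_y)
    return {
        "KNN": accuracy(lambda x: knn_predict(train_X, train_y, x),
                        test_X, test_y),
        "Naive Bayes": accuracy(lambda x: gnb_predict(model, x),
                                test_X, test_y),
    }


results = compare_classifiers()
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

Because the synthetic classes are well separated, both classifiers should score high here. On real article features the gap between methods, and the choice of evaluation measure, would matter far more.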


Keywords

AdaBoost, KNN, Naïve Bayes, Decision Tree



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Information Technology, University of Jordan, Amman, Jordan
  2. Computer Information Science, Higher Colleges of Technology, Al Ain, UAE
