Predicting High Impact Academic Papers Using Citation Network Features

  • Conference paper
Trends and Applications in Knowledge Discovery and Data Mining (PAKDD 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7867)

Abstract

Predicting future high impact academic papers is of benefit to a range of stakeholders, including governments, universities, academics, and investors. Being able to predict ‘the next big thing’ allows the allocation of resources to fields where these rapid developments are occurring. This paper develops a new method for predicting a paper’s future impact using features of the paper’s neighbourhood in the citation network, including measures of interdisciplinarity. Predictors of high impact papers include high early citation counts of the paper, high citation counts by the paper, citations of and by highly cited papers, and interdisciplinary citations of the paper and of papers that cite it. The Scopus database, consisting of over 24 million publication records from 1996-2010 across a wide range of disciplines, is used to motivate and evaluate the methods presented.
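
The kinds of neighbourhood features listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `neighbourhood_features`, the `highly_cited_threshold` parameter, the toy edge list, and the discipline labels are all hypothetical, and the edge list is assumed to be already restricted to an early time window after publication. Interdisciplinarity is approximated here as a citation from a paper with a different discipline label, which may differ from the measures used in the paper.

```python
from collections import defaultdict


def neighbourhood_features(citations, fields, paper, highly_cited_threshold=2):
    """Compute simple citation-neighbourhood features for one paper.

    citations: iterable of (citing, cited) pairs (assumed early-window edges).
    fields: dict mapping paper id -> discipline label (hypothetical labels).
    highly_cited_threshold: assumed cut-off for calling a paper highly cited.
    """
    cited_by = defaultdict(set)  # paper -> set of papers that cite it
    cites = defaultdict(set)     # paper -> set of papers it cites
    for citing, cited in citations:
        cited_by[cited].add(citing)
        cites[citing].add(cited)

    def is_highly_cited(p):
        return len(cited_by[p]) >= highly_cited_threshold

    citing_papers = cited_by[paper]
    referenced = cites[paper]
    return {
        # Citations received by the paper
        "early_citation_count": len(citing_papers),
        # Citations made by the paper
        "reference_count": len(referenced),
        # Citing papers that are themselves highly cited
        "cited_by_highly_cited": sum(is_highly_cited(p) for p in citing_papers),
        # Referenced papers that are highly cited
        "cites_highly_cited": sum(is_highly_cited(p) for p in referenced),
        # Citing papers from a different discipline than the paper itself
        "interdisciplinary_citations": sum(
            fields[p] != fields[paper] for p in citing_papers
        ),
    }


# Toy citation network: each pair is (citing paper, cited paper).
toy_citations = [
    ("A", "E"), ("A", "F"),
    ("B", "A"), ("C", "A"), ("D", "A"),
    ("B", "E"), ("C", "E"),
    ("D", "B"), ("C", "B"),
]
toy_fields = {"A": "cs", "B": "cs", "C": "biology",
              "D": "physics", "E": "cs", "F": "maths"}

features = neighbourhood_features(toy_citations, toy_fields, "A")
print(features)
```

In a setting like the one the abstract describes, a feature dictionary of this kind would be computed for every paper and passed to a predictive model. The sketch covers only interdisciplinary citations of the paper itself, not of the papers that cite it.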


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McNamara, D., Wong, P., Christen, P., Ng, K.S. (2013). Predicting High Impact Academic Papers Using Citation Network Features. In: Li, J., et al. Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2013. Lecture Notes in Computer Science, vol 7867. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40319-4_2

  • DOI: https://doi.org/10.1007/978-3-642-40319-4_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40318-7

  • Online ISBN: 978-3-642-40319-4

  • eBook Packages: Computer Science (R0)
