Scientometrics, Volume 115, Issue 2, pp 817–832

Building direct citation networks

  • Bruno Miranda Henrique
  • Vinicius Amorim Sobreiro
  • Herbert Kimura


Citation networks are the basis for main path analysis (MPA), which has become an important tool in bibliometric studies. MPA can be used to map the main body of work of a scientific field, highlighting its most important literature and its chronological evolution. Its uses range from surveying the state of the art of a given subject to selecting study material for new research. MPA is conducted on a citation network, and there is a well-established literature on methods for finding the most relevant paths. However, the details of how the citation network is actually built are not richly described in the specialized literature, and manually relating the available references of a given field would be a difficult task. Given this context, we propose an automatic method: a simple algorithm that builds citation networks, lends itself to computer implementation, and prevents cyclic paths. The algorithm is built quantitatively and is applicable to studies on the mechanisms of any scientific field. As an example, we go through every proposed step to select the papers that constitute the main path of the literature on forecasting stock prices using machine learning techniques.
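To make the idea concrete, the following is a minimal sketch of building a direct citation network without cycles and then weighting its links for MPA. It is not the algorithm proposed in the paper: the record format (`year`, `refs`), the function names, and the rule for breaking cycles (the edge processed first is kept, with papers processed in chronological order) are all assumptions made for illustration. Links are weighted here with the search path count (SPC), a traversal weight commonly used in MPA.

```python
from collections import defaultdict


def build_citation_network(papers):
    """Build edges cited -> citing from hypothetical records.

    papers: dict mapping paper id -> {"year": int, "refs": [paper ids]}.
    References outside the collection, self-citations, and any edge
    that would close a cycle are skipped (assumed policy).
    """
    succ = {pid: set() for pid in papers}

    def reaches(src, dst):
        # Iterative depth-first search: is dst reachable from src?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(succ[node])
        return False

    # Chronological processing, so earlier citations win when two
    # papers cite each other.
    for citing in sorted(papers, key=lambda p: (papers[p]["year"], p)):
        for cited in papers[citing]["refs"]:
            if cited not in papers or cited == citing:
                continue
            # The edge cited -> citing closes a cycle exactly when
            # citing already reaches cited.
            if reaches(citing, cited):
                continue
            succ[cited].add(citing)
    return succ


def search_path_count(succ):
    """Return {(u, v): number of source-to-sink paths using edge u -> v}."""
    pred = defaultdict(set)
    for u, targets in succ.items():
        for v in targets:
            pred[v].add(u)

    # Topological order (Kahn's algorithm); the graph is acyclic.
    indegree = {n: len(pred[n]) for n in succ}
    order = [n for n in succ if indegree[n] == 0]
    for n in order:  # the list grows while it is traversed
        for v in succ[n]:
            indegree[v] -= 1
            if indegree[v] == 0:
                order.append(v)

    # Paths arriving from any source, and paths leaving to any sink.
    from_source = {n: 0 if pred[n] else 1 for n in succ}
    for n in order:
        for v in succ[n]:
            from_source[v] += from_source[n]
    to_sink = {n: 0 if succ[n] else 1 for n in succ}
    for n in reversed(order):
        for v in succ[n]:
            to_sink[n] += to_sink[v]

    return {(u, v): from_source[u] * to_sink[v] for u in succ for v in succ[u]}


# Toy data: C and D cite each other, which would create a cycle.
papers = {
    "A": {"year": 2000, "refs": []},
    "B": {"year": 2005, "refs": ["A"]},
    "C": {"year": 2010, "refs": ["A", "B", "D", "Z"]},  # "Z" is not in the set
    "D": {"year": 2010, "refs": ["C", "B"]},
}
network = build_citation_network(papers)
weights = search_path_count(network)
```

In this toy collection the mutual citation between C and D is resolved by keeping only the link D -> C, and the link A -> B receives the largest weight because two of the three source-to-sink paths pass through it. A real study would need a deliberate policy for which link of a cycle to discard; the first-come rule here is only a placeholder.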


Citation networks · Main path analysis · Bibliometrics · Algorithm



We thank the Editor-in-Chief, Prof. Wolfgang Glänzel, and the anonymous reviewers for their careful reading of our paper and for their comments and suggestions.



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2018

Authors and Affiliations

  • Bruno Miranda Henrique (1)
  • Vinicius Amorim Sobreiro (1)
  • Herbert Kimura (1)
  1. Faculty of Economics, Management and Accounting, University of Brasilia, Brasília, Brazil
