Abstract
Citation networks are the basis for main path analysis (MPA), which has become an important tool in bibliometric studies. MPA can be used to map the main body of work of a scientific field, highlighting its most important literature and its chronological evolution. Its uses range from surveying the state of the art of a given subject to selecting study material for new research. MPA is conducted on a citation network, and there is a well-established literature on methods for finding the most relevant paths. However, the details of how the citation network is actually built are rarely described in the specialized literature, and manually relating the available references of a given field would be a difficult task. Given this context, we propose an automatic method: a simple algorithm, suitable for computer implementation, that builds citation networks while preventing cyclic paths. The algorithm is quantitative and is applicable to studies of the mechanisms of any scientific field. As an example, we go through every proposed step to select the papers that constitute the main path of the literature on forecasting stock prices using machine learning techniques.
Notes
Pajek is freely available at http://mrvar.fdv.uni-lj.si/pajek/.
VOSviewer is freely available at http://www.vosviewer.com/.
Acknowledgements
We thank the Editor-in-Chief, Prof. Wolfgang Glänzel, and the anonymous reviewers for their careful reading of our paper and for their comments and suggestions.
Additional information
This document was a collaborative effort.
About this article
Cite this article
Henrique, B.M., Sobreiro, V.A. & Kimura, H. Building direct citation networks. Scientometrics 115, 817–832 (2018). https://doi.org/10.1007/s11192-018-2676-z
DOI: https://doi.org/10.1007/s11192-018-2676-z