Topic-Aware Visual Citation Tracing via Enhanced Term Weighting for Efficient Literature Retrieval

  • Youbing Zhao
  • Hui Wei (corresponding author)
  • Shaopeng Wu
  • Farzad Parvinzamir
  • Zhikun Deng
  • Xia Zhao
  • Nikolaos Ersotelos
  • Feng Dong
  • Gordon Clapworthy
  • Enjie Liu
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 737)

Abstract

Efficient retrieval of scientific literature on a given topic plays a key role in research. Traditional citation tracing has paid little attention to topic-enabled citation filtering; this paper presents visual citation tracing of scientific papers that takes document topics into account. Improved term selection and weighting are used to mine the most relevant citations. A variation of the TF-IDF scheme that uses external domain resources as references is proposed to calculate term weights in a particular domain. In addition, document weight is incorporated into the calculation of term weights from a group of citations. A simple hierarchical word weighting method is also presented to handle keyword phrases. A visual interface is designed and implemented to present the citation tracks interactively as chord and Sankey diagrams.
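
The abstract does not state the weighting formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a smoothed IDF computed on an external reference corpus, a per-citation document weight that scales each term frequency, and a phrase score taken as the mean of its word scores. The names `reference_idf`, `weighted_term_scores`, and `phrase_score`, the smoothing, the averaging rule, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def reference_idf(reference_docs):
    """Return a smoothed IDF function built from an external reference corpus.

    reference_docs is a list of token lists (assumed domain resources).
    """
    n = len(reference_docs)
    df = Counter()
    for doc in reference_docs:
        df.update(set(doc))  # count each term once per document
    # Smoothing keeps the weight finite for terms absent from the reference corpus.
    return lambda term: math.log((1 + n) / (1 + df[term])) + 1.0


def weighted_term_scores(citations, doc_weights, idf):
    """Sum TF * IDF over a group of citations, scaled by each document's weight."""
    scores = Counter()
    for tokens, weight in zip(citations, doc_weights):
        if not tokens:
            continue
        tf = Counter(tokens)
        for term, count in tf.items():
            scores[term] += weight * (count / len(tokens)) * idf(term)
    return scores


def phrase_score(phrase, scores):
    """Hypothetical phrase rule: mean of the scores of the phrase's words."""
    words = phrase.split()
    return sum(scores[w] for w in words) / len(words) if words else 0.0


# Toy data (assumed): a tiny reference corpus and two citations with weights.
reference = [["the", "mesh", "render"], ["the", "shader"], ["the", "mesh"]]
citations = [["the", "mesh", "mesh"], ["the", "shader", "light"]]
weights = [2.0, 1.0]

idf = reference_idf(reference)
scores = weighted_term_scores(citations, weights, idf)
print(scores.most_common(3))
print(phrase_score("mesh shader", scores))
```

In this toy setup, a term that appears in every reference document ("the") receives the minimum IDF, while terms concentrated in highly weighted citations rise in the ranking.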

Keywords

Text mining, Citation tracing, Data management, Ontology, Term weighting, TF-IDF, Visualization

Notes

Acknowledgments

This research is supported by the FP7 Programme of the European Commission through the projects Dr Inventor [FP7-ICT-611383] and CARRE [FP7-ICT-611140]. We thank the European Commission for the funding, and the project officers and reviewers for their indispensable support of both projects.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Youbing Zhao (1)
  • Hui Wei (1, corresponding author)
  • Shaopeng Wu (1)
  • Farzad Parvinzamir (1)
  • Zhikun Deng (1)
  • Xia Zhao (1)
  • Nikolaos Ersotelos (1)
  • Feng Dong (1)
  • Gordon Clapworthy (1)
  • Enjie Liu (1)

  1. University of Bedfordshire, Luton, UK
