Abstract
Applying data mining techniques to help researchers discover, understand, and predict research trends is a highly beneficial but challenging task. Existing research mainly uses topics extracted from the literature as the objects of prediction models. To obtain more accurate results, we use concepts instead of topics and build a model that predicts their rise and fall, taking their rhetorical characteristics into account. Experimental results on the ACL 1965-2017 literature dataset show that clues to scientific trends can be found in the rhetorical distribution of concepts. After adding information about relevant concepts, the prediction model's accuracy improves significantly compared with the prior topic-based algorithm.
Acknowledgement
This work is supported by the National Key Research and Development Program of China (2017YFB1002101), NSFC key projects (U1736204, 61661146007), and the THUNUS NExT Co-Lab.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, J., Pan, L., Li, J., Du, X. (2019). Predicting Concept-Based Research Trends with Rhetorical Framing. In: Zhao, J., Harmelen, F., Tang, J., Han, X., Wang, Q., Li, X. (eds) Knowledge Graph and Semantic Computing. Knowledge Computing and Language Understanding. CCKS 2018. Communications in Computer and Information Science, vol 957. Springer, Singapore. https://doi.org/10.1007/978-981-13-3146-6_10
DOI: https://doi.org/10.1007/978-981-13-3146-6_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3145-9
Online ISBN: 978-981-13-3146-6
eBook Packages: Computer Science, Computer Science (R0)