Abstract
In this article we propose and evaluate a method for extracting terminological relationships from microblogs. The idea is to analyze archived microblogs (tweets, for example) and trace the usage history of each term; terms with similar histories are likely to be related, and this indication can be validated by further processing. For example, if the terms t1 and t2 were both frequently used on Twitter on certain days, and their frequency patterns match over a period of time, then t1 and t2 may be related. Extracting standard terminological relationships is difficult, especially in a dynamic context such as social media, where millions of microblogs (short textual messages) are published and thousands of new terms are coined every day. We therefore propose compiling a nonstandard, raw repository of lexical units with unconfirmed relationships. This paper presents a method that draws relationships between time-sensitive Arabic terms by matching their timelines, using dynamic time warping to align them. To evaluate our approach, we selected 430 terms and compared their frequency patterns over a period of 30 days. Around 250 correct relationships were extracted, with a precision of 0.65. These relationships were drawn without using any parallel text and without analyzing the textual context of the terms, which is significant because the studied terms may be newly coined by microbloggers and their availability in standard repositories is limited.
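The timeline matching described above can be illustrated with a minimal sketch of classic dynamic time warping over daily frequency counts. The abstract states only that dynamic time warping is used to align timelines; the absolute-difference local cost, the function name `dtw_distance`, and the sample counts for `t1`, `t2`, and `t3` are assumptions made for illustration and are not taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |x - y|.
    Runs in O(len(a) * len(b)) time and space, with no warping window.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical daily counts for three terms over one week (made-up numbers).
t1 = [0, 1, 5, 9, 4, 1, 0]
t2 = [0, 0, 1, 5, 9, 4, 1]  # same burst as t1, shifted by one day
t3 = [3, 3, 3, 3, 3, 3, 3]  # flat usage, no burst

print("t1 vs t2:", dtw_distance(t1, t2))
print("t1 vs t3:", dtw_distance(t1, t3))
```

Because warping tolerates the one-day shift, `t1` and `t2` come out much closer to each other than either is to the flat series `t3`. Turning such a distance into a candidate relationship would need a threshold or ranking step, and probably some normalization of the counts; the abstract does not specify either.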
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Daoud, D., Daoud, M. (2018). Extracting Terminological Relationships from Historical Patterns of Social Media Terms. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_14
DOI: https://doi.org/10.1007/978-3-319-75477-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science (R0)