Abstract
The accuracy of textual keyword extraction strongly influences the semantic processing of text, yet the precision of current extraction methods still leaves considerable room for improvement. To address this, the paper proposes a method that optimizes extracted textual keywords using prior knowledge. First, the prior knowledge relevant to keyword extraction is discussed. Then, a keyword quality evaluation method based on the semantic distance between keywords is proposed to judge whether a keyword is good or bad. Next, a textual keyword optimization method is built on this evaluation. Finally, experiments show that the proposed method can improve the accuracy of keyword extraction on domain texts.
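The abstract does not give the evaluation formula, so the following is only a minimal sketch of the general idea of scoring keywords by semantic distance and filtering on that score. It assumes the prior knowledge is a small is-a taxonomy, that semantic distance is the number of edges between two terms, and that a keyword is "good" when its mean distance to the other candidates stays below a threshold. The taxonomy `PARENT`, the function names, and the threshold are all hypothetical and are not taken from the paper.

```python
from statistics import mean

# Hypothetical prior knowledge: a tiny is-a taxonomy (child -> parent).
PARENT = {
    "virus": "malware",
    "trojan": "malware",
    "malware": "security",
    "firewall": "security",
    "security": "root",
    "banana": "fruit",
    "fruit": "root",
}


def path_to_root(term):
    """Return the chain of terms from `term` up to the top of the taxonomy."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def semantic_distance(a, b):
    """Count the edges between a and b through their lowest common ancestor."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    depth_in_b = {term: depth for depth, term in enumerate(path_b)}
    for depth_a, term in enumerate(path_a):
        if term in depth_in_b:
            return depth_a + depth_in_b[term]
    raise ValueError(f"no common ancestor for {a!r} and {b!r}")


def keyword_scores(keywords):
    """Mean distance from each keyword to every other candidate keyword."""
    return {
        k: mean(semantic_distance(k, other) for other in keywords if other != k)
        for k in keywords
    }


def optimize_keywords(keywords, max_mean_distance):
    """Keep keywords whose mean distance to the rest is within the threshold."""
    if len(keywords) < 2:
        return list(keywords)  # nothing to compare against
    scores = keyword_scores(keywords)
    return [k for k in keywords if scores[k] <= max_mean_distance]


candidates = ["virus", "trojan", "firewall", "banana"]
print(optimize_keywords(candidates, max_mean_distance=4.0))
# prints ['virus', 'trojan', 'firewall']
```

In this toy run, "banana" sits far from the security terms in the assumed taxonomy (mean distance 14/3), so it is dropped, while the three security terms (mean distance 10/3 each) are kept. A real system would replace the toy taxonomy with a resource such as a domain ontology and would need a principled way to choose the threshold.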
Acknowledgments
This research is partly supported by the Science Foundation of Shanghai under Grant No. 16ZR1435500; by the National Science Foundation of China under Grant Nos. 61562020, 61300202, 61332018, and 61403084; by the Program of Science and Technology Commission of Shanghai Municipality under Grant Nos. 15530701300, 15XD15202000, and 16511101700; by the technical research program of the Chinese Ministry of Public Security under Grant No. 2015JSYJB26; and by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China under Grant No. 71621002.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Li, L., Wei, X., Xu, Z. (2018). Textual Keyword Optimization Using Priori Knowledge. In: Abawajy, J., Choo, KK., Islam, R. (eds) International Conference on Applications and Techniques in Cyber Security and Intelligence. ATCI 2017. Advances in Intelligent Systems and Computing, vol 580. Springer, Cham. https://doi.org/10.1007/978-3-319-67071-3_16
DOI: https://doi.org/10.1007/978-3-319-67071-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67070-6
Online ISBN: 978-3-319-67071-3
eBook Packages: Engineering, Engineering (R0)