Abstract
We propose a fuzzy rough set-based semi-supervised learning algorithm (FRL) for labeling categorical noun phrase instances from a given corpus of unstructured web pages. The model represents each noun phrase by the set of contextual patterns with which it co-occurs. We compare the performance of FRL with the Tolerance Rough Set-based (TPL) algorithm and the Coupled Bayesian Sets-based (CBS) algorithm. In terms of average precision over 11 categories, FRL performs better than CBS but not as well as TPL. To the best of our knowledge, fuzzy rough sets have not previously been applied to the problem of unstructured text categorization.
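The abstract does not spell out the FRL algorithm, so the following is only a minimal sketch of the general fuzzy rough set idea it builds on, not the authors' method. It assumes that each noun phrase is represented by a set of contextual patterns, that similarity between noun phrases is measured with the Jaccard index, and that the approximations use the Łukasiewicz implicator and the minimum t-norm. All names, the toy data, and these operator choices are hypothetical.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Fuzzy similarity of two noun phrases via their contextual pattern sets (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def fuzzy_rough_approximations(
    contexts: Dict[str, Set[str]],
    membership: Dict[str, float],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return fuzzy rough lower and upper approximations of a fuzzy category.

    contexts:   noun phrase -> set of co-occurring contextual patterns
    membership: noun phrase -> degree of membership in the category (seeds);
                noun phrases missing from the dict are treated as 0.0
    """
    lower: Dict[str, float] = {}
    upper: Dict[str, float] = {}
    for x, cx in contexts.items():
        sims = {y: jaccard(cx, cy) for y, cy in contexts.items()}
        # Lower approximation: infimum over y of the Lukasiewicz implicator
        # I(R(x, y), A(y)) = min(1, 1 - R(x, y) + A(y)).
        lower[x] = min(
            min(1.0, 1.0 - sims[y] + membership.get(y, 0.0)) for y in contexts
        )
        # Upper approximation: supremum over y of the minimum t-norm
        # T(R(x, y), A(y)) = min(R(x, y), A(y)).
        upper[x] = max(min(sims[y], membership.get(y, 0.0)) for y in contexts)
    return lower, upper


# Toy data (hypothetical): one seed instance for a "City" category.
contexts = {
    "Toronto": {"mayor of _", "lives in _", "city of _"},
    "Winnipeg": {"lives in _", "city of _"},
    "Pizza": {"slice of _", "ate _"},
}
seeds = {"Toronto": 1.0}

lower, upper = fuzzy_rough_approximations(contexts, seeds)
for phrase in contexts:
    print(phrase, round(lower[phrase], 3), round(upper[phrase], 3))
```

In this toy setting, "Winnipeg" shares two of its contextual patterns with the seed and so receives a higher upper-approximation degree than "Pizza", which shares none. A semi-supervised labeler could rank unlabeled noun phrases by such a degree, though how FRL actually scores candidates is not stated here.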
Acknowledgements
This research has been supported by the NSERC Discovery grant. Special thanks to Cenker Sengoz and to Prof. Estevam R. Hruschka Jr.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bharadwaj, A., Ramanna, S. (2017). Fuzzy Rough Set-Based Unstructured Text Categorization. In: Mouhoub, M., Langlais, P. (eds) Advances in Artificial Intelligence. Canadian AI 2017. Lecture Notes in Computer Science, vol 10233. Springer, Cham. https://doi.org/10.1007/978-3-319-57351-9_38
DOI: https://doi.org/10.1007/978-3-319-57351-9_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57350-2
Online ISBN: 978-3-319-57351-9
eBook Packages: Computer Science, Computer Science (R0)