Abstract
Classifications have been used for centuries to catalogue and search large sets of objects. In the early days these objects were mainly books; more recently they have come to include Web pages, pictures, and any kind of electronic information item. Classifications describe their contents using natural language labels, which have proved very effective in manual classification. However, natural language labels show their limitations when one tries to automate the process, as they make it very hard to reason about classifications and their contents. In this paper we introduce the novel notion of Formal Classification: a graph structure whose labels are written in a propositional concept language. Formal Classifications turn out to be a form of lightweight ontology. This, in turn, allows us to reason about them, to associate with each node a normal form formula that univocally describes its contents, and to reduce document classification to reasoning about subsumption.
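The idea of reducing document classification to subsumption can be illustrated with a minimal sketch. It is not the paper's algorithm: the tuple encoding of formulas, the helper names (`atoms`, `holds`, `subsumed_by`, `node_formula`), the choice of building a node's formula as the conjunction of the labels on its path from the root, and the toy labels are all assumptions made for illustration. Subsumption is checked by brute-force enumeration of truth assignments, which is only practical for a handful of atoms.

```python
from itertools import product

# Hypothetical encoding: formulas are nested tuples
# ("atom", name), ("not", f), ("and", f, g), ("or", f, g).

def atoms(f):
    """Collect the atomic concept names occurring in a formula."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def holds(f, v):
    """Evaluate formula f under the truth assignment v (a dict name -> bool)."""
    op = f[0]
    if op == "atom":
        return v[f[1]]
    if op == "not":
        return not holds(f[1], v)
    if op == "and":
        return holds(f[1], v) and holds(f[2], v)
    if op == "or":
        return holds(f[1], v) or holds(f[2], v)
    raise ValueError(f"unknown operator: {op}")

def subsumed_by(a, b):
    """True if a is subsumed by b, i.e. a -> b holds under every assignment."""
    names = sorted(atoms(a) | atoms(b))
    for values in product([False, True], repeat=len(names)):
        v = dict(zip(names, values))
        if holds(a, v) and not holds(b, v):
            return False
    return True

def node_formula(path_labels):
    """Assumed here: a node's formula is the conjunction of the label
    formulas on the path from the root to that node."""
    f = path_labels[0]
    for label in path_labels[1:]:
        f = ("and", f, label)
    return f

# Toy classification with two nodes: Europe/Pictures and Europe/Music.
europe = ("atom", "europe")
pictures = ("atom", "pictures")
music = ("atom", "music")
mountains = ("atom", "mountains")

pictures_node = node_formula([europe, pictures])
music_node = node_formula([europe, music])

# A document described as "pictures of mountains in Europe".
doc = ("and", europe, ("and", pictures, mountains))

print(subsumed_by(doc, pictures_node))
print(subsumed_by(doc, music_node))
```

In this sketch a document belongs under a node when the formula describing the document is subsumed by the formula of the node, so the toy document is placed under Europe/Pictures and not under Europe/Music. A realistic system would use a SAT solver or a description logic reasoner rather than truth tables, and would also need background knowledge relating the atomic concepts.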
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Giunchiglia, F., Marchese, M., Zaihrayeu, I. (2006). Encoding Classifications into Lightweight Ontologies. In: Sure, Y., Domingue, J. (eds) The Semantic Web: Research and Applications. ESWC 2006. Lecture Notes in Computer Science, vol 4011. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11762256_9
DOI: https://doi.org/10.1007/11762256_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34544-2
Online ISBN: 978-3-540-34545-9
eBook Packages: Computer Science, Computer Science (R0)