Abstract
In this paper we propose a web document classification approach based on an extended version of Probabilistic Relational Models (PRMs). In particular, PRMs are augmented to include uncertainty over relationships, which here are represented by hyperlinks. Our extension, called PRM with Relational Uncertainty, has been evaluated on real data for web document classification. Experimental results show the potential of the proposed model to capture the real semantic relevance of hyperlinks and to embed this information in the classification process.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fersini, E., Messina, E., Archetti, F. (2010). Web Page Classification: A Probabilistic Model with Relational Uncertainty. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Computational Intelligence for Knowledge-Based Systems Design. IPMU 2010. Lecture Notes in Computer Science, vol 6178. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14049-5_12
DOI: https://doi.org/10.1007/978-3-642-14049-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14048-8
Online ISBN: 978-3-642-14049-5
eBook Packages: Computer Science, Computer Science (R0)