Abstract
We present compelling evidence that the World Wide Web is a domain in which applications can benefit from using first-order learning methods, since the graph structure inherent in hypertext naturally lends itself to a relational representation. We demonstrate strong advantages for two applications — learning classifiers for Web pages, and learning rules to discover relations among pages.
This research was supported in part by the DARPA HPKB program under contract F30602-97-1-0215.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Craven, M., Slattery, S., Nigam, K. (1998). First-order learning for Web mining. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026695
DOI: https://doi.org/10.1007/BFb0026695
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7
eBook Packages: Springer Book Archive