Abstract
Many real-world applications can be naturally formulated as directed graph learning problems. The key issues are how to extract the directed link structure of a graph and how to use the labeled vertices to infer the labels of the remaining unlabeled vertices. However, directed graph learning has not been well studied in data mining and machine learning. In this paper, we propose a novel Co-linkage Analysis (CA) method that processes directed graphs in an undirected way while preserving the directional information. On the induced undirected graph, we use a Green’s function approach to solve the semi-supervised learning problem. We present a new zero-mode free Laplacian, which is invertible. This leads to an Improved Green’s Function (IGF) method for classification, which we also extend to multi-label classification problems. Extensive experimental evaluations on real data sets show promising results and demonstrate the effectiveness of our approach.
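The pipeline outlined in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything below is an assumption rather than the authors' construction: the undirected similarity is taken as first-order bibliographic coupling plus co-citation (A Aᵀ + Aᵀ A), and the zero mode of the graph Laplacian L = D − W is removed by adding the rank-one term eeᵀ/n, which makes the matrix invertible when the induced graph is connected. The function names `colinkage_similarity`, `zero_mode_free_green`, and `classify` are hypothetical.

```python
import numpy as np

def colinkage_similarity(A):
    """Symmetric similarity from a directed adjacency matrix (A[i, j] = 1 if i -> j).

    Assumption: first-order co-linkage only, not the paper's high-order variant.
    """
    A = np.asarray(A, dtype=float)
    coupling = A @ A.T      # bibliographic coupling: shared out-links
    cocitation = A.T @ A    # co-citation: shared in-links
    W = coupling + cocitation
    np.fill_diagonal(W, 0.0)  # drop self-similarity
    return W

def zero_mode_free_green(W):
    """Inverse of L + e e^T / n, where L = D - W (assumed form of the shift).

    For a connected graph, L has a single zero eigenvalue with eigenvector e;
    the rank-one shift moves that eigenvalue to 1, so the sum is invertible.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.inv(L + np.ones((n, n)) / n)

def classify(A, labels):
    """Propagate labels; labels[i] is a class id in 0..K-1, or -1 if unlabeled."""
    labels = np.asarray(labels)
    G = zero_mode_free_green(colinkage_similarity(A))
    idx = np.flatnonzero(labels >= 0)
    Y = np.zeros((len(labels), labels.max() + 1))
    Y[idx, labels[idx]] = 1.0          # one-hot rows for labeled vertices
    pred = (G @ Y).argmax(axis=1)      # highest Green's-function score wins
    pred[idx] = labels[idx]            # keep the given labels
    return pred

# Toy directed graph: two 3-vertex groups joined by the single link 2 -> 3.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
print(classify(A, [0, -1, -1, -1, -1, 1]))
```

With one labeled vertex per group, the unlabeled vertices in this toy graph take the label of the labeled vertex in their own group. A multi-label variant would only change how `Y` is filled and thresholded; that step is not sketched here because the abstract does not describe it.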
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, H., Ding, C., Huang, H. (2010). Directed Graph Learning via High-Order Co-linkage Analysis. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_29
DOI: https://doi.org/10.1007/978-3-642-15939-8_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8
eBook Packages: Computer Science (R0)