Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms

Koutra, Danai; Ke, Tai-You; Kang, U.; Chau, Duen Horng (Polo); Pao, Hsing-Kuo Kenneth; Faloutsos, Christos

doi:10.1007/978-3-642-23783-6_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6912)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

If several friends of Smith have committed petty thefts, what would you say about Smith? Most people would not be surprised if Smith is a hardened criminal. Guilt-by-association methods combine weak signals to derive stronger ones, and have been extensively used for anomaly detection and classification in numerous settings (e.g., accounting fraud, cyber-security, calling-card fraud).

The focus of this paper is to compare and contrast several very successful guilt-by-association methods: Random Walk with Restarts (RWR), Semi-Supervised Learning (SSL), and Belief Propagation (BP).

Our main contributions are two-fold: (a) theoretically, we prove that all the methods result in a similar matrix inversion problem; (b) for practical applications, we developed FaBP, a fast algorithm that yields a 2× speedup and accuracy equal to or higher than that of BP, and is guaranteed to converge. We demonstrate these benefits using synthetic and real datasets, including YahooWeb, one of the largest graphs ever studied with BP.
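To make the "matrix inversion problem" in (a) concrete, the following is a minimal sketch for Random Walk with Restarts only, using the standard textbook formulation rather than the chapter's own notation (which is not reproduced on this page). It assumes an undirected graph with no isolated nodes, a column-normalized adjacency matrix W, and a continuation probability c (so the restart probability is 1 - c). The function names, the toy graph, and the value c = 0.85 are illustrative assumptions. The sketch checks that iterating the walk and solving the linear system (I - cW) r = (1 - c) e give the same scores.

```python
import numpy as np


def column_normalize(A):
    """Divide each column by its sum so columns sum to 1 (assumes no isolated nodes)."""
    return A / A.sum(axis=0, keepdims=True)


def rwr_linear_system(A, seed, c=0.85):
    """RWR scores obtained by solving (I - c W) r = (1 - c) e directly."""
    n = A.shape[0]
    W = column_normalize(A)
    e = np.zeros(n)
    e[seed] = 1.0  # restart vector: all restart mass on the seed node
    return np.linalg.solve(np.eye(n) - c * W, (1 - c) * e)


def rwr_power_iteration(A, seed, c=0.85, iters=200):
    """RWR scores obtained by iterating r <- c W r + (1 - c) e."""
    n = A.shape[0]
    W = column_normalize(A)
    e = np.zeros(n)
    e[seed] = 1.0
    r = e.copy()
    for _ in range(iters):
        r = c * (W @ r) + (1 - c) * e
    return r


# Hypothetical toy graph: 4 nodes, undirected, every node has at least one edge.
A = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 1],
        [0, 1, 0, 1],
        [0, 1, 1, 0],
    ],
    dtype=float,
)

r_solve = rwr_linear_system(A, seed=0)
r_iter = rwr_power_iteration(A, seed=0)
print(np.allclose(r_solve, r_iter))
```

On large graphs a dense solve like this is impractical; the sketch only illustrates that the iterative walk and the linear system describe the same quantity.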

Author information

Authors and Affiliations

School of Computer Science, Carnegie Mellon University, USA
Danai Koutra, U. Kang, Duen Horng (Polo) Chau & Christos Faloutsos
Dept. of Computer Science & Information Engineering, National Taiwan Univ. of Science & Technology, Taiwan
Tai-You Ke & Hsing-Kuo Kenneth Pao

Editor information

Editors and Affiliations

Department of Informatics and Telecommunications, University of Athens, Panepistimioupolis, Ilisia, 15784, Athens, Greece
Dimitrios Gunopulos
Google Switzerland GmbH, Brandschenkestrasse 110, 8002, Zurich, Switzerland
Thomas Hofmann
Department of Computer Science, University of Bari “Aldo Moro”, via Orabona 4, 70125, Bari, Italy
Donato Malerba
Department of Informatics, Athens University of Economics and Business, Patision 76, 10434, Athens, Greece
Michalis Vazirgiannis

About this paper

Cite this paper

Koutra, D., Ke, TY., Kang, U., Chau, D.H., Pao, HK.K., Faloutsos, C. (2011). Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23783-6_16

DOI: https://doi.org/10.1007/978-3-642-23783-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23782-9
Online ISBN: 978-3-642-23783-6
eBook Packages: Computer Science, Computer Science (R0)
