A Relative Feature Selection Algorithm for Graph Classification

Keneshloo, Yaser; Yazdani, Sasan

doi:10.1007/978-3-642-32741-4_13

Yaser Keneshloo⁴ &
Sasan Yazdani⁴

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 186))

915 Accesses
1 Citations

Abstract

Graph classification is one of the most important research fields in data mining nowadays and many algorithms have been proposed to address this issue. In practice, labeling large or even medium-size graphs is a hard task and we need experts to do so. The biggest challenge in graph classification is extracting a set of proper features from graphs. Since graphs are represented by a complex data structure, this issue has been dealt with for a long time. Previous methods focused on extracting features from a certain class in a dataset. In this paper we propose a new feature selection method that extracts features from each graph rather than extracting them from a certain class in the dataset. We extract only frequent subgraphs as features. These subgraphs are chosen according to their number of occurrences in a graph. Moreover, we proposed a new formula which calculates the minimum number of occurrences required for a subgraph to be considered as frequent.We experimented on five real datasets and reached 7-17% higher accuracy than previously proposed methods.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 129.00; Price excludes VAT (USA)

Softcover Book: USD 169.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Aggarwal, C.C.: On classification of graph streams. In: Proceedings of Eleventh SIAM International Conference on Data Mining, SDM 2011, Arizona, Mesa (2011)
Google Scholar
Aggarwal, C.C., Li, Y., Yu, P.S., Jin, R.: On dense pattern mining in graph streams. PVDLB 3(1), 975–984 (2010)
Google Scholar
Aggarwal, C.C., Zhao, Y., Yu, P.S.: On clustering graph streams. In: Proceedings of the SIAM International Conference on Data Mining, SDM 2010, Columbus, Ohio, USA (2010)
Google Scholar
Auria, L., Moro, R.A.: Support Vector Machines (SVM) as a technique for solvency analysis. Discussion Papers of DIW Berlin (2008)
Google Scholar
Bondy, J.A.: Graph Theory with Applications, pp. 12–21. Elsevier Science Ltd. (1976)
Google Scholar
Cheng, H., Yan, X., Han, J., Hsu, C.: Discriminative frequent pattern analysis for effective classification. In: Proc. of ICDE, Istanbul, Turkey (2007)
Google Scholar
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, L.H.: The Weka data mining software: an update. SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
Article Google Scholar
He, H., Singh, A.K.: Closure-Tree: An index structure for graph queries. In: ICDE 2006, Atlanta, Georgia (2006)
Google Scholar
Huan, J., Wang, W., Prins, J.: Efficient mining of frequent subgraph in the presence of isomorphism. In: Proceedings of the 3rd IEEE International Conference on Data Mining, ICDM 2003, Melbourne (2003)
Google Scholar
Kong, X., Fan, W., Yu, P.S.: Dual active feature and sample selection for graph classification. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California (2011)
Google Scholar
Kong, X., Yu, P.S.: Semi-supervised feature selection for graph classification. In: Proceeding of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data mining, Washington, DC (2010)
Google Scholar
Krishna, V., Suri, N.N.R.R., Athithan, G.: A comparative survey of algorithms for frequent subgraph discovery. Current Science 100(2) (2011)
Google Scholar
Kuramochi, M., Karypis, G.: An efficient algorithm for discovering frequent subgraphs. IEEE Transactions on Knowledge and Data Engineering 16(9), 1038–1051 (2004)
Article Google Scholar
Li, G., Semerci, M., Yener, B., Zaki, M.J.: Graph Classification via Topological and Label Attributes. In: 9th Workshop on Mining and Learning with Graphs (with SIGKDD) (August 2011)
Google Scholar
Nijssen, S., Kok, J.: A quick start in frequent structure mining can make a difference. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 647–652. ACM (2004)
Google Scholar
Parveen, P., Weger, Z.R., Thuraisingham, B.M., Hamlen, K.W., Khan, L.: Insider Threat Detection using Stream Mining and Graph Mining. In: IEEE 23rd International Conference on Tools with Artificial Intelligence, ICTAI 2011, Boca Raton, FL (2011)
Google Scholar
Singh, A.K.: Graph querying, graph motif mining and the discovery of clusters, United State Patent, Santa Barbara, CA (2011)
Google Scholar
Thoma, M., Cheng, H., Gretton, A., Han, J., Kriegel, H., Smola, A., Song, L., Yu, P.S., Yan, X., Borgwardt, K.M.: Near-optimal supervised feature selection among frequent subgraphs. In: SIAM International Conference on Data Mining (2009)
Google Scholar
Vishwanathan, S.V.N., Schraudolph, N.N., Kondor, I.R., Borgwardt, K.M.: Graph Kernels. Journal of Machine Learning Research 11, 1201–1242 (2010)
MathSciNet MATH Google Scholar
Wale, N., Karypis, G.: Comparison of descriptor spaces for chemical compound retrieval and classification. In: Proc. of ICDM, Hong Kong, pp. 678–689 (2006)
Google Scholar
Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: ICDM 2002 (2002)
Google Scholar
Yan, X., Yu, P.S., Han, J.: Substructure similarity search in graph databases. In: SIGMOD Conference (2005)
Google Scholar

Download references

Author information

Authors and Affiliations

Iran University of Science and Technology, Tehran, Iran
Yaser Keneshloo & Sasan Yazdani

Authors

Yaser Keneshloo
View author publications
You can also search for this author in PubMed Google Scholar
Sasan Yazdani
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yaser Keneshloo .

Editor information

Editors and Affiliations

Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 3a, Poznan, 60-965, Poland
Tadeusz Morzy
, Dept. of Computer Science, University of Kaiserslautern, Kaiserslautern, 67653, Germany
Theo Härder
Institute of Computing Science, Faculty of Computing, Poznań University of Technology, Piotrowo 2, Poznań, 60-965, Poland
Robert Wrembel

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Keneshloo, Y., Yazdani, S. (2013). A Relative Feature Selection Algorithm for Graph Classification. In: Morzy, T., Härder, T., Wrembel, R. (eds) Advances in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 186. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32741-4_13

Download citation

DOI: https://doi.org/10.1007/978-3-642-32741-4_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32740-7
Online ISBN: 978-3-642-32741-4
eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics