Abstract
Network motifs are basic building blocks of complex networks. Motif detection has recently attracted much attention as a way to uncover the structural design principles of complex networks. Pattern finding is the most computationally expensive step in motif detection. In this paper, we design a pattern finding algorithm based on Google MapReduce to improve its efficiency. Performance evaluation shows that our algorithm facilitates the detection of larger motifs in large networks and has good scalability. We apply it to a prescription network and find some commonly used prescription network motifs, which make it possible to further discover the laws of prescription compatibility.
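The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how pattern counting can be phrased in map/reduce style, not the authors' method. It assumes an undirected graph and only 3-node patterns, and it simulates the two phases in a single process. The names `map_vertex`, `reduce_counts`, and `count_patterns` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_vertex(v, adj):
    """Map step (hypothetical): emit (pattern, 1) for 3-node subgraphs centred on v."""
    for a, b in combinations(sorted(adj[v]), 2):
        if b in adj[a]:
            # A triangle is seen from all three of its vertices,
            # so emit it only from the vertex with the smallest id.
            if v < a:
                yield ("triangle", 1)
        else:
            # An open path a - v - b has v as its unique centre,
            # so it is emitted exactly once.
            yield ("path", 1)


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the emitted counts per pattern key."""
    totals = defaultdict(int)
    for pattern, count in pairs:
        totals[pattern] += count
    return dict(totals)


def count_patterns(edges):
    """Build adjacency sets, run the map step per vertex, then reduce."""
    adj = defaultdict(set)
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    pairs = (pair for v in list(adj) for pair in map_vertex(v, adj))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # One triangle (1, 2, 3) plus a pendant edge (3, 4).
    print(count_patterns([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Because each vertex is processed independently in the map step, the per-vertex work could in principle be distributed across workers. This sketch covers only the counting of pattern occurrences, not the later steps of motif detection.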
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, Y., Jiang, X., Chen, H., Ma, J., Zhang, X. (2009). MapReduce-Based Pattern Finding Algorithm Applied in Motif Detection for Prescription Compatibility Network. In: Dou, Y., Gruber, R., Joller, J.M. (eds) Advanced Parallel Processing Technologies. APPT 2009. Lecture Notes in Computer Science, vol 5737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03644-6_27
DOI: https://doi.org/10.1007/978-3-642-03644-6_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03643-9
Online ISBN: 978-3-642-03644-6
eBook Packages: Computer Science (R0)