The Study of Dynamic Aggregation of Relational Attributes on Relational Data Mining

Alfred, Rayner

doi:10.1007/978-3-540-73871-8_21

Rayner Alfred

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4632)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Most aggregation functions handle either categorical or numerical values, but not both. In this paper, we define three concepts of aggregation function and introduce a novel method for aggregating multiple instances that consist of both categorical and numerical values. We show how these concepts can be implemented using clustering techniques. In our experiments, we discretize continuous values before applying the aggregation function to relational datasets. The empirical results demonstrate that our transformation approach, which uses clustering to aggregate multiple instances of attribute values, can compete with existing multi-relational techniques such as Progol and Tilde. We also evaluate the effect of the number of discretization intervals on classification performance.
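To make the idea of discretizing before aggregating more concrete, the following is a minimal sketch and not the paper's implementation, since the abstract does not specify the method's details. It assumes equal-width binning, a hypothetical one-to-many table of (individual, category, number) rows, and a simple count of "category|bin" patterns per individual. The function names and the sample data are illustrative assumptions; the resulting count vectors are the kind of input a clustering step could then group.

```python
from collections import Counter


def equal_width_bin(value, lo, hi, n_bins):
    """Map a continuous value in [lo, hi] to a bin index in 0..n_bins-1."""
    if hi == lo:  # degenerate range: everything falls into one bin
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clip so that value == hi stays in the last bin


def aggregate(records, n_bins):
    """Count 'category|bin' patterns per individual.

    records: list of (individual_id, category, number) rows (assumed layout).
    Returns {individual_id: Counter of patterns}.
    """
    numbers = [num for _, _, num in records]
    lo, hi = min(numbers), max(numbers)
    bags = {}
    for ind, cat, num in records:
        pattern = f"{cat}|b{equal_width_bin(num, lo, hi, n_bins)}"
        bags.setdefault(ind, Counter())[pattern] += 1
    return bags


def to_vectors(bags):
    """Turn the per-individual pattern counts into fixed-length count vectors."""
    vocab = sorted({p for bag in bags.values() for p in bag})
    return vocab, {ind: [bag[p] for p in vocab] for ind, bag in bags.items()}


if __name__ == "__main__":
    # Hypothetical rows: (molecule id, atom type, partial charge)
    rows = [
        ("m1", "c", -0.12), ("m1", "c", 0.30), ("m1", "o", -0.40),
        ("m2", "c", 0.28), ("m2", "h", 0.10),
    ]
    vocab, vectors = to_vectors(aggregate(rows, n_bins=2))
    print(vocab)    # ['c|b0', 'c|b1', 'h|b1', 'o|b0']
    print(vectors)  # {'m1': [1, 1, 0, 1], 'm2': [0, 1, 1, 0]}
```

Changing `n_bins` changes which numerical values share a pattern, which is one way to see why the number of intervals could affect downstream classification.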

Author information

Authors and Affiliations

Universiti Malaysia Sabah, School of Engineering and Information Technology, 88999, Kota Kinabalu, Sabah, Malaysia
Rayner Alfred

Editor information

Editors and Affiliations

Computer Science Department, University of Calgary, Calgary, AB, Canada
Reda Alhajj
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Hong Gao
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Jianzhong Li
School of Information Technology and Electronic Engineering, The University of Queensland, Queensland, Australia
Xue Li
Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Osmar R. Zaïane

Cite this paper

Alfred, R. (2007). The Study of Dynamic Aggregation of Relational Attributes on Relational Data Mining. In: Alhajj, R., Gao, H., Li, J., Li, X., Zaïane, O.R. (eds) Advanced Data Mining and Applications. ADMA 2007. Lecture Notes in Computer Science, vol 4632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73871-8_21

DOI: https://doi.org/10.1007/978-3-540-73871-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73870-1
Online ISBN: 978-3-540-73871-8
eBook Packages: Computer Science, Computer Science (R0)
