Learning Relational Data Based on Multiple Instances of Summarized Data Using DARA

Sia, Florence; Alfred, Rayner; Chin, Kim On

doi:10.1007/978-3-642-40567-9_25

Part of the book series: Communications in Computer and Information Science (CCIS, volume 378)

Included in the following conference series:

International Multi-Conference on Artificial Intelligence Technology

Abstract

The DARA (Dynamic Aggregation of Relational Attributes) algorithm is designed to summarize records stored in a non-target table that have many-to-one relationships with records stored in the target table. The summarized data is then appended to the target table, and a classifier is applied to the enriched target table to perform the classification task. However, the predictive accuracy of the classification task is highly influenced by the representation of the summarized data. In our previous work, several feature construction methods were introduced specifically for the DARA algorithm to improve the descriptive accuracy of the summarized data and, indirectly, the predictive accuracy on the target data. This paper proposes a method that learns relational data from multiple instances of summarized data obtained using different types of feature construction methods. We investigate the effect of selecting several sets of summarized data, each produced by one of these feature construction methods, and appending them to the target table before the classification task is performed. Predictive accuracy is expected to improve when multiple instances of summarized data are appended to the target table. The experimental results show some improvement in the predictive accuracy of the classification when multiple instances of summarized data are selected and appended to the target table.
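
The general idea of appending several summaries of a non-target table to a target table can be illustrated with a minimal sketch. This is not the DARA algorithm or any of the authors' feature construction methods: the toy tables, the column names (`target_id`, `item`, `amount`) and the three summarizer functions are all hypothetical stand-ins, chosen only to show how more than one summarized representation of the same many-to-one records might end up as extra columns on each target row before a classifier is trained.

```python
from collections import defaultdict

# Hypothetical toy data, not taken from the paper.
# Each non-target row points to exactly one target row (many-to-one).
target_table = {
    1: {"label": "good"},
    2: {"label": "bad"},
}
non_target_table = [
    {"target_id": 1, "item": "a", "amount": 10.0},
    {"target_id": 1, "item": "b", "amount": 30.0},
    {"target_id": 2, "item": "a", "amount": 5.0},
]


def group_by_target(rows):
    """Collect the non-target rows that belong to each target record."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["target_id"]].append(row)
    return groups


# Stand-in summarizers (assumptions): simple aggregates used here in place
# of the feature construction methods mentioned in the abstract.
def record_count(rows):
    return len(rows)


def mean_amount(rows):
    if not rows:
        return 0.0
    return sum(r["amount"] for r in rows) / len(rows)


def distinct_items(rows):
    return len({r["item"] for r in rows})


def append_summaries(target, non_target, summarizers):
    """Return a copy of the target table with one new column per summarizer."""
    groups = group_by_target(non_target)
    enriched = {}
    for target_id, attrs in target.items():
        row = dict(attrs)  # copy so the original target table is unchanged
        for name, summarize in summarizers.items():
            row[name] = summarize(groups.get(target_id, []))
        enriched[target_id] = row
    return enriched


enriched = append_summaries(
    target_table,
    non_target_table,
    {
        "record_count": record_count,
        "mean_amount": mean_amount,
        "distinct_items": distinct_items,
    },
)
```

Each entry of `enriched` keeps the original `label` and gains one column per summarizer, so any subset of summaries can be selected by passing a smaller dictionary of summarizers.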

Author information

Authors and Affiliations

School of Engineering and Information Technology, Universiti Malaysia Sabah, Jalan UMS, 88400, Kota Kinabalu, Sabah, Malaysia
Florence Sia, Rayner Alfred & Kim On Chin

Editor information

Editors and Affiliations

Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
Shahrul Azman Noah
Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor D. E, Malaysia
Azizi Abdullah
Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor Darul Ehsan, Malaysia
Haslina Arshad, Zulaiha Ali Othman & Zalinda Othman
School of Computer Science, FTSM, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor Darul Ehsan, Malaysia
Azuraliza Abu Bakar
Pattern Recognition Research Group, CAIT, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
Shahnorbanun Sahran
Faculty of Information Science & IT, National University of Malaysia, 43600, Bangi, Selangor, Malaysia
Nazlia Omar

About this paper

Cite this paper

Sia, F., Alfred, R., Chin, K.O. (2013). Learning Relational Data Based on Multiple Instances of Summarized Data Using DARA. In: Noah, S.A., et al. Soft Computing Applications and Intelligent Systems. M-CAIT 2013. Communications in Computer and Information Science, vol 378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40567-9_25

DOI: https://doi.org/10.1007/978-3-642-40567-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40566-2
Online ISBN: 978-3-642-40567-9
eBook Packages: Computer Science, Computer Science (R0)
