Dynamic Aggregation of Relational Attributes Based on Feature Construction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5207)

Abstract

The importance of input representation has long been recognised in machine learning. This paper discusses the application of genetic-based feature construction methods to generate input data for a data summarisation method called Dynamic Aggregation of Relational Attributes (DARA). Feature construction is applied here to improve the descriptive accuracy of the DARA algorithm. DARA summarises data stored in non-target tables by clustering them into groups, where multiple records stored in the non-target tables correspond to a single record stored in the target table. The paper asks whether the descriptive accuracy of the DARA algorithm benefits from the feature construction process. Answering this involves constructing a relevant set of features for the DARA algorithm by means of a genetic-based algorithm. The work also evaluates several scoring measures used as fitness functions to find the best set of constructed features.
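As a rough illustration of the idea in the abstract, the sketch below scores two candidate feature sets on a toy one-to-many relation. It is not the paper's implementation: the table contents, the names `ROWS`, `LABELS`, `SINGLE`, `COMBINED`, the leader-style clustering with a cosine threshold, and the class-entropy fitness are all assumptions chosen to keep the example small. A genetic search would generate and evolve many such candidates; here only two hand-written ones are compared.

```python
import math
from collections import Counter

# Hypothetical non-target table: (target_id, attribute a, attribute b).
ROWS = [
    (1, "x", "p"), (1, "y", "q"),
    (2, "x", "p"), (2, "y", "q"),
    (3, "x", "q"), (3, "y", "p"),
    (4, "x", "q"), (4, "y", "p"),
]
# Hypothetical class labels stored in the target table.
LABELS = {1: "pos", 2: "pos", 3: "neg", 4: "neg"}
ATTRS = ("a", "b")

SINGLE = [(0,), (1,)]   # candidate 1: each attribute used on its own
COMBINED = [(0, 1)]     # candidate 2: one constructed feature joining a and b


def bags(rows, features):
    """Summarise each target record as a bag (multiset) of patterns."""
    out = {}
    for target_id, *values in rows:
        bag = out.setdefault(target_id, Counter())
        for idxs in features:
            bag["&".join(f"{ATTRS[i]}={values[i]}" for i in idxs)] += 1
    return out


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def leader_clusters(vectors, threshold=0.5):
    """Assumed stand-in clustering: join the first cluster whose leader is similar enough."""
    clusters = []  # list of (leader_vector, member_ids)
    for tid in sorted(vectors):
        for leader, members in clusters:
            if cosine(leader, vectors[tid]) >= threshold:
                members.append(tid)
                break
        else:
            clusters.append((vectors[tid], [tid]))
    return [members for _, members in clusters]


def fitness(rows, features, labels=LABELS):
    """Size-weighted class entropy of the clusters; lower is better."""
    clusters = leader_clusters(bags(rows, features))
    total = sum(len(c) for c in clusters)
    score = 0.0
    for members in clusters:
        counts = Counter(labels[t] for t in members)
        entropy = -sum(n / len(members) * math.log2(n / len(members))
                       for n in counts.values())
        score += len(members) / total * entropy
    return score
```

On this toy data the single attributes give every target record the same bag, so all four records fall into one mixed cluster and `fitness(ROWS, SINGLE)` is 1.0. The constructed feature separates the two classes, so `fitness(ROWS, COMBINED)` is 0.0. The data were built to show an attribute interaction; real relations will rarely separate this cleanly.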

Editor information

Paolo Atzeni Albertas Caplinskas Hannu Jaakkola

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alfred, R. (2008). Dynamic Aggregation of Relational Attributes Based on Feature Construction. In: Atzeni, P., Caplinskas, A., Jaakkola, H. (eds) Advances in Databases and Information Systems. ADBIS 2008. Lecture Notes in Computer Science, vol 5207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85713-6_2

  • DOI: https://doi.org/10.1007/978-3-540-85713-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85712-9

  • Online ISBN: 978-3-540-85713-6

  • eBook Packages: Computer Science, Computer Science (R0)
