Feature Transformation and Multivariate Decision Tree Induction

  • Conference paper
Discovery Science (DS 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1532)

Abstract

Univariate decision trees (UDT’s) have inherent problems of replication, repetition, and fragmentation. Multivariate decision trees (MDT’s) have been proposed to overcome some of these problems. Close examination of the conventional ways of building MDT’s, however, reveals that the fragmentation problem still persists. A novel approach is suggested to minimize the fragmentation problem by separating hyperplane search from decision tree building. This is achieved by feature transformation. Let the initial feature vector be x and the new feature vector after feature transformation T be y, i.e., y = T(x). We can obtain an MDT by (1) building a UDT on y; and (2) replacing the new features y at each node with the combinations of the initial features x. We elaborate on the advantages of this approach, the details of T, and why it is expected to perform well. Experiments are conducted in order to confirm the analysis, and results are compared to those of C4.5, OC1, and CART.
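
The following is a minimal sketch of the two-step recipe described in the abstract, assuming the transformation T is a fixed linear map W (for instance, weights taken from a trained network's hidden layer). The function name induce_mdt, the matrix W, and the use of scikit-learn's DecisionTreeClassifier are illustrative assumptions, not the paper's actual implementation.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def induce_mdt(X, labels, W):
    """X: (n_samples, d) original features; W: (d, k) linear map T; labels: class labels."""
    # Step 1: transform x -> y = T(x) and build an ordinary univariate tree on y.
    Y = X @ W
    udt = DecisionTreeClassifier().fit(Y, labels)

    # Step 2: each univariate test "y_j <= t" is re-expressed in terms of x as
    # sum_i W[i, j] * x_i <= t, i.e. a multivariate (oblique) split on the
    # original features.  (Illustrative sketch only; the paper replaces the
    # transformed features with combinations of initial features at each node.)
    tree = udt.tree_
    oblique_splits = []
    for node in range(tree.node_count):
        j = tree.feature[node]
        if j >= 0:  # internal node; leaves are marked with -2 in scikit-learn
            oblique_splits.append((W[:, j], tree.threshold[node]))
    return udt, oblique_splits

Each recovered pair (w, t) describes an oblique test w·x ≤ t, so the final tree splits on linear combinations of the original features even though only a single univariate induction pass was run.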

References

  1. K.P. Bennett and O.L. Mangasarian. Neural network training via linear programming. In P.M. Pardalos, editor, Advances in Optimization and Parallel Computing, pages 56–67. Elsevier Science Publishers B.V., Amsterdam, 1992.

  2. L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth & Brooks/Cole Advanced Books & Software, 1984.

  3. C.E. Brodley and P.E. Utgoff. Multivariate decision trees. Machine Learning, 19:45–77, 1995.

  4. M. Dash and H. Liu. Feature selection methods for classifications. Intelligent Data Analysis: An International Journal, 1(3), 1997. http://www-east.elsevier.com/ida/free.htm.

  5. U.M. Fayyad and K.B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pages 1022–1027. Morgan Kaufmann Publishers, Inc., 1993.

  6. J.H. Friedman, R. Kohavi, and Y. Yun. Lazy decision trees. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 717–724, 1996.

  7. L. Fu. Neural Networks in Computer Intelligence. McGraw-Hill, 1994.

  8. B. Hassibi and D.G. Stork. Second order derivatives for network pruning: Optimal brain surgeon. Neural Information Processing Systems, 5:164–171, 1993.

  9. D. Heath, S. Kasif, and S. Salzberg. Learning oblique decision trees. In Proceedings of the Thirteenth International Joint Conference on AI, pages 1002–1007, France, 1993.

  10. K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129–134. Menlo Park: AAAI Press/The MIT Press, 1992.

  11. H. Liu and R. Setiono. Chi2: Feature selection and discretization of numeric attributes. In J.F. Vassilopoulos, editor, Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence, November 5–8, 1995, pages 388–391, Herndon, Virginia, 1995. IEEE Computer Society.

  12. C. Matheus and L. Rendell. Constructive induction on decision trees. In Proceedings of the International Joint Conference on AI, pages 645–650, August 1989.

  13. C.J. Merz and P.M. Murphy. UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, Department of Information and Computer Science, 1996.

  14. John Mingers. An empirical comparison of selection measures for decision-tree induction. Machine Learning, 3:319–342, 1989.

  15. S. Murthy, S. Kasif, S. Salzberg, and R. Beigel. OC1: Randomized induction of oblique decision trees. In Proceedings of the AAAI Conference (AAAI’93), pages 322–327. AAAI Press/The MIT Press, 1993.

  16. G. Pagallo and D. Haussler. Boolean feature discovery in empirical learning. Machine Learning, 5:71–99, 1990.

  17. J.R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81–106, 1986.

  18. J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.

  19. D.E. Rumelhart, J.L. McClelland, and the PDP Research Group. Parallel Distributed Processing, volume 1. The MIT Press, Cambridge, MA, 1986.

  20. I.K. Sethi. Neural implementation of tree classifiers. IEEE Trans. on Systems, Man, and Cybernetics, 25(8), August 1995.

  21. R. Setiono. A penalty-function approach for pruning feedforward neural networks. Neural Computation, 9(1):185–204, 1997.

  22. R. Setiono and H. Liu. Understanding neural networks via rule extraction. In Proceedings of International Joint Conference on AI, 1995.

  23. R. Setiono and H. Liu. Analysis of hidden representations by greedy clustering. Connection Science, 10(1):21–42, 1998.

  24. J.W. Shavlik, R.J. Mooney, and G.G. Towell. Symbolic and neural learning algorithms: An experimental comparison. Machine Learning, 6(2):111–143, 1991.

  25. G.G. Towell and J.W. Shavlik. Extracting refined rules from knowledge-based neural networks. Machine Learning, 13(1):71–101, 1993.

  26. P.E. Utgoff and C.E. Brodley. An incremental method for finding multivariate splits for decision trees. In Machine Learning: Proceedings of the Seventh International Conference, pages 58–65. University of Texas, Austin, Texas, 1990.

  27. R. Vilalta, G. Blix, and L. Rendell. Global data analysis and the fragmentation problem in decision tree induction. In M. van Someren and G. Widmer, editors, Machine Learning: ECML-97, pages 312–326. Springer-Verlag, 1997.

  28. J. Wnek and R.S. Michalski. Hypothesis-driven constructive induction in AQ17-HCI: A method and experiments. Machine Learning, 14, 1994.

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, H., Setiono, R. (1998). Feature Transformation and Multivariate Decision Tree Induction. In: Arikawa, S., Motoda, H. (eds) Discovery Science. DS 1998. Lecture Notes in Computer Science (LNAI), vol 1532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49292-5_25

Download citation

  • DOI: https://doi.org/10.1007/3-540-49292-5_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65390-5

  • Online ISBN: 978-3-540-49292-4

  • eBook Packages: Springer Book Archive
