Multivariate Decision Trees Using Different Splitting Attribute Subsets for Large Datasets

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6085)

Abstract

In this paper, we introduce an algorithm for the incremental induction of multivariate decision trees, called IIMDTS, which chooses a different splitting attribute subset at each internal node of the decision tree and is able to process large datasets. IIMDTS uses all instances of the training set to build the decision tree without storing the whole training set in memory. Experimental results show that our algorithm is faster than three of the most recent algorithms for building decision trees from large datasets.
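The abstract only summarizes the approach, so the following Python sketch illustrates the general idea of incremental multivariate tree induction: instances are buffered in bounded batches, and each internal node chooses its own splitting attribute subset. This is not the authors' IIMDTS algorithm; the names (Node, BATCH_SIZE, ATTRS_PER_NODE, choose_attribute_subset), the random subset selection, and the median cut point are hypothetical simplifications used only for illustration.

```python
# Illustrative sketch only: a streaming decision-tree node that buffers a
# bounded batch of instances and, once enough have been seen, selects its own
# splitting attribute subset. Not the paper's method; all constants and names
# are assumptions.
import random
from collections import Counter

BATCH_SIZE = 1000        # instances buffered at a node before a split is attempted
ATTRS_PER_NODE = 3       # size of the per-node splitting attribute subset

def choose_attribute_subset(n_attrs, k):
    """Pick k attribute indices for this node; a real method would score
    candidate subsets (e.g. by class separability) rather than sample at random."""
    return random.sample(range(n_attrs), min(k, n_attrs))

class Node:
    def __init__(self, n_attrs):
        self.n_attrs = n_attrs
        self.buffer = []             # only a bounded batch is kept, never the full set
        self.attrs = None            # splitting attribute subset chosen at this node
        self.threshold = None
        self.children = None
        self.class_counts = Counter()

    def insert(self, x, y):
        """Route an instance down the tree, or buffer it at a leaf."""
        self.class_counts[y] += 1
        if self.children is not None:
            self.children[self._side(x)].insert(x, y)
            return
        self.buffer.append((x, y))
        if len(self.buffer) >= BATCH_SIZE and len(self.class_counts) > 1:
            self._split()

    def _side(self, x):
        # Multivariate test: a linear combination (here an unweighted sum)
        # of the node's chosen attributes compared against a threshold.
        return 0 if sum(x[a] for a in self.attrs) <= self.threshold else 1

    def _split(self):
        """Choose this node's own attribute subset, split, and free the buffer."""
        self.attrs = choose_attribute_subset(self.n_attrs, ATTRS_PER_NODE)
        scores = sorted(sum(x[a] for a in self.attrs) for x, _ in self.buffer)
        self.threshold = scores[len(scores) // 2]   # median as a crude cut point
        left = sum(1 for x, _ in self.buffer if self._side(x) == 0)
        if left == 0 or left == len(self.buffer):
            self.attrs = self.threshold = None      # degenerate cut: stay a leaf
            return
        self.children = (Node(self.n_attrs), Node(self.n_attrs))
        for x, y in self.buffer:
            self.children[self._side(x)].insert(x, y)
        self.buffer = []                            # batch is discarded after the split

    def predict(self, x):
        if self.children is not None:
            return self.children[self._side(x)].predict(x)
        return self.class_counts.most_common(1)[0][0]

# Minimal usage with synthetic data (10 numeric attributes, 2 classes):
tree = Node(n_attrs=10)
for _ in range(5000):
    x = [random.random() for _ in range(10)]
    y = int(x[0] + x[3] > 1.0)        # label depends on two attributes
    tree.insert(x, y)
print(tree.predict([0.9] * 10))
```

The key point the sketch tries to convey is that the tree never holds the whole training set: each node keeps only its current batch, uses it to pick a local splitting attribute subset, and then releases it.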



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Franco-Arcega, A., Carrasco-Ochoa, J.A., Sánchez-Díaz, G., Martínez-Trinidad, J.F. (2010). Multivariate Decision Trees Using Different Splitting Attribute Subsets for Large Datasets. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science, vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_49

  • DOI: https://doi.org/10.1007/978-3-642-13059-5_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13058-8

  • Online ISBN: 978-3-642-13059-5

  • eBook Packages: Computer Science, Computer Science (R0)
