A new classification method to overcome over-branching

Abstract

Classification is an important technique in data mining. The decision trees built by most existing classification algorithms commonly suffer from over-branching, which leads to poor efficiency in the subsequent classification phase. In this paper, we present a new value-oriented classification method, based on the concepts of frequent-pattern-node and exceptive-child-node, that aims to build accurate, properly sized decision trees while reducing over-branching as much as possible. Experiments show that, with relevance analysis as pre-processing, our method greatly reduces over-branching in decision trees without loss of accuracy, and does so more effectively and efficiently than other algorithms.
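To make the over-branching problem concrete, the following is a minimal illustrative sketch, not the paper's algorithm. The abstract does not define frequent-pattern-node or exceptive-child-node, so the function `value_oriented_split`, its `min_support` and `min_purity` thresholds, and the toy data are all assumptions made for illustration. It contrasts a conventional multi-way split (one branch per attribute value) with a split that isolates a single frequent, nearly pure value and groups everything else.

```python
from collections import Counter, defaultdict


def multiway_branches(rows, attr):
    """Conventional multi-way split: one branch per distinct value of attr."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row)
    return dict(groups)


def value_oriented_split(rows, attr, label, min_support=0.3, min_purity=0.9):
    """Hypothetical value-oriented split (an assumption, not the paper's method).

    Picks the most frequent value of attr whose rows are nearly pure in one
    class, then splits the data into "attr == value" and "everything else".
    Returns None when no value meets both thresholds.
    """
    n = len(rows)
    best = None  # (value, support)
    for value, group in multiway_branches(rows, attr).items():
        support = len(group) / n
        top_count = Counter(r[label] for r in group).most_common(1)[0][1]
        purity = top_count / len(group)
        if support >= min_support and purity >= min_purity:
            if best is None or support > best[1]:
                best = (value, support)
    if best is None:
        return None
    value = best[0]
    matched = [r for r in rows if r[attr] == value]
    rest = [r for r in rows if r[attr] != value]
    return value, matched, rest


# Toy data (invented for illustration): one frequent value, four rare ones.
rows = (
    [{"color": "red", "cls": "yes"}] * 4
    + [
        {"color": "blue", "cls": "no"},
        {"color": "green", "cls": "no"},
        {"color": "black", "cls": "no"},
        {"color": "white", "cls": "yes"},
    ]
)

branches = multiway_branches(rows, "color")
value, matched, rest = value_oriented_split(rows, "color", "cls")
print(len(branches), "branches from the multi-way split")
print("value-oriented split on", value, "->", len(matched), "vs", len(rest), "rows")
```

On this toy data the multi-way split produces five branches, four of which hold a single row each, while the value-oriented split produces two. The remaining mixed group would still need further splitting, which is where a treatment of exceptional cases, as the paper's terminology suggests, would come in.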

Author information

Correspondence to Aoying Zhou.

Additional information

This work is partially supported by the NKBRSF of China (G1998030414) and the National Natural Science Foundation of China (No.60003016).

ZHOU Aoying received his M.S. degree in computer science from Sichuan University in 1988, and his Ph.D. degree in computer software from Fudan University in 1993. He is currently a professor in the Department of Computer Science, Fudan University. His main research interests include object-oriented data models for multimedia information, Web data management, data mining and data warehousing, and novel database technologies and their applications to digital libraries and electronic commerce.

QIAN Weining is a Ph.D. candidate in the Department of Computer Science, Fudan University. His speciality is databases and knowledge bases. He is supported by a Microsoft Research Fellowship. His research interests include clustering, data mining and Web mining.

QIAN Hailei is a graduate student in the Department of Computer Science, Fudan University. Her speciality is databases and knowledge bases. Her research interests include clustering and data mining.

JIN Wen is a Ph.D. candidate in the School of Computing, Simon Fraser University, Canada, supervised by Dr. Jiawei Han. His current research interests are database and data warehousing, data mining, Web mining and XML.

About this article

Cite this article

Zhou, A., Qian, W., Qian, H. et al. A new classification method to overcome over-branching. J. Comput. Sci. & Technol. 17, 18–27 (2002). https://doi.org/10.1007/BF02949821

Keywords

  • data mining
  • classification
  • over branching
  • decision tree
  • frequent pattern