Abstract
A new clustering algorithm, ISOETRP, has been developed. Several new objectives have been introduced to make ISOETRP particularly suitable for hierarchical pattern classification. These objectives are: a) minimizing overlap between pattern class groups, b) maximizing entropy reduction, and c) keeping the groups balanced. The overall objective to be optimized is GAIN = Entropy Reduction / (Overlap + 1). Balance is controlled by maximizing GAIN. An interactive version of ISOETRP, based on an overlap table, has also been developed. ISOETRP has been shown to give much better results than other existing algorithms in optimizing the above objectives. It has played an important role in the design of many large tree classifiers, where tree performance was improved by optimizing the GAIN value.
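The GAIN objective can be illustrated with a small sketch. The abstract gives only the ratio itself, so the definitions used here are assumptions made for illustration and are not taken from the paper: classes are treated as equiprobable (so a node holding n classes has entropy log2 n), entropy reduction is the parent entropy minus the size-weighted entropy of the groups, and overlap is the number of extra group memberships caused by classes that fall into more than one group. The function name `gain` is hypothetical.

```python
import math


def gain(groups):
    """Illustrative GAIN = entropy_reduction / (overlap + 1).

    `groups` is a list of sets of class labels, one set per group.
    All definitions below are assumptions, not the paper's own.
    """
    groups = [g for g in groups if g]  # ignore empty groups
    classes = set().union(*groups)
    n = len(classes)

    # Assumption: equiprobable classes, so node entropy is log2(#classes).
    parent_entropy = math.log2(n)

    # A class shared by several groups is counted once per group.
    total = sum(len(g) for g in groups)
    child_entropy = sum(len(g) / total * math.log2(len(g)) for g in groups)
    entropy_reduction = parent_entropy - child_entropy

    # Assumption: overlap = number of extra memberships of shared classes.
    overlap = total - n
    return entropy_reduction / (overlap + 1)


balanced = gain([{"a", "b"}, {"c", "d"}])
unbalanced = gain([{"a"}, {"b", "c", "d"}])
overlapping = gain([{"a", "b", "c"}, {"c", "d"}])
print(balanced, unbalanced, overlapping)
```

Under these assumptions the balanced, non-overlapping split scores highest, the unbalanced split scores lower, and the overlapping split scores lowest. This is consistent with the statement that maximizing GAIN both penalizes overlap and favors balanced groups.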
Cite this article
Wang, Q. ISOETRP clustering algorithm and its application to tree classifier design. J. of Compt. Sci. & Technol. 1, 70–85 (1986). https://doi.org/10.1007/BF02943303