An Efficient Strategy to Handle Complex Datasets Having Multimodal Distribution
A main shortcoming of conventional classifiers appears when they face datasets with a multimodal distribution. To overcome this drawback, an efficient strategy is proposed in which a clustering phase is first executed over the samples of all classes to partition the feature space into separate subspaces (clusters). Because clustering does not take sample labels into account, each cluster is impure: it contains samples belonging to different classes. In the next phase, a classifier is applied to each of the created clusters. The main advantage of this distributed approach is that it simplifies a complex pattern recognition problem by training a specific classifier for each subspace. Applying an efficient classifier to a single local cluster is expected to give better results than applying it to several scattered clusters. In the validation and test phases, before deciding which classifier to apply, the nearest cluster to the input sample is found and the corresponding trained classifier is then used. Experimental results on different UCI datasets demonstrate that the proposed distributed classifier system significantly outperforms single-classifier approaches.
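The two-phase scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the clustering algorithm nor the base classifier, so k-means with a farthest-point initialisation and a nearest-class-mean rule are used here purely as stand-in assumptions. The names `ClusterThenClassify`, `kmeans`, `make_toy_data` and the global fallback for empty clusters are likewise hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def nearest(centers, x):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], x))


def kmeans(X, k, iters=100):
    """Plain k-means; the farthest-point initialisation is an assumption."""
    centers = [X[0]]
    while len(centers) < k:
        centers.append(max(X, key=lambda x: min(sq_dist(x, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in X:
            groups[nearest(centers, x)].append(x)
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:  # assignments no longer change
            break
        centers = updated
    return centers


class ClusterThenClassify:
    """Cluster without labels, then train one local classifier per cluster."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        # Phase 1: labels are ignored, so clusters may mix classes.
        self.centers = kmeans(X, self.k)
        # Phase 2: one nearest-class-mean model per cluster (assumed classifier).
        by_cluster = [{} for _ in self.centers]
        overall = {}
        for x, label in zip(X, y):
            by_cluster[nearest(self.centers, x)].setdefault(label, []).append(x)
            overall.setdefault(label, []).append(x)
        self.local = [{lab: mean(pts) for lab, pts in b.items()} for b in by_cluster]
        # Hypothetical fallback if a cluster received no training samples.
        self.fallback = {lab: mean(pts) for lab, pts in overall.items()}
        return self

    def predict_one(self, x):
        # Test phase: find the nearest cluster, then use its classifier.
        model = self.local[nearest(self.centers, x)] or self.fallback
        return min(model, key=lambda lab: sq_dist(model[lab], x))


def make_toy_data():
    """Two-mode toy problem: each class has one mode near x=0 and one near x=10."""
    X, y = [], []
    for dx in (-0.3, 0.0, 0.3):
        for dy in (-0.3, 0.0, 0.3):
            X += [(dx, dy), (dx, 2 + dy), (10 + dx, 2 + dy), (10 + dx, dy)]
            y += [0, 1, 0, 1]
    return X, y


X, y = make_toy_data()
model = ClusterThenClassify(k=2).fit(X, y)
queries = [(0.1, 0.2), (0.1, 1.9), (10.1, 1.9), (9.9, 0.1)]
print([model.predict_one(q) for q in queries])  # prints [0, 1, 0, 1]
```

In this toy data the two global class means both sit at roughly (5, 1), so a single nearest-class-mean classifier cannot separate the classes, whereas each local classifier only has to separate two well-spaced modes. The number of clusters `k` is a free parameter of the sketch and would need to be chosen for real data.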
Keywords: Distributed classifiers, classifier ensembles, subspace classification, distributed learning, complex systems