An Evaluation of Sampling Methods for Data Mining with Fuzzy C-Means
Using fuzzy c-means as the data-mining tool, this study evaluates how effectively sampling methods produce the knowledge of interest. Effectiveness is measured by the representativeness of the sampled data and by the accuracy and errors of the sampled data sets when subjected to the fuzzy clustering algorithm. Two population data sets from the weld inspection domain were used for the evaluation. Based on the results obtained, a number of observations are made.
Keywords: Data Mining, False Positive Rate, False Negative Rate, Fuzzy Cluster Algorithm, Statistical Test Result
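To make the kind of comparison described in the abstract concrete, the following is a minimal sketch in Python. It is not the study's procedure: the abstract names neither the sampling methods nor the exact measures used, so this example assumes simple random sampling without replacement, a plain NumPy implementation of fuzzy c-means, and synthetic two-cluster data standing in for the weld inspection populations. The function name `fuzzy_c_means` and all parameter values are illustrative assumptions.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means (illustrative sketch).

    Returns (centers, U): centers has shape (c, n_features) and
    U has shape (n_samples, c), with each row summing to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers: membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic "population": two well-separated Gaussian clusters (assumed data).
rng = np.random.default_rng(42)
population = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(500, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(500, 2)),
])

# Simple random sample of 10% of the population, without replacement.
idx = rng.choice(population.shape[0], size=100, replace=False)
sample = population[idx]

pop_centers, pop_U = fuzzy_c_means(population, c=2)
smp_centers, smp_U = fuzzy_c_means(sample, c=2)

# Cluster labels are arbitrary, so order centers by first coordinate
# before comparing the sample result with the population result.
pop_sorted = pop_centers[np.argsort(pop_centers[:, 0])]
smp_sorted = smp_centers[np.argsort(smp_centers[:, 0])]
center_shift = np.linalg.norm(pop_sorted - smp_sorted, axis=1)
print("Per-cluster center shift (sample vs. population):", center_shift)
```

A small center shift suggests that the sample leads the clustering algorithm to roughly the same structure as the full population. Where true class labels are available, the same setup could be extended to compare accuracy, false positive rate, and false negative rate between sample and population.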