Analyzing XploRe Download Profiles with Intelligent Miner
This paper is an example of data mining in action. The database we mine contains 1085 profiles of individuals who have downloaded the statistical software XploRe. Each profile contains the responses to an online questionnaire consisting of questions about such things as an individual’s computing preferences (operating system, favorite statistical software) or professional affiliation. After formatting and cleaning the raw data using MS Excel, we use IBM’s Intelligent Miner to perform a cluster analysis of the download profiles. We try to identify a small number of “types” of users by employing a clustering algorithm based on the New Condorcet Criterion, which is particularly well suited to our all-categorical data. The mining run yields three clusters, which we refer to as Academia, Unix/Linux users, and Researchers. Based on the characteristics of the cluster members, we briefly outline how the results of the data analysis may be used for targeted marketing of XploRe.
Keywords: Data Mining, Cluster Analysis
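To illustrate why a Condorcet-style criterion suits all-categorical data, the following minimal sketch scores a partition by counting attribute agreements and disagreements between pairs of records. It assumes one common formulation of the criterion (pairs in the same cluster contribute agreements minus disagreements, pairs in different clusters contribute the reverse); it is not Intelligent Miner's implementation. The function names and the four example profiles are hypothetical and are not taken from the XploRe data.

```python
from itertools import combinations


def agreement(a, b):
    """Number of categorical attributes on which two records take the same value."""
    return sum(1 for x, y in zip(a, b) if x == y)


def condorcet_score(records, labels):
    """Score a partition under an assumed Condorcet-style criterion.

    Pairs placed in the same cluster are rewarded for agreeing attributes
    and penalized for disagreeing ones; pairs placed in different clusters
    are scored the other way round. Higher scores indicate a better partition.
    """
    m = len(records[0])  # number of attributes per record
    score = 0
    for i, j in combinations(range(len(records)), 2):
        agree = agreement(records[i], records[j])
        disagree = m - agree
        if labels[i] == labels[j]:
            score += agree - disagree
        else:
            score += disagree - agree
    return score


# Hypothetical profiles: (operating system, affiliation, favorite software).
profiles = [
    ("Unix", "university", "S-Plus"),
    ("Unix", "university", "S-Plus"),
    ("Windows", "company", "SAS"),
    ("Windows", "company", "SPSS"),
]

good = condorcet_score(profiles, [0, 0, 1, 1])  # similar profiles grouped together
bad = condorcet_score(profiles, [0, 1, 0, 1])   # similar profiles split apart
print(good, bad)
```

Because the score depends only on whether attribute values match, no numeric distance between categories is required, and the number of clusters does not have to be fixed in advance: a search procedure can simply keep the partition with the highest score.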