Abstract
We investigate a number of measures associated with partitions. The first are congruence measures, which are used to calculate the similarity between two partitions. We provide a number of examples of this type of measure. Another class of measures we investigate is prognostication measures. These measures, closely related to a concept of containment between partitions, are useful in indicating how well knowledge of an object's class in one partition predicts its class in a second partition. Finally, we introduce a measure of the non-specificity of a partition. This measures a feature of a partition related to the generality of its constituent classes. A common task in machine learning is developing rules that allow us to predict the class of an object based upon the values of some features of the object. The more narrowly we categorize the features in the rules, the better we can predict an object's classification. Counterbalancing this, however, is the fact that too many narrow feature categories are difficult for human experts to manage cognitively. This introduces a fundamental issue in data mining. We show how the combined use of our measures of prognostication and non-specificity allows us to navigate this issue.
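To make the first two ideas concrete, here is a minimal Python sketch. It does not reproduce the chapter's own definitions, which the abstract does not state. As assumed stand-ins it uses the Rand index (pairwise agreement) for similarity between two partitions, and a majority-class score for how well an object's class in one partition predicts its class in another. The function names `rand_index` and `predictability` and the example labelings are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def rand_index(p, q):
    """Fraction of object pairs that both labelings treat the same way
    (together in both, or apart in both). Assumed stand-in for a
    similarity measure between two partitions."""
    pairs = list(combinations(range(len(p)), 2))
    agree = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agree / len(pairs)


def predictability(p, q):
    """Fraction of objects whose class in q is the most common q-class
    among objects sharing their class in p. Equals 1.0 when every class
    of p is contained in a single class of q. Assumed stand-in for a
    containment-based predictive measure."""
    groups = defaultdict(Counter)
    for a, b in zip(p, q):
        groups[a][b] += 1
    return sum(max(c.values()) for c in groups.values()) / len(p)


# Six objects labeled by a fine partition and by a coarser one.
fine = [0, 0, 1, 1, 2, 2]
coarse = ["a", "a", "a", "a", "b", "b"]

print(rand_index(fine, coarse))      # pairwise agreement of the two partitions
print(predictability(fine, coarse))  # fine classes determine coarse classes
print(predictability(coarse, fine))  # coarse classes predict fine ones less well
```

In this toy example the finer partition predicts the coarser one perfectly, while the reverse direction is weaker. That is one side of the trade-off the abstract describes: narrower categories predict better but are more numerous.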
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Yager, R.R. (2010). Partition Measures for Data Mining. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_15
DOI: https://doi.org/10.1007/978-3-642-05177-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05176-0
Online ISBN: 978-3-642-05177-7
eBook Packages: Engineering, Engineering (R0)