Heuristic Measures of Interestingness

  • Robert J. Hilderman
  • Howard J. Hamilton
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 638)

Abstract

The tuples in a generalized relation (i.e., a summary generated from a database) are unique and can therefore be considered a population whose structure is described by a frequency or probability distribution over the values of the derived Count attribute. In this chapter, we describe sixteen diversity measures that evaluate this distribution and assign each summary a single real-valued index representing its interestingness relative to other summaries generated from the same database. The measures are well-known measures of dispersion, dominance, inequality, and concentration that have been applied widely and successfully in the physical, social, ecological, management, information, and computer sciences. Their use for ranking summaries generated from databases is a new application area.
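As a rough illustration of the idea in the abstract, the sketch below computes two commonly used indices, the Shannon index and the Gini coefficient, from the Count values of two made-up summaries. The formulas are the standard textbook forms and may differ from the chapter's own definitions of its sixteen measures; the function names, the summary labels, and the count values are all hypothetical.

```python
import math


def proportions(counts):
    """Turn raw Count values into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def shannon_index(counts):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i))."""
    return -sum(p * math.log2(p) for p in proportions(counts) if p > 0)


def gini_coefficient(counts):
    """Gini coefficient via the mean absolute difference form:
    G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean), and n^2 * mean = n * total.
    """
    n = len(counts)
    total = sum(counts)
    if n == 0 or total <= 0:
        raise ValueError("counts must be non-empty with a positive sum")
    abs_diffs = sum(abs(a - b) for a in counts for b in counts)
    return abs_diffs / (2 * n * total)


# Hypothetical Count columns from two summaries of the same database.
summaries = {
    "uniform": [25, 25, 25, 25],
    "skewed": [97, 1, 1, 1],
}

for name, counts in summaries.items():
    print(name, round(shannon_index(counts), 4), round(gini_coefficient(counts), 4))
```

Each summary is reduced to a single real number per measure, so summaries can be sorted by that number. Whether a higher or lower value counts as more interesting depends on the measure and is not stated in the abstract.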

Keywords

Diversity Measure; Gini Coefficient; Shannon Index; Lorenz Curve; Proportional Distribution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 2001

Authors and Affiliations

  • Robert J. Hilderman (1)
  • Howard J. Hamilton (1)
  1. University of Regina, Canada
