An Axiomatic Approach to Defining Approximation Measures for Functional Dependencies

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2435))

Abstract

We consider the problem of defining an approximation measure for functional dependencies (FDs). An approximation measure for X → Y is a function mapping relation instances, r, to non-negative real numbers. Intuitively, the number to which r is mapped describes the “degree” to which the dependency X → Y holds in r. We develop a set of axioms for measures based on the following intuition: the degree to which X → Y is approximate in r is the degree to which r determines a function from π_X(r) to π_Y(r). The axioms apply to measures that depend only on frequencies, where the frequency of x ∈ π_X(r) is the number of tuples containing x divided by the total number of tuples. We prove that, up to a constant multiple, a unique measure satisfies these axioms, namely the information dependency measure of [5]. We do not argue that this result implies that the information dependency measure is the only reasonable frequency-based measure. However, an application designer who decides to use another measure must accept that it violates at least one of the axioms.
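As an illustration of a frequency-based measure, the sketch below computes the conditional entropy H(Y | X) = H(XY) − H(X) from tuple frequencies. Identifying this quantity with the information dependency measure of [5] is our reading of that paper, not something the abstract states. The function names, the list-of-dicts representation of r, and the base-2 logarithm are assumptions made for the example; a different base only changes the result by a constant multiple.

```python
from collections import Counter
from math import log2


def frequencies(rows, attrs):
    """Frequency of each value in the projection of rows onto attrs:
    number of tuples containing the value divided by the total number of tuples."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    total = len(rows)
    return {value: count / total for value, count in counts.items()}


def entropy(freqs):
    """Shannon entropy (base 2, an assumption) of a frequency distribution."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def conditional_entropy(rows, x_attrs, y_attrs):
    """H(Y | X) = H(XY) - H(X); zero when X -> Y holds exactly in rows."""
    h_xy = entropy(frequencies(rows, list(x_attrs) + list(y_attrs)))
    h_x = entropy(frequencies(rows, list(x_attrs)))
    return h_xy - h_x


# Hypothetical relation instances over attributes A, B, C.
exact = [
    {"A": 1, "B": "x", "C": 0},
    {"A": 1, "B": "x", "C": 1},
    {"A": 2, "B": "y", "C": 0},
    {"A": 2, "B": "y", "C": 1},
]
violated = [
    {"A": 1, "B": "x", "C": 0},
    {"A": 1, "B": "y", "C": 1},
    {"A": 2, "B": "y", "C": 0},
    {"A": 2, "B": "y", "C": 1},
]

print(conditional_entropy(exact, ["A"], ["B"]))
print(conditional_entropy(violated, ["A"], ["B"]))
```

In the first instance A → B holds, so r determines a function from π_A(r) to π_B(r) and the value is zero. In the second, the value A = 1 maps to two different B values, so the value is positive.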

Work supported by National Science Foundation grant IIS-0082407.

References

  1. Abiteboul S., Hull R., and Vianu V. Foundations of Databases. Addison-Wesley, Reading, Mass., 1995.

  2. Ash R. Information Theory. Interscience Publishers, John Wiley and Sons, New York, 1965.

  3. Cavallo R. and Pittarelli M. The Theory of Probabilistic Databases. In Proceedings 13th International Conference on Very Large Databases (VLDB), pages 71–81, 1987.

  4. Dalkilic M. Establishing the Foundations of Data Mining. PhD thesis, Indiana University, Bloomington, IN 47404, May 2000.

  5. Dalkilic M. and Robertson E. Information Dependencies. In Proceedings 19th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 245–253, 2000.

  6. De Bra P. and Paredaens J. An Algorithm for Horizontal Decompositions. Information Processing Letters, 17:91–95, 1983.

  7. Demetrovics J., Katona G.O.H., and Miklos D. Partial Dependencies in Relational Databases and Their Realization. Discrete Applied Mathematics, 40:127–138, 1992.

  8. Demetrovics J., Katona G.O.H., Miklos D., Seleznjev O., and Thalheim B. Asymptotic Properties of Keys and Functional Dependencies in Random Databases. Theoretical Computer Science, 40(2):151–166, 1998.

  9. Goodman L. and Kruskal W. Measures of Association for Cross Classifications. Journal of the American Statistical Association, 49:732–764, 1954.

  10. Hilderman R. and Hamilton H. Evaluation of Interestingness Measures for Ranking Discovered Knowledge. In Lecture Notes in Computer Science 2035 (Proceedings Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2001)), pages 247–259, 2001.

  11. Huhtala Y., Kärkkäinen J., Porkka P., and Toivonen H. TANE: An Efficient Algorithm for Discovering Functional and Approximate Dependencies. The Computer Journal, 42(2):100–111, 1999.

  12. Kantola M., Mannila H., Räihä K., and Siirtola H. Discovering Functional and Inclusion Dependencies in Relational Databases. International Journal of Intelligent Systems, 7:591–607, 1992.

  13. Kivinen J. and Mannila H. Approximate Inference of Functional Dependencies from Relations. Theoretical Computer Science, 149:129–149, 1995.

  14. Lee T. An Information-Theoretic Analysis of Relational Databases — Part I: Data Dependencies and Information Metric. IEEE Transactions on Software Engineering, SE-13(10):1049–1061, 1987.

  15. Lopes S., Petit J., and Lakhal L. Efficient Discovery of Functional Dependencies and Armstrong Relations. In Lecture Notes in Computer Science 1777 (Proceedings 7th International Conference on Extending Database Technology (EDBT)), pages 350–364, 2000.

  16. Malvestuto F. Statistical Treatment of the Information Content of a Database. Information Systems, 11(3):211–223, 1986.

  17. Mannila H. and Räihä K. Dependency Inference. In Proceedings 13th International Conference on Very Large Databases (VLDB), pages 155–158, 1987.

  18. Nambiar K. K. Some Analytic Tools for the Design of Relational Database Systems. In Proceedings 6th International Conference on Very Large Databases (VLDB), pages 417–428, 1980.

  19. Novelli N. and Cicchetti R. Functional and Embedded Dependency Inference: a Data Mining Point of View. Information Systems, 26:477–506, 2001.

  20. Piatetsky-Shapiro G. Probabilistic Data Dependencies. In Proceedings ML-92 Workshop on Machine Discovery, Aberdeen, UK, pages 11–17, 1992.

  21. Ramakrishnan R. and Gehrke J. Database Management Systems, Second Edition. McGraw-Hill Co., New York, 2000.

  22. Wyss C., Giannella C., and Robertson E. FastFDs: A Heuristic-Driven, Depth-First Algorithm for Mining Functional Dependencies from Relation Instances. In Lecture Notes in Computer Science 2114 (Proceedings 3rd International Conference on Data Warehousing and Knowledge Discovery (DaWaK)), pages 101–110, 2001.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Giannella, C. (2002). An Axiomatic Approach to Defining Approximation Measures for Functional Dependencies. In: Manolopoulos, Y., Návrat, P. (eds) Advances in Databases and Information Systems. ADBIS 2002. Lecture Notes in Computer Science, vol 2435. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45710-0_4

  • DOI: https://doi.org/10.1007/3-540-45710-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44138-0

  • Online ISBN: 978-3-540-45710-7

  • eBook Packages: Springer Book Archive
