Approximations in Database Systems
The need for approximate information has become critical in recent years. From traditional query optimization to newer functionality such as user feedback and knowledge discovery, data management systems require quick delivery of approximate data in order to serve their goals. Several techniques have been proposed to address this need, each with its own strengths and weaknesses. In this paper, we examine some of the most important data approximation problems, place them in a common framework, and identify their similarities and differences. We then point to some open and challenging problems that we believe are worth investigating.
Keywords: Database System, Data Element, Class Attribute, Semantic Distance, Decision Tree Method