Data Selection for Exact Value Acquisition to Improve Uncertain Clustering

  • Conference paper
Web-Age Information Management (WAIM 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6184)

Abstract

In recent years, data uncertainty has attracted wide attention from researchers because the amount of imprecise data is growing rapidly. Although such data are not known exactly, probability distributions or expected errors are sometimes available. Most research on uncertain data mining looks for methods that extract mining results directly from uncertain data, usually given as probability distributions or expected errors. It is also important, however, to lower the uncertainty itself by making part of the data more certain, so that better mining results can be obtained. For example, the input values of some sensors in a sensor network are usually recorded more frequently than others because they are more important or more likely to change. This paper explores the problem of selecting a part of the uncertain data and acquiring its exact values to improve clustering results. Under a general uncertainty model, we propose both global and localized data selection methods, which can be used together with any existing uncertain clustering algorithm. Experimental results show that clustering quality improves after selective exact value acquisition is applied.
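The abstract does not describe the selection methods themselves, so the following is only a minimal sketch of the general idea of budgeted exact value acquisition, not the authors' algorithm. It assumes each object is a one-dimensional value summarized by an expected value and an expected error, and it uses a simple hypothetical heuristic: probe the objects with the largest expected error. The names `UncertainObject`, `select_most_uncertain`, `acquire_exact`, and the `probe` callback are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class UncertainObject:
    """A 1-D uncertain value: expected value plus expected error (assumed model)."""
    mean: float
    std: float  # 0.0 means the exact value is known


def select_most_uncertain(objects: Sequence[UncertainObject], budget: int) -> List[int]:
    """Return indices of the `budget` objects with the largest expected error.

    Hypothetical global heuristic for illustration; not the paper's method.
    """
    ranked = sorted(range(len(objects)), key=lambda i: objects[i].std, reverse=True)
    return sorted(ranked[:max(budget, 0)])


def acquire_exact(
    objects: Sequence[UncertainObject],
    indices: Sequence[int],
    probe: Callable[[int], float],
) -> List[UncertainObject]:
    """Replace the selected objects with their exact values (error becomes 0)."""
    chosen = set(indices)
    return [
        UncertainObject(probe(i), 0.0) if i in chosen else obj
        for i, obj in enumerate(objects)
    ]


# Toy data: four uncertain readings and a lookup standing in for a real probe.
data = [
    UncertainObject(1.0, 0.1),
    UncertainObject(5.2, 2.0),
    UncertainObject(9.0, 0.3),
    UncertainObject(4.8, 1.5),
]
exact_values = {1: 8.7, 3: 1.4}  # made-up exact values for the probed objects

picked = select_most_uncertain(data, budget=2)
cleaned = acquire_exact(data, picked, lambda i: exact_values[i])
```

The list `cleaned` could then be passed to any uncertain clustering algorithm in place of `data`, which is the sense in which such a selection step is independent of the clustering method.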

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, YC., Yang, DN., Chen, MS. (2010). Data Selection for Exact Value Acquisition to Improve Uncertain Clustering. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds) Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14246-8_45

  • DOI: https://doi.org/10.1007/978-3-642-14246-8_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14245-1

  • Online ISBN: 978-3-642-14246-8

  • eBook Packages: Computer Science, Computer Science (R0)
