Data Selection for Exact Value Acquisition to Improve Uncertain Clustering

Lin, Yu-Chieh; Yang, De-Nian; Chen, Ming-Syan

doi:10.1007/978-3-642-14246-8_45

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6184)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

In recent years, data uncertainty has attracted wide attention from researchers because the amount of imprecise data is growing rapidly. Although such data are not known exactly, probability distributions or expected errors are sometimes available. Most research on uncertain data mining looks for methods to extract mining results from uncertain data, which usually take the form of probability distributions or expected errors. It is also important, however, to lower the data uncertainty by making part of the data more certain, which helps produce better mining results. For example, the input values of some sensors in a sensor network are usually designed to be recorded more frequently than others because they are more important or more likely to change. This paper explores the issue of selecting a part of the uncertain data and acquiring their exact values to improve clustering results. Under a general uncertainty model, we propose both global and localized data selection methods, which can be used together with any existing uncertain clustering algorithm. Experimental results show that the quality of clustering improves after selective exact value acquisition is applied.

Author information

Authors and Affiliations

Department of Electrical Engineering, National Taiwan University, Taiwan
Yu-Chieh Lin & Ming-Syan Chen
Institute of Information Science, Academia Sinica, Taiwan
De-Nian Yang
Research Center for Information Technology Innovation, Academia Sinica, Taiwan
Ming-Syan Chen

Editor information

Editors and Affiliations

Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
Lei Chen
Computer Department, Sichuan University, 610064, Chengdu, China
Changjie Tang
Department of Computer Science, Duke University, Box 90129, NC 27708-0129, Durham, USA
Jun Yang
College of Computer Science, Zhejiang University, 388 Yuhangtang Road, 310058, Hangzhou, China
Yunjun Gao

About this paper

Cite this paper

Lin, YC., Yang, DN., Chen, MS. (2010). Data Selection for Exact Value Acquisition to Improve Uncertain Clustering. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds) Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14246-8_45

DOI: https://doi.org/10.1007/978-3-642-14246-8_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14245-1
Online ISBN: 978-3-642-14246-8
eBook Packages: Computer Science, Computer Science (R0)
