Document Clustering

  • Reference work entry
Encyclopedia of Database Systems

Recommended Reading

  1. Boley D. Principal direction divisive partitioning. Data Mining Knowl Discov. 1998;2(4):325–44.

  2. Cutting DR, Pedersen JO, Karger DR, Tukey JW. Scatter/gather: a cluster-based approach to browsing large document collections. In: Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1992. p. 318–29.

  3. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc. 1977;39(1):1–38.

  4. Dhillon IS. Co-clustering documents and words using bipartite spectral graph partitioning. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2001. p. 269–74.

  5. Ding C, He X, Zha H, Gu M, Simon H. Spectral min-max cut for graph partitioning and data clustering. Technical Report TR-2001-XX, Lawrence Berkeley National Laboratory, University of California, Berkeley, 2001.

  6. Duda RO, Hart PE, Stork DG. Pattern classification. New York: Wiley; 2001.

  7. Fisher D. Iterative optimization and simplification of hierarchical clusterings. J Artif Intell Res. 1996;4(1):147–80.

  8. Jain AK, Dubes RC. Algorithms for clustering data. New York: Prentice Hall; 1988.

  9. Karypis G. Cluto: a clustering toolkit. Technical Report 02-017, Department of Computer Science, University of Minnesota, 2002.

  10. King B. Step-wise clustering procedures. J Am Stat Assoc. 1967;69(317):86–101.

  11. MacQueen J. Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Symposium on Mathematical Statistics and Probability; 1967. p. 281–97.

  12. Salton G. Automatic text processing: the transformation, analysis, and retrieval of information by computer. Reading: Addison-Wesley; 1989.

  13. Sneath PH, Sokal RR. Numerical taxonomy. London: Freeman; 1973.

  14. Zahn K. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans Comput. 1971;C-20(1):68–86.

  15. Zha H, He X, Ding C, Simon H, Gu M. Bipartite graph partitioning and data clustering. In: Proceedings of the International Conference on Information and Knowledge Management; 2001.

  16. Zhao Y, Karypis G. Criterion functions for document clustering: experiments and analysis. Mach Learn. 2004;55:311–31.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Zhao, Y., Karypis, G. (2018). Document Clustering. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1479
