Refining the Results of Automatic e-Textbook Construction by Clustering

  • Conference paper
Advances in Web-Based Learning – ICWL 2005 (ICWL 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3583)

Abstract

The abundance of knowledge-rich information on the World Wide Web makes compiling an online e-textbook both possible and necessary. The authors of [7] proposed an approach to automatically generate an e-textbook by mining the ranking lists of the search engine. However, the performance of the approach was degraded by Web pages that were relevant but not actually discussing the desired concept. In this paper, we extend the work in [7] by applying a clustering approach before the mining process. The clustering approach serves as a post-processing stage to the original results retrieved by the search engine, and aims to reach an optimum state in which all Web pages assigned to a concept are discussing that exact concept.
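
The abstract does not say which clustering algorithm or similarity measure is used, so the following is only a minimal sketch of the general idea: group the pages retrieved for a concept, then keep the group that best matches the concept. It assumes plain term-frequency vectors, cosine similarity, and a single-pass "leader" clustering as stand-ins; the function names, the threshold value, and the sample pages are all hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_cluster(vectors, threshold):
    """Single-pass clustering (an assumed stand-in, not the paper's method).

    Each vector joins the first cluster whose centroid is similar enough;
    otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, cluster["centroid"]) >= threshold:
                cluster["members"].append(i)
                # Summed counts act as the centroid; cosine ignores scale.
                cluster["centroid"].update(vec)
                break
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters


def refine(concept, pages, threshold=0.3):
    """Return indices of the pages in the cluster closest to the concept."""
    vectors = [Counter(tokenize(p)) for p in pages]
    clusters = leader_cluster(vectors, threshold)
    if not clusters:
        return []
    query = Counter(tokenize(concept))
    best = max(clusters, key=lambda c: cosine(query, c["centroid"]))
    return best["members"]


# Hypothetical search results for the concept "binary tree".
pages = [
    "binary tree node left child right child",
    "binary tree traversal visits every node",
    "oak tree pine forest leaves green",
    "forest oak pine leaves fall",
]
print(refine("binary tree", pages))  # [0, 1]
```

In this toy run the two botany pages mention "tree" and would count as relevant to the query, but they fall into their own cluster and are dropped before any later mining step sees them.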

References

  1. Brin, S., Page, L.: The Anatomy of a Large-scale Hypertextual Web Search Engine. In: Proceedings of International Conference on World Wide Web (1998)

  2. Chakrabarti, S.: Mining the Web: Discovering Knowledge from Hypertext Data. Morgan Kaufmann Publishers, San Francisco (2002)

  3. Zamir, O., Etzioni, O.: Grouper: A Dynamic Clustering Interface to Web Search Results. Computer Networks 31(11-16), 1361–1374 (1999)

  4. Zeng, H.-J., He, Q.-C., Chen, Z., Ma, W.-Y.: Learning To Cluster Web Search Results. In: Proceedings of the 27th annual international conference on research and development in information retrieval (SIGIR 2004), Sheffield, United Kingdom, pp. 210–217 (July 2004)

  5. Ferragina, P., Gullí, A.: The Anatomy of a Hierarchical Clustering Engine for Web-page, News and Book Snippets. In: Perner, P. (ed.) ICDM 2004. LNCS (LNAI), vol. 3275, pp. 395–398. Springer, Heidelberg (2004)

  6. Vivisimo, http://vivisimo.com/html/index

  7. Chen, J., Li, Q., Wang, L., Jia, W.: Automatically Generating an e-Textbook on the Web. In: Liu, W., Shi, Y., Li, Q. (eds.) ICWL 2004. LNCS, vol. 3143, pp. 35–42. Springer, Heidelberg (2004)

  8. Liu, B., Chin, C.-W., Ng, H.-T.: Mining Topic-specific Concepts and Definitions on the Web. In: Proceedings of International Conference on World Wide Web, pp. 251–260 (2003)

  9. Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw Hill, New York (1983)

  10. Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing 18(6), 1245–1262 (1989)

  11. Wang, Y., DeWitt, D.J., Cai, J.-y.: X-Diff: An Effective Change Detection Algorithm for XML Documents. In: ICDE 2003, pp. 519–530 (2003)

  12. Nierman, A., Jagadish, H.V.: Evaluating Structural Similarity in XML Documents. In: WebDB 2002, pp. 61–66 (2002)

  13. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: An Efficient k-Means Clustering Algorithm: Analysis and Implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7), 881–892 (2002)

  14. de Castro Reis, D., Golgher, P.B., da Silva, A.S., Laender, A.H.F.: Automatic web news extraction using tree edit distance. In: WWW 2004, pp. 502–511 (2004)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, J., Li, Q., Feng, L. (2005). Refining the Results of Automatic e-Textbook Construction by Clustering. In: Lau, R.W.H., Li, Q., Cheung, R., Liu, W. (eds) Advances in Web-Based Learning – ICWL 2005. ICWL 2005. Lecture Notes in Computer Science, vol 3583. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11528043_31

  • DOI: https://doi.org/10.1007/11528043_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27895-5

  • Online ISBN: 978-3-540-31716-6

  • eBook Packages: Computer Science, Computer Science (R0)
