Abstract
To address the incomplete coverage of traditional automatic summarization, this paper proposes a query-based method for automatic multi-document summary extraction from web pages. The key technique is the calculation of a sentence importance weight, which combines several factors to score each candidate sentence in the retrieval results. These factors are the weight of the word segmentation results, the structural characteristics of the sentence, the sentence length, and the mutual information of the search terms. Based on this method, the paper describes the automatic summarization process. Comparative experiments show that the method achieves higher Precision and Recall in summary extraction than the methods it is compared against.
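As a rough illustration of how the four factors named above could be combined into one sentence score, here is a minimal Python sketch. It is not the authors' formula: the abstract gives no weights or definitions, so the linear combination, the weight values in `DEFAULT_WEIGHTS`, the regex tokenizer (standing in for a word segmenter), the punctuation-based structure feature, the length window, and all function names are assumptions made for illustration. A small helper for Precision and Recall over extracted sentences is included as well.

```python
import math
import re
from itertools import combinations

# Hypothetical factor weights: (query-term share, structure, length, mutual information).
DEFAULT_WEIGHTS = (0.4, 0.2, 0.1, 0.3)


def tokens(text):
    """Lowercased word tokens; a stand-in for a real word segmenter."""
    return re.findall(r"\w+", text.lower())


def pmi(a, b, sentence_sets):
    """Pointwise mutual information of two terms, estimated from
    sentence-level co-occurrence and floored at 0."""
    n = len(sentence_sets)
    p_a = sum(a in s for s in sentence_sets) / n
    p_b = sum(b in s for s in sentence_sets) / n
    p_ab = sum(a in s and b in s for s in sentence_sets) / n
    if p_ab == 0:
        return 0.0
    return max(0.0, math.log(p_ab / (p_a * p_b)))


def score_sentences(sentences, query, weights=DEFAULT_WEIGHTS, length_window=(5, 30)):
    """Return one importance score per sentence as a weighted sum of four features."""
    token_lists = [tokens(s) for s in sentences]
    sentence_sets = [set(t) for t in token_lists]
    query_terms = set(tokens(query))
    low, high = length_window
    scores = []
    for sentence, toks, tok_set in zip(sentences, token_lists, sentence_sets):
        if not toks:
            scores.append(0.0)
            continue
        # Assumed segmentation weight: share of tokens that are query terms.
        term = sum(t in query_terms for t in toks) / len(toks)
        # Assumed structure feature: declarative sentences score higher.
        structure = 1.0 if sentence.rstrip().endswith(".") else 0.5
        # Assumed length feature: full credit inside the window, half outside.
        length = 1.0 if low <= len(toks) <= high else 0.5
        # Mean PMI over pairs of query terms that occur in this sentence.
        pairs = list(combinations(sorted(query_terms & tok_set), 2))
        mi = sum(pmi(a, b, sentence_sets) for a, b in pairs) / len(pairs) if pairs else 0.0
        w_term, w_structure, w_length, w_mi = weights
        scores.append(w_term * term + w_structure * structure + w_length * length + w_mi * mi)
    return scores


def summarize(sentences, query, k=1):
    """Pick the k highest-scoring sentences, returned in their original order."""
    scores = score_sentences(sentences, query)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def precision_recall(extracted, reference):
    """Precision and Recall of extracted sentences against a reference summary."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

Calling `summarize(sentences, query, k)` on the sentences of the retrieved pages returns the `k` top-scoring sentences in document order, and `precision_recall` compares that selection with a human-chosen reference set. The linear form keeps each factor's contribution easy to inspect; any real use would need the weights tuned on labelled data.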
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this paper
Cite this paper
He, Q., Hao, HW., Yin, XC. (2012). Query-Based Automatic Multi-document Summarization Extraction Method for Web Pages. In: Gaol, F., Nguyen, Q. (eds) Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science. Advances in Intelligent and Soft Computing, vol 144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28314-7_15
DOI: https://doi.org/10.1007/978-3-642-28314-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28313-0
Online ISBN: 978-3-642-28314-7
eBook Packages: Engineering, Engineering (R0)