Abstract
We describe a method for extractive summarization of legal judgments using our own graph-based summarization algorithm. In contrast to the connected, undirected graphs of previous work, we construct a directed, disconnected graph (a set of connected subgraphs) for each document, where each connected subgraph represents a cluster of sentences that share one topic in the document. Our method automatically chooses the number of representative sentences while preserving coherence, so the desired compression rate need not be specified a priori. We also propose our own node- and edge-weighting scheme for the graph. Furthermore, we do not depend on expensive hand-crafted linguistic features or resources. Our experimental results show that our method outperforms previous clustering-based methods, including those that use TF*IDF-based and centroid-based sentence selection, as well as previous machine learning methods that exploit a variety of linguistic features.
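The abstract does not give the graph construction or the weighting scheme, so the following is only a minimal sketch of the general idea (topic clusters as connected components, one representative sentence per cluster, summary length determined by the number of clusters). Everything specific in it is an assumption made for illustration and is not the authors' method: the Jaccard word-overlap similarity, the `threshold` parameter, the choice to direct edges from earlier to later sentences, the summed-edge-weight node score, and the decision to skip single-sentence components.

```python
from collections import defaultdict

def tokenize(sentence):
    # Lowercased word set with surrounding punctuation stripped (stand-in for real preprocessing).
    return {w.strip(".,;:()").lower() for w in sentence.split()} - {""}

def jaccard(a, b):
    # Word-overlap similarity; an assumed stand-in for the paper's edge weights.
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_graph(sentences, threshold=0.3):
    # Directed edge i -> j (earlier to later sentence) when similarity reaches the threshold.
    tokens = [tokenize(s) for s in sentences]
    edges = {}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            w = jaccard(tokens[i], tokens[j])
            if w >= threshold:
                edges[(i, j)] = w
    return edges

def weak_components(n, edges):
    # Union-find over the edges, ignoring direction: each component is one topic cluster.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = defaultdict(list)
    for v in range(n):
        groups[find(v)].append(v)
    return sorted(groups.values())

def summarize(sentences, threshold=0.3):
    edges = build_graph(sentences, threshold)
    # Assumed node score: sum of the weights of all edges touching the node.
    strength = defaultdict(float)
    for (i, j), w in edges.items():
        strength[i] += w
        strength[j] += w
    picks = []
    for comp in weak_components(len(sentences), edges):
        if len(comp) < 2:
            continue  # assumption: an isolated sentence does not form a topic cluster
        # Highest score wins; ties go to the earliest sentence.
        picks.append(max(comp, key=lambda v: (strength[v], -v)))
    return [sentences[i] for i in sorted(picks)]

doc = [
    "The tenant failed to pay rent.",
    "The tenant failed to pay rent for three months.",
    "The court awarded damages to the landlord.",
    "Damages awarded to the landlord were substantial.",
    "Costs follow.",
]
summary = summarize(doc, threshold=0.3)
print(summary)
```

In this sketch the summary length is never passed in: it falls out of how many multi-sentence components the threshold produces, which is the sense in which a compression rate does not have to be fixed in advance.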
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, MY., Xu, Y., Goebel, R. (2013). Summarization of Legal Texts with High Cohesion and Automatic Compression Rate. In: Motomura, Y., Butler, A., Bekki, D. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2012. Lecture Notes in Computer Science, vol 7856. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39931-2_14
DOI: https://doi.org/10.1007/978-3-642-39931-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39930-5
Online ISBN: 978-3-642-39931-2
eBook Packages: Computer Science (R0)