Summarization of Legal Texts with High Cohesion and Automatic Compression Rate

Kim, Mi-Young; Xu, Ying; Goebel, Randy

doi:10.1007/978-3-642-39931-2_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7856)

Included in the following conference series:

JSAI International Symposium on Artificial Intelligence

Abstract

We describe a method for extractive summarization of legal judgments using our own graph-based summarization algorithm. In contrast to the connected, undirected graphs of previous work, we construct a directed, disconnected graph (a set of connected graphs) for each document, where each connected graph corresponds to a cluster that shares one topic in the document. Our method automatically chooses the number of representative sentences needed for a coherent summary, so the desired compression rate does not have to be provided a priori. We also propose our own node- and edge-weighting scheme for the graph. Furthermore, we do not depend on expensive hand-crafted linguistic features or resources. Our experimental results show that our method outperforms previous clustering-based methods, including those that use TF*IDF-based and centroid-based sentence selection, as well as previous machine learning methods that exploit a variety of linguistic features.
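
The idea that a disconnected graph fixes the summary length on its own can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify their node/edge weights, so the cosine similarity over word counts, the small stopword list, the 0.3 threshold, the earlier-to-later edge direction, and the "highest total edge weight" selection rule are all assumptions made for illustration, as are the function names.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "was", "by", "with", "under", "within"}

def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_graph(sentences, threshold=0.3):
    """Directed edges (i, j), i < j, from an earlier to a later sentence.

    Assumption: an edge exists only when similarity reaches the threshold,
    so unrelated sentences stay in separate components.
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    edges = {}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            weight = cosine(vectors[i], vectors[j])
            if weight >= threshold:
                edges[(i, j)] = weight
    return edges

def components(n, edges):
    """Weakly connected components of the graph, found with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())

def summarize(sentences, threshold=0.3):
    """Pick one representative sentence per component, in document order."""
    edges = build_graph(sentences, threshold)
    strength = defaultdict(float)
    for (i, j), weight in edges.items():
        strength[i] += weight
        strength[j] += weight
    # Assumed rule: highest total edge weight wins; ties go to the earlier sentence.
    picks = [max(group, key=lambda k: (strength[k], -k))
             for group in components(len(sentences), edges)]
    return [sentences[k] for k in sorted(picks)]

sentences = [
    "The court dismissed the appeal.",
    "The appeal was dismissed by the court with costs.",
    "The contract required payment within thirty days.",
    "Payment under the contract was thirty days late.",
]
summary = summarize(sentences)
```

In this sketch the number of extracted sentences equals the number of components, so no compression rate is passed in. Raising or lowering the threshold changes how many components form, which is the only knob that affects summary length here.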

Author information

Authors and Affiliations

Department of Computing Science, University of Alberta, AB, T6G 2E8, Canada
Mi-Young Kim, Ying Xu & Randy Goebel

Editor information

Editors and Affiliations

National Institute of Advanced Industrial Science and Technology (AIST), Japan
Yoichi Motomura
Tohoku University, Kawauchi 41, 980-8576, Aoba-ku, Sendai, Japan
Alastair Butler
Ochanomizu University, 2-1-1 Bunkyo-ku, 122-8610, Tokyo, Japan
Daisuke Bekki

About this paper

Cite this paper

Kim, MY., Xu, Y., Goebel, R. (2013). Summarization of Legal Texts with High Cohesion and Automatic Compression Rate. In: Motomura, Y., Butler, A., Bekki, D. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2012. Lecture Notes in Computer Science, vol 7856. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39931-2_14

DOI: https://doi.org/10.1007/978-3-642-39931-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39930-5
Online ISBN: 978-3-642-39931-2
eBook Packages: Computer Science, Computer Science (R0)
