Collaborative Multi-agent System for Automatic Linear Text Segmentation

Perotto, Filipo Studzinski

doi:10.1007/978-3-031-21203-1_35

Filipo Studzinski Perotto (ORCID: orcid.org/0000-0003-2283-4703)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13753)

Included in the following conference series:

International Conference on Principles and Practice of Multi-Agent Systems

Abstract

This paper proposes a collaborative multi-agent system that splits documents into semantically coherent text chunks and labels them according to a given segmentation structure. Diverse linear text segmentation methods can be incorporated into the system by introducing new agents, which makes it possible to combine complementary approaches: domain-specific, supervised, and unsupervised. The system must be supplied with a representative set of previously segmented documents from the target corpus, which are used both to train the supervised agents and to evaluate every agent within the system, as in ensemble methods. The accuracy of each agent determines its weight in a subsequent aggregation phase, in which a common solution is agreed on. The proposed approach showed promising results when segmenting documents from a juridical corpus.
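
The evaluate-then-aggregate idea described in the abstract can be illustrated with a minimal sketch. It assumes that every agent assigns one label per text unit and that the aggregation phase is a weighted plurality vote; the abstract does not specify the actual aggregation protocol, so this is not the paper's method. The agent names, the toy documents, and the helpers `accuracy` and `aggregate` are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical agent type: maps a document (a sequence of text units)
# to one segment label per unit.
Agent = Callable[[Sequence[str]], List[str]]
LabeledDoc = Tuple[Sequence[str], List[str]]


def accuracy(agent: Agent, labeled_docs: Sequence[LabeledDoc]) -> float:
    """Fraction of units the agent labels correctly on the reference set."""
    correct = total = 0
    for units, gold in labeled_docs:
        pred = agent(units)
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total if total else 0.0


def aggregate(agents: Dict[str, Agent], weights: Dict[str, float],
              units: Sequence[str]) -> List[str]:
    """Assumed aggregation rule: weighted plurality vote for each unit."""
    predictions = {name: agent(units) for name, agent in agents.items()}
    result = []
    for i in range(len(units)):
        scores: Dict[str, float] = defaultdict(float)
        for name, pred in predictions.items():
            scores[pred[i]] += weights[name]
        result.append(max(scores, key=scores.get))
    return result


# Toy agents standing in for domain-specific, supervised, or unsupervised methods.
def keyword_agent(units: Sequence[str]) -> List[str]:
    return ["body" if u.startswith("Art.") else "header" for u in units]


def always_body(units: Sequence[str]) -> List[str]:
    return ["body"] * len(units)


def always_header(units: Sequence[str]) -> List[str]:
    return ["header"] * len(units)


agents: Dict[str, Agent] = {
    "keyword": keyword_agent,
    "always_body": always_body,
    "always_header": always_header,
}

# Previously segmented documents (invented toy data) used to score every agent.
labeled_docs: List[LabeledDoc] = [
    (["Court of X", "Art. 1 first clause", "Art. 2 second clause"],
     ["header", "body", "body"]),
]

weights = {name: accuracy(agent, labeled_docs) for name, agent in agents.items()}
segmentation = aggregate(agents, weights, ["Ruling no. 5", "Art. 3 third clause"])
print(weights)
print(segmentation)
```

In this sketch an agent that performs poorly on the reference set still votes, but with little influence; supervised agents would additionally be trained on the same reference documents before being scored.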

Notes

1. https://www.legifrance.gouv.fr/search/juri.

Author information

Authors and Affiliations

ONERA/DTIS, University of Toulouse, 31055, Toulouse, France
Filipo Studzinski Perotto

Corresponding author

Correspondence to Filipo Studzinski Perotto.

Editor information

Editors and Affiliations

Özyeğin University, Istanbul, Turkey
Reyhan Aydoğan
Universitat Politècnica de València, Valencia, Spain
Natalia Criado
Université Paris-Dauphine, Paris, France
Jérôme Lang
Universitat Politècnica de València, Valencia, Spain
Victor Sanchez-Anguix
King's College London, London, UK
Marc Serramia

About this paper

Cite this paper

Perotto, F.S. (2023). Collaborative Multi-agent System for Automatic Linear Text Segmentation. In: Aydoğan, R., Criado, N., Lang, J., Sanchez-Anguix, V., Serramia, M. (eds) PRIMA 2022: Principles and Practice of Multi-Agent Systems. PRIMA 2022. Lecture Notes in Computer Science, vol 13753. Springer, Cham. https://doi.org/10.1007/978-3-031-21203-1_35

DOI: https://doi.org/10.1007/978-3-031-21203-1_35
Published: 12 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21202-4
Online ISBN: 978-3-031-21203-1
eBook Packages: Computer Science, Computer Science (R0)
