Collaborative Multi-agent System for Automatic Linear Text Segmentation

  • Conference paper
PRIMA 2022: Principles and Practice of Multi-Agent Systems (PRIMA 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13753)

Abstract

This paper proposes a collaborative multi-agent system that splits documents into semantically coherent text chunks and labels them according to a given segmentation structure. Diverse linear text segmentation methods can be incorporated into the system by introducing new agents, which makes it possible to combine complementary approaches: domain-specific, supervised, and unsupervised. The system must be supplied with a representative set of previously segmented documents from the target corpus, which are used both to train the supervised agents and to evaluate every agent within the system, as in ensemble methods. The accuracy of each agent determines its weight in a subsequent aggregation phase, in which a common solution is agreed on. The proposed approach showed promising results when segmenting documents from a juridical corpus.
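The accuracy-weighted aggregation described in the abstract can be illustrated with a small sketch. The abstract does not specify the aggregation rule, so this example assumes a simple weighted plurality vote per sentence; the function names (`agent_accuracy`, `aggregate`, `boundaries`), the agent names, and the section labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def agent_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of sentences whose predicted section label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def aggregate(proposals: Dict[str, Sequence[str]],
              weights: Dict[str, float]) -> List[str]:
    """Weighted plurality vote per sentence (an assumed rule, not the paper's).

    Ties are broken in favour of the alphabetically first label.
    """
    lengths = {len(labels) for labels in proposals.values()}
    if len(lengths) != 1:
        raise ValueError("every agent must label the same number of sentences")
    n_sentences = lengths.pop()
    agreed = []
    for i in range(n_sentences):
        scores = defaultdict(float)
        for agent, labels in proposals.items():
            scores[labels[i]] += weights[agent]
        agreed.append(max(sorted(scores), key=scores.__getitem__))
    return agreed


def boundaries(labels: Sequence[str]) -> List[int]:
    """Indices at which a new segment starts (the label changes)."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Evaluation step: score each agent on a previously segmented document.
reference = ["header", "facts", "facts", "decision"]
validation = {
    "rule_based":   ["header", "facts", "facts", "decision"],
    "supervised":   ["header", "facts", "decision", "decision"],
    "unsupervised": ["header", "header", "facts", "facts"],
}
weights = {name: agent_accuracy(pred, reference) for name, pred in validation.items()}

# Aggregation step: combine the agents' proposals for a new document.
proposals = {
    "rule_based":   ["header", "facts", "facts", "facts", "decision"],
    "supervised":   ["header", "facts", "facts", "decision", "decision"],
    "unsupervised": ["header", "header", "facts", "decision", "decision"],
}
agreed = aggregate(proposals, weights)
```

In this toy run the fourth sentence is labelled `decision` because the two less accurate agents together outweigh the most accurate one; other aggregation schemes (for example, voting on boundaries rather than labels) would be equally compatible with the abstract.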

Notes

  1. https://www.legifrance.gouv.fr/search/juri.

Author information

Corresponding author

Correspondence to Filipo Studzinski Perotto.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Perotto, F.S. (2023). Collaborative Multi-agent System for Automatic Linear Text Segmentation. In: Aydoğan, R., Criado, N., Lang, J., Sanchez-Anguix, V., Serramia, M. (eds) PRIMA 2022: Principles and Practice of Multi-Agent Systems. PRIMA 2022. Lecture Notes in Computer Science, vol 13753. Springer, Cham. https://doi.org/10.1007/978-3-031-21203-1_35

  • DOI: https://doi.org/10.1007/978-3-031-21203-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21202-4

  • Online ISBN: 978-3-031-21203-1

  • eBook Packages: Computer Science, Computer Science (R0)
