Text-Independent Speech Balloon Segmentation for Comics and Manga

Rigaud, Christophe; Burie, Jean-Christophe; Ogier, Jean-Marc

doi:10.1007/978-3-319-52159-6_10

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9657)

Included in the following conference series:

International Workshop on Graphics Recognition

Abstract

Comics and manga are among the most popular and familiar forms of graphic content around the world and play a major role in spreading a country's culture. Nowadays, massive digitization and digital-born materials allow page-by-page mobile reading, but we believe that other uses may emerge in the near future. In this paper, we focus on speech balloon segmentation, which is a key issue for text/graphic association in scanned and digital-born comic book images. Speech balloons are at the interface between text and comic characters; they inform the reader about speech tone and the position of the speakers. We present a generic and text-independent speech balloon segmentation method based on the color, shape and topological organization of connected components. The method has been evaluated at pixel level on two public datasets (eBDtheque and Manga109), and the F-measure results are 78.24% and 80.04% respectively.
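
As an illustration of what a pixel-level F-measure means, the following minimal sketch compares a predicted balloon mask with a ground-truth mask pixel by pixel. It assumes the standard definitions of precision, recall and their harmonic mean; the exact evaluation protocol of the paper is not described in this abstract, and the function name `pixel_f_measure` and the toy masks are hypothetical.

```python
from typing import Sequence, Tuple

# A mask is a list of rows; each pixel is 1 (balloon) or 0 (background).
Mask = Sequence[Sequence[int]]


def pixel_f_measure(predicted: Mask, ground_truth: Mask) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) counted over individual pixels.

    Hypothetical helper using the standard definitions; not the authors' code.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("masks must have the same number of rows")

    tp = fp = fn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        if len(pred_row) != len(gt_row):
            raise ValueError("mask rows must have the same length")
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1  # balloon pixel correctly segmented
            elif p:
                fp += 1  # background pixel labelled as balloon
            elif g:
                fn += 1  # balloon pixel that was missed

    # Guard against empty masks to avoid division by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy 3x4 masks (made-up data, for illustration only).
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]

precision, recall, f_measure = pixel_f_measure(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
# prints "precision=0.80 recall=0.80 F=0.80"
```

In the toy example, four balloon pixels are found correctly, one background pixel is wrongly labelled and one balloon pixel is missed, so precision, recall and F-measure are all 0.80. Counting pixels rather than whole balloons penalizes a region in proportion to how much of its area is wrong.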

Notes

1. Milton Griepp’s White Paper, ICv2 Conference 2014.
2. http://www.j-comi.jp/.
3. https://github.com/crigaud/publication/tree/master/2016/LNCS/text-independent_speech_balloon_segmentation_for_comics_and_manga.

Acknowledgment

This work was supported by the University of La Rochelle (France), the town of La Rochelle and the PIA-iiBD (“Programme d’Investissements d’Avenir”). We are grateful to all authors and publishers of the comics and manga images in the eBDtheque and Manga109 datasets for allowing us to use their works.

Author information

Authors and Affiliations

Laboratoire L3i, Université de La Rochelle, Avenue Michel Crépeau, 17042, La Rochelle, France
Christophe Rigaud, Jean-Christophe Burie & Jean-Marc Ogier

Corresponding author

Correspondence to Christophe Rigaud.

Editor information

Editors and Affiliations

Université de Lorraine, Vandoeuvre-lès-Nancy, France
Bart Lamiroy
Universidade Federal de Pernambuco, Recife, Pernambuco, Brazil
Rafael Dueire Lins

About this paper

Cite this paper

Rigaud, C., Burie, JC., Ogier, JM. (2017). Text-Independent Speech Balloon Segmentation for Comics and Manga. In: Lamiroy, B., Dueire Lins, R. (eds) Graphic Recognition. Current Trends and Challenges. GREC 2015. Lecture Notes in Computer Science, vol 9657. Springer, Cham. https://doi.org/10.1007/978-3-319-52159-6_10

DOI: https://doi.org/10.1007/978-3-319-52159-6_10
Published: 08 January 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52158-9
Online ISBN: 978-3-319-52159-6
eBook Packages: Computer Science, Computer Science (R0)
