A Multimodal Approach to Relevance and Pertinence of Documents

Cristani, Matteo; Tomazzoli, Claudio

doi:10.1007/978-3-319-42007-3_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9799)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

Automated document classification extracts information through systematic analysis of document content. It is an active research field of growing importance, owing to the large number of electronic documents produced on the World Wide Web and made readily available by widespread technologies, including mobile ones. Several application areas benefit from automated document classification, including document archiving, invoice processing in business environments, press releases, and search engines. Current tools classify or “tag” text and images separately. In this paper we show how linking image-based and text-based content improves fundamental document management tasks such as retrieving information from a database or automatically routing documents. We present formal definitions of pertinence and relevance that apply to the document types we call “multimodal”. These definitions are based on a model of conceptual spaces that we believe is essential for investigating documents when using joint information sources, drawn from the text and images that form complex documents.

Author information

Authors and Affiliations

University of Verona, Verona, Italy
Matteo Cristani & Claudio Tomazzoli

Authors

Matteo Cristani
Claudio Tomazzoli

Corresponding authors

Correspondence to Matteo Cristani or Claudio Tomazzoli.

Editor information

Editors and Affiliations

Iwate Prefectural University, Iwate, Japan
Hamido Fujita
Department of Computer Science, Texas State University, San Marcos, Texas, USA
Moonis Ali
Universiti Teknologi Malaysia (UTM), Bahru, Malaysia
Ali Selamat
Iwate Prefectural University, Iwate, Japan
Jun Sasaki
Iwate Prefectural University, Iwate, Japan
Masaki Kurematsu

About this paper

Cite this paper

Cristani, M., Tomazzoli, C. (2016). A Multimodal Approach to Relevance and Pertinence of Documents. In: Fujita, H., Ali, M., Selamat, A., Sasaki, J., Kurematsu, M. (eds) Trends in Applied Knowledge-Based Systems and Data Science. IEA/AIE 2016. Lecture Notes in Computer Science, vol 9799. Springer, Cham. https://doi.org/10.1007/978-3-319-42007-3_14

DOI: https://doi.org/10.1007/978-3-319-42007-3_14
Published: 14 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42006-6
Online ISBN: 978-3-319-42007-3
eBook Packages: Computer Science, Computer Science (R0)
