Mapping Heterogeneous Textual Data: A Multidimensional Approach Based on Spatiality and Theme

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11938)

Abstract

In this paper, we propose a multidimensional mapping approach for heterogeneous textual data that exploits first the spatial dimension and then the thematic dimension. Building on the Spatial Textual Representation (STR) and the Geodict geographic database, our contribution integrates the thematic dimension of documents. To support our proposal for mapping textual documents, we evaluate the different aspects of the process on two real corpora, one of which is highly heterogeneous.

Notes

  1. Geodict is available at this address: http://dx.doi.org/10.18167/DVN1/MWQQOQ.

  2. Group of morphemes or words that follow each other with a specific meaning.

  3. https://dicoagroecologie.fr/en/.

  4. http://www.culture.gouv.fr/Thematiques/Langue-francaise-et-langues-de-France/Politiques-de-la-langue/Enrichissement-de-la-langue-francaise/FranceTerme/Vocabulaire-du-developpement-durable-2015.

Author information

Corresponding author

Correspondence to Jacques Fize.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Fize, J., Roche, M., Teisseire, M. (2019). Mapping Heterogeneous Textual Data: A Multidimensional Approach Based on Spatiality and Theme. In: El Yacoubi, S., Bagnoli, F., Pacini, G. (eds) Internet Science. INSCI 2019. Lecture Notes in Computer Science, vol 11938. Springer, Cham. https://doi.org/10.1007/978-3-030-34770-3_25

  • DOI: https://doi.org/10.1007/978-3-030-34770-3_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34769-7

  • Online ISBN: 978-3-030-34770-3

  • eBook Packages: Computer Science, Computer Science (R0)
