TaLTaC 3.0. A Multi-level Web Platform for Textual Big Data in the Social Sciences

  • Conference paper
Abstract

The TaLTaC software package, a tool for lexical and textual analysis, was in use in versions 1.0 and 2.0 over the last decades (1999–2015) and now appears to have reached its technological limits. TaLTaC version 3.0 (hereafter T3) has been redesigned to overcome those limits. The redesign included: (i) recoding all internal software components with modern web-related languages and standards; (ii) adopting a new kind of database (NoSQL) capable of handling corpora on the order of gigabytes; (iii) new criteria for data storage and data processing. The software architecture is modular and decouples user interaction from the actual data computation. The two main components are the GUI (graphical user interface), based on HTML5/CSS/JS, and the back-end processing CORE. The new design also makes it possible to run T3 on the mainstream operating systems: OS X, Windows, and Linux. From a single parsing operation, T3 produces many vocabularies for multi-level lexical analysis. This allows one to disambiguate, in a semiautomatic fashion, between the different graphical forms in the text on the basis of concordances. It also allows for a virtual transformation of simple forms into multi-words.
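To make the last three ideas concrete, the following is a minimal Python sketch of how one parsing pass could feed several vocabularies, a keyword-in-context concordance, and a non-destructive ("virtual") merge of simple forms into multi-words. It is an illustration only and is not T3's implementation: the function names, the two-word limit, the `MULTIWORDS` list, and the choice of three vocabulary levels are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical multi-word list for illustration; not taken from TaLTaC.
MULTIWORDS = {("big", "data"), ("social", "sciences")}


def tokenize(text):
    """Split text into graphical forms (word tokens), keeping original case."""
    return re.findall(r"\w+", text)


def merge_multiwords(tokens, multiwords):
    """Return a new token list in which known two-word sequences are one unit.

    The input list is not modified, so the transformation is "virtual":
    the simple forms remain available alongside the merged view.
    """
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if len(pair) == 2 and pair in multiwords:
            merged.append("_".join(tokens[i:i + 2]))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def build_vocabularies(text, multiwords=MULTIWORDS):
    """Tokenize once, then derive several frequency vocabularies (assumed levels)."""
    tokens = tokenize(text)
    return {
        "forms": Counter(tokens),                       # case-sensitive forms
        "lowercase": Counter(t.lower() for t in tokens),  # case-folded forms
        "multiwords": Counter(merge_multiwords(tokens, multiwords)),
    }


def concordance(text, keyword, width=2):
    """Return (left context, match, right context) for each keyword occurrence."""
    tokens = tokenize(text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


if __name__ == "__main__":
    sample = "Big data in the social sciences. Big Data tools."
    for name, vocab in build_vocabularies(sample).items():
        print(name, dict(vocab))
    print(concordance(sample, "data", width=1))
```

In this sketch, the concordance rows are what an analyst would inspect to decide whether two graphical forms (for example `data` and `Data`) should be treated as the same entry, which is the semiautomatic step the abstract refers to.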

Notes

  1. http://www.dss.uniroma1.it/en/node/6365 (2016). Accessed 30 Apr 2016.

  2. http://github.com/electron/electron. Accessed November 2016.

Author information

Corresponding author

Correspondence to Sergio Bolasco.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Bolasco, S., De Gasperis, G. (2017). TaLTaC 3.0. A Multi-level Web Platform for Textual Big Data in the Social Sciences. In: Lauro, N., Amaturo, E., Grassia, M., Aragona, B., Marino, M. (eds) Data Science and Social Research. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-55477-8_9
