Abstract
In this paper we deal with distances for fuzzy strings in \([0,1]^n\), to be used in distance-based linguistic classification. We start from the fuzzy Hamming distance, anticipated by the linguist Muljačić back in 1967, and from the taxicab distance. Both generalize the usual crisp Hamming distance: the former uses the standard logical operations of minimum for conjunction and maximum for disjunction, while the latter uses Łukasiewicz T-norms and T-conorms. We resort to the Steinhaus transform, a powerful tool which allows one to deal with linguistic data which are not only fuzzy, but possibly also irrelevant or logically inconsistent. Experimental results on actual data are shown and given a preliminary discussion.
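The two distances named in the abstract, and a Steinhaus transform of either, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that each coordinate contributes the fuzzy exclusive-or of the two membership values, with negation taken as 1 - a, and that the transform has the textbook form 2d(x,y) / (d(x,p) + d(y,p) + d(x,y)) for a pivot string p. The function names and the pivot used in the example are hypothetical choices made only for this sketch.

```python
from typing import Callable, Sequence

FuzzyString = Sequence[float]
Distance = Callable[[FuzzyString, FuzzyString], float]


def fuzzy_hamming(x: FuzzyString, y: FuzzyString) -> float:
    """Sum over coordinates of (a AND NOT b) OR (NOT a AND b),
    with AND = min, OR = max, NOT a = 1 - a."""
    return sum(max(min(a, 1 - b), min(1 - a, b)) for a, b in zip(x, y))


def taxicab(x: FuzzyString, y: FuzzyString) -> float:
    """Sum of |a - b|. The same exclusive-or evaluated with the
    Lukasiewicz T-norm max(0, a + b - 1) and T-conorm min(1, a + b)
    reduces to this expression coordinate by coordinate."""
    return sum(abs(a - b) for a, b in zip(x, y))


def steinhaus(d: Distance, x: FuzzyString, y: FuzzyString,
              pivot: FuzzyString) -> float:
    """Steinhaus transform of d with respect to a pivot string.
    The pivot is an assumption of this sketch, not taken from the paper."""
    dxy = d(x, y)
    denom = d(x, pivot) + d(y, pivot) + dxy
    # Convention: return 0 when all three distances vanish.
    return 0.0 if denom == 0 else 2 * dxy / denom


if __name__ == "__main__":
    x, y = [1, 0, 1], [1, 1, 0]          # crisp strings
    pivot = [0.5, 0.5, 0.5]              # illustrative pivot only
    print(fuzzy_hamming(x, y), taxicab(x, y))
    print(steinhaus(taxicab, x, y, pivot))
```

On crisp strings both functions return the ordinary Hamming distance. They differ on genuinely fuzzy values: under the min/max definition, a string with a coordinate strictly between 0 and 1 has a nonzero distance from itself, whereas the taxicab distance of a string from itself is always 0.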
References
Bortolussi, L., Sgarro, A., Longobardi, G., Guardiano, C.: How many possible languages are there? In: Biology, Computation and Linguistics, pp. 168–179. IOS Press, Amsterdam (2011). https://doi.org/10.3233/978-1-60750-762-8-168
Deza, M.M., Deza, E.: Dictionary of Distances. Elsevier B.V., New York City (2006)
Dubois, D., Prade, H.: Fundamentals of Fuzzy Sets. Kluwer Academic Publishers, Dordrecht (2000)
Franzoi, L., Sgarro, A.: Fuzzy Hamming distinguishability. In: IEEE International Conference on Fuzzy Systems, FUZZ-IEEE, pp. 1–6 (2017). https://doi.org/10.1109/FUZZ-IEEE.2017.8015434
Franzoi, L., Sgarro, A.: Linguistic classification: T-norms, fuzzy distances and fuzzy distinguishabilities. Proc. Comput. Sci. 112, 1168–1177 (2017). https://doi.org/10.1016/j.procs.2017.08.163. KES
Franzoi, L.: Jaccard-like fuzzy distances for computational linguistics. In: Presented at SYNASC 2017 (2017, in press)
Longobardi, G., Ceolin, A., Bortolussi, L., Guardiano, C., Irimia, M.A., Michelioudakis, D., Radkevich, N., Sgarro, A.: Mathematical modeling of grammatical diversity supports the historical reality of formal syntax. In: Proceedings of the Leiden Workshop on Capturing Phylogenetic Algorithms for Linguistics, pp. 1–4. University of Tübingen, Tübingen DEU (2016). https://doi.org/10.15496/publikation-10122
Longobardi, G., Guardiano, C., Silvestri, G., Boattini, A., Ceolin, A.: Toward a syntactic phylogeny of modern Indo-European languages. J. Hist. Linguist. 3(11), 122–152 (2013). https://doi.org/10.1075/bct.75.07lon
Longobardi, G., Ghirotto, S., Guardiano, C., Tassi, F., Benazzo, A., Ceolin, A., Barbujani, G.: Across language families: genome diversity mirrors language variation within Europe. Am. J. Phys. Anthropol. 157, 630–640 (2015). https://doi.org/10.1002/ajpa.22758
Muljačić, Z.: Die Klassifikation der romanischen Sprachen. Rom. J. Buch. XVIII, 23–37 (1967)
Sgarro, A.: A fuzzy Hamming distance. Bull. Math. de la Soc. Sci. Math. de la R. S. de Romanie 69(1–2), 137–144 (1977)
Zadeh, L.: Fuzzy sets. Inf. Control 8(3), 338–353 (1965)
Steinhaus transforms of fuzzy string distances in computational linguistics (Support material). goo.gl/3p1sY7
Acknowledgment
Authors A. Dinu and L. P. Dinu are supported by the HerCoRe project (no. 91970), funded by the Volkswagen Foundation; L. Franzoi and A. Sgarro are with the INdAM research group GNCS.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Dinu, A., Dinu, L.P., Franzoi, L., Sgarro, A. (2018). Steinhaus Transforms of Fuzzy String Distances in Computational Linguistics. In: Medina, J., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations. IPMU 2018. Communications in Computer and Information Science, vol 853. Springer, Cham. https://doi.org/10.1007/978-3-319-91473-2_15
DOI: https://doi.org/10.1007/978-3-319-91473-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91472-5
Online ISBN: 978-3-319-91473-2
eBook Packages: Computer Science, Computer Science (R0)