Abstract
Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics have often overlooked their role; in particular, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models in IR, one has to lift the constraint of real-valued representations of the information space and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR that explicitly uses complex-valued Hilbert spaces, in which terms, documents and queries are represented as complex-valued vectors. The proposal integrates distributional semantics evidence within the real component of a term vector, whereas ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested on the TREC Medical Records task of retrieving cohorts for clinical studies.
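The representation described in the abstract can be illustrated with a minimal sketch. It assumes that the distributional and ontological evidence for a term are already available as two real-valued feature lists of equal length, and it places the first in the real part and the second in the imaginary part of each vector component. The function names, the toy values, and the scoring rule (squared magnitude of the Hermitian inner product of normalised vectors, a common choice in quantum-like models) are assumptions made for illustration; the abstract does not state which scoring function the paper uses.

```python
import math


def term_vector(distributional, ontological):
    """Pack two real-valued feature lists into one complex-valued vector.

    Real part: distributional evidence. Imaginary part: ontological evidence.
    (Hypothetical helper; the feature values themselves are assumed given.)
    """
    if len(distributional) != len(ontological):
        raise ValueError("both evidence sources must have the same dimensionality")
    return [complex(d, o) for d, o in zip(distributional, ontological)]


def inner(u, v):
    """Hermitian inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def normalise(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v] if norm else list(v)


def similarity(u, v):
    """Assumed scoring rule: |<u|v>|^2 computed on the normalised vectors."""
    return abs(inner(normalise(u), normalise(v))) ** 2


# Toy values, invented purely for illustration.
query = term_vector([0.9, 0.1, 0.0], [0.2, 0.0, 0.7])
document = term_vector([0.8, 0.2, 0.1], [0.1, 0.0, 0.6])
score = similarity(query, document)
```

Under this assumed scoring rule the score always lies between 0 and 1, and both kinds of evidence contribute to it through a single inner product rather than through two separately weighted similarity values.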
Notes
1. Specifically, concepts from the SNOMED-CT medical ontology [25].
References
van Rijsbergen, C.J.: The Geometry of Information Retrieval. Cambridge University Press, New York (2004)
Song, D., Lalmas, M., van Rijsbergen, K., Frommholz, I., Piwowarski, B., Wang, J., Zhang, P., Zuccon, G., Bruza, P.D., Arafat, S., et al.: How quantum theory is developing the field of information retrieval. In: Proceedings of QI, Arlington, VA, USA, pp. 105–108, November 2010
Zuccon, G., Azzopardi, L.: Using the quantum probability ranking principle to rank interdependent documents. In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., van Rijsbergen, K. (eds.) ECIR 2010. LNCS, vol. 5993, pp. 357–369. Springer, Heidelberg (2010)
Zuccon, G., Piwowarski, B., Azzopardi, L.: On the use of complex numbers in quantum models for information retrieval. In: Amati, G., Crestani, F. (eds.) ICTIR 2011. LNCS, vol. 6931, pp. 346–350. Springer, Heidelberg (2011)
Deerwester, S., Dumais, S., Furnas, G., Landauer, T., Harshman, R.: Indexing by latent semantic analysis. JASIST 41(6), 391–407 (1990)
Lund, K., Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Res. Meth. Instrum. Comput. 28, 203–208 (1996)
Kanerva, P., Kristofersson, J., Holst, A.: Random indexing of text samples for latent semantic analysis. In: Proceedings of CogSci., vol. 1036, Philadelphia, PA, USA (2000)
Karlgren, J., Sahlgren, M.: From words to understanding. In: Uesaka, Y., Kanerva, P., Asoh, H. (eds.) Foundations of Real-World Intelligence, pp. 294–308. CSLI Publications, Stanford (2001)
Sahlgren, M.: The Word-Space Model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. Ph.D. thesis, Institutionen för lingvistik. Department of Linguistics, Stockholm University (2006)
Symonds, M., Bruza, P., Sitbon, L., Turner, I.: Modelling word meaning using efficient tensor representations. In: Proceedings of PacLic., pp. 313–322, November 2011
Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
Koopman, B., Bruza, P., Sitbon, L., Lawley, M.: Towards semantic search and inference in electronic medical records: an approach using concept-based information retrieval. AMJ 9, 482–488 (2012)
Wittgenstein, L.: Philosophical Investigations. Blackwell Publishing, Oxford (1967)
Harris, Z.: Distributional structure. In: Harris, Z. (ed.) Papers in Structural and Transformational Linguistics. Formal Linguistics, pp. 775–794. Humanities Press, New York (1970)
Firth, J.R.: Papers in Linguistics 1934–1951. Oxford University Press, London (1957)
Bloomfield, L.: Language. Holt, Rinehart and Winston, New York (1933)
Morris, C.W.: Signs, Language and Behavior. Prentice Hall, New York (1946)
von Uexküll, J.: The theory of meaning. Semiotica 42(1), 25–82 (1982)
Peirce, C.: Logic as semiotic: the theory of signs. In: Peirce, C., Buchler, J. (eds.) Philosophical Writings of Peirce, pp. 98–119. Dover Publications, New York (1955)
Frege, G.: Sense and reference. Philos. Rev. 57(3), 209–230 (1948)
Sahlgren, M.: An introduction to random indexing. In: Proceedings of TKE, Copenhagen, Denmark (2005)
Widdows, D., Ferraro, K.: Semantic vectors: a scalable open source package and online technology management application. In: Proceedings LREC, Marrakech, Morocco, May 2008
Koopman, B., Zuccon, G., Bruza, P., Sitbon, L., Lawley, M.: Graph-based concept weighting for medical information retrieval. In: Proceedings of ADCS, Dunedin, New Zealand, pp. 80–87, December 2012
Zuccon, G., Koopman, B., Nguyen, A., Vickers, D., Butt, L.: Exploiting medical hierarchies for concept-based information retrieval. In: Proceedings of ADCS, Dunedin, New Zealand, pp. 111–114, December 2012
Spackman, K.: SNOMED Clinical Terms Basics. International Health Terminology Standards Development Organisation Technical report, August 2008
Aronson, A.R., Lang, F.M.: An overview of MetaMap: historical perspective and recent advances. JAMIA 17(3), 229–236 (2010)
Wu, S.T., Liu, H., Li, D., Tao, C., Musen, M.A., Chute, C.G., Shah, N.H.: Unified medical language system term occurrences in clinical notes: a large-scale corpus analysis. JAMIA 19(e1), e149–e156 (2012)
Wittek, P., Tan, C.L.: Compactly supported basis functions as support vector kernels for classification. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 2039–2050 (2011)
Widdows, D., Cohen, T.: Real, complex, and binary semantic vectors. In: Busemeyer, J.R., Dubois, F., Lambert-Mogiliansky, A., Melucci, M. (eds.) QI 2012. LNCS, vol. 7620, pp. 24–35. Springer, Heidelberg (2012)
Voorhees, E., Tong, R.: Overview of the TREC Medical Records Track. In: Proceedings of TREC, Gaithersburg, MD, USA, November 2011
Wu, S., Masanz, J., Ravikumar, K., Liu, H.: Three questions about clinical information retrieval. In: Proceedings of TREC, Gaithersburg, MD, USA, November 2012
Aerts, D., Czachor, M.: Quantum aspects of semantic analysis and symbolic artificial intelligence. J. Phys. A: Math. Gen. 37, L123–L132 (2004)
Bruza, P., Kitto, K., Ramm, B., Sitbon, L., Song, D., Blomberg, S.: Quantum-like non-separability of concept combinations, emergent associates and abduction. Logic J. IGPL 20(2), 445–457 (2012)
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wittek, P., Koopman, B., Zuccon, G., Darányi, S. (2014). Combining Word Semantics within Complex Hilbert Space for Information Retrieval. In: Atmanspacher, H., Haven, E., Kitto, K., Raine, D. (eds) Quantum Interaction. QI 2013. Lecture Notes in Computer Science, vol 8369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54943-4_14
DOI: https://doi.org/10.1007/978-3-642-54943-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54942-7
Online ISBN: 978-3-642-54943-4
eBook Packages: Computer Science (R0)