Use of Elliptic Curves in Term Discrimination
Detecting discriminant terms allows us to improve the performance of natural language processing systems. The goal is to estimate the contribution of each term in a given corpus and then to use the terms of high contribution to represent the corpus. In this paper we present various experiments that use elliptic curves to discover the discriminant terms of a given textual corpus. Different experiments led us to use the mean and variance of the corpus terms to determine the parameters of a reduced Weierstrass equation (an elliptic curve). We use the elliptic curves to visualize graphically the behavior of the corpus vocabulary. Thereafter, we use the elliptic curve parameters to cluster terms that share characteristics. These clusters are then used as discriminant terms to represent the original document collection. Finally, we evaluate all these corpus representations to determine the terms that best discriminate each document.
Keywords: Elliptic Curve, North American Free Trade Agreement, Textual Corpus, Elliptic Curve Cryptography
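As an illustration of the idea in the abstract, the following minimal sketch derives a pair of coefficients for the reduced Weierstrass equation y^2 = x^3 + ax + b from per-term statistics. The abstract does not say which statistic feeds which coefficient, so the mapping used here (a = mean frequency of a term across documents, b = its population variance) is an assumption, as are the function names and the toy corpus. The clustering and evaluation steps are not reproduced.

```python
import math
from statistics import mean, pvariance


def term_curve_parameters(doc_term_counts):
    """Map each term to hypothetical Weierstrass coefficients (a, b).

    Assumption (not stated in the source): a is the mean frequency of the
    term across documents and b is the population variance of that frequency.
    """
    vocabulary = sorted({term for doc in doc_term_counts for term in doc})
    params = {}
    for term in vocabulary:
        # A term missing from a document counts as frequency 0 there.
        freqs = [doc.get(term, 0) for doc in doc_term_counts]
        params[term] = (mean(freqs), pvariance(freqs))
    return params


def discriminant(a, b):
    """Discriminant of y^2 = x^3 + a*x + b; zero means the curve is singular."""
    return -16 * (4 * a**3 + 27 * b**2)


def curve_points(a, b, xs):
    """Upper-branch points (x, y) for the x values where x^3 + a*x + b >= 0."""
    points = []
    for x in xs:
        rhs = x**3 + a * x + b
        if rhs >= 0:
            points.append((x, math.sqrt(rhs)))
    return points


if __name__ == "__main__":
    # Toy corpus: one dict of raw term counts per document (made-up data).
    corpus = [
        {"trade": 2, "the": 5},
        {"the": 5},
        {"trade": 4, "the": 5},
    ]
    for term, (a, b) in term_curve_parameters(corpus).items():
        print(term, a, b, discriminant(a, b))
```

Under this assumed mapping, a term with identical frequency in every document has b = 0, while a term whose frequency varies across documents gets a nonzero b. Plotting `curve_points` for each term gives one possible way to compare the resulting curve shapes.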