Discriminant Analysis Using Textual Data

Lebart, Ludovic; Callant, Conchita

doi:10.1007/978-3-642-51175-2_68

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Summary

In text analysis, a wide range of statistical methods has been developed to solve problems such as authorship attribution, time determination, information retrieval, and the processing of responses to open questions in surveys. In analyses of this kind, the statistical methods applied have to produce discrimination models. The textual units to be analysed may be selected on the basis of features of form or of characteristics of content.

Author information

Authors and Affiliations

Centre National de la Recherche Scientifique, Télécom Paris, 46 rue Barrault, F-75634 Paris Cedex 13, France
Ludovic Lebart
Department of Research, Development and Statistical Methods, Eurostat - Statistical Office of the European Communities, L-2920 Luxembourg, Luxembourg
Conchita Callant

Editor information

Editors and Affiliations

Institut National de Recherche en Informatique et en Automatique (INRIA), F-75150, Rocquencourt, Le Chesnay, France
Edwin Diday & Yves Lechevallier
Universität Mannheim, Schloß, D-68131, Mannheim, Germany
Martin Schader (Lehrstuhl für Wirtschaftsinformatik III)
Université Paris IX Dauphine, Pl. du Maréchal de Lattre de Tassigny, F-75775, Paris Cedex 16, France
Patrice Bertrand
TELECOM-Paris, 46, rue Barrault, F-75013, Paris, France
Bernard Burtschy

Cite this paper

Lebart, L., Callant, C. (1994). Discriminant Analysis Using Textual Data. In: Diday, E., Lechevallier, Y., Schader, M., Bertrand, P., Burtschy, B. (eds) New Approaches in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-51175-2_68

DOI: https://doi.org/10.1007/978-3-642-51175-2_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58425-4
Online ISBN: 978-3-642-51175-2
eBook Packages: Springer Book Archive
