Discriminant Analysis Using Textual Data

  • Conference paper
New Approaches in Classification and Data Analysis

Summary

In text analysis, a wide range of statistical methods has been developed to solve problems such as authorship attribution, time determination, information retrieval, and the processing of responses to open-ended survey questions. In analyses of this kind, the statistical methods applied have to produce discrimination models. The textual units analyzed may be chosen according to features of form or characteristics of content.

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lebart, L., Callant, C. (1994). Discriminant Analysis Using Textual Data. In: Diday, E., Lechevallier, Y., Schader, M., Bertrand, P., Burtschy, B. (eds) New Approaches in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-51175-2_68

  • DOI: https://doi.org/10.1007/978-3-642-51175-2_68

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58425-4

  • Online ISBN: 978-3-642-51175-2

  • eBook Packages: Springer Book Archive