The lexical analysis of French

  • Conference paper
Electronic Dictionaries and Automata in Computational Linguistics (LITP 1987)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 377)

Abstract

The automatic linguistic analysis of texts requires basic information about the simple and compound words of the text. Lexical analysis is the preliminary step before syntactic analysis. We have shown that important linguistic problems appear during this basic step. Some of them cannot yet be solved (recognition of proper names, compound verbs, and so on); others, if solved during lexical analysis, facilitate syntactic analysis by reducing the degree of ambiguity of the text.

The lexical parser is based on a program (automaton) which could be used in more general cases in order to disambiguate some strings. For example, we have described the context of the string j'; in the same way, it would be possible to describe the context of the word je, which appears only in a limited number of schemas.
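As a rough illustration of what describing the context of j' with an automaton could look like, here is a minimal sketch. The abstract does not give the actual contexts or the automaton used in the paper, so the token classes, the state names, and the pattern itself (j' followed by an optional clitic en or y and then a word) are hypothetical simplifications chosen only for this example.

```python
import re

# Hypothetical simplification: a word that can follow the elided pronoun
# directly must start with a vowel or with h.
VOWEL_INITIAL = re.compile(r"^[aeiouyhàâéèêëîïôûù]", re.IGNORECASE)


def tokenize(text):
    """Split on word boundaries, keeping an ASCII apostrophe with the elided form."""
    return re.findall(r"\w+'|\w+", text)


def classify(token):
    """Map a token to one of the (assumed) input symbols of the automaton."""
    t = token.lower()
    if t == "j'":
        return "J"
    if t in ("en", "y"):
        return "CLITIC"
    if VOWEL_INITIAL.match(t):
        return "VWORD"
    return "OTHER"


# Transition table of a small deterministic automaton (illustrative only).
TRANSITIONS = {
    ("START", "J"): "AFTER_J",
    ("AFTER_J", "CLITIC"): "AFTER_CLITIC",
    ("AFTER_J", "VWORD"): "ACCEPT",
    ("AFTER_CLITIC", "CLITIC"): "ACCEPT",
    ("AFTER_CLITIC", "VWORD"): "ACCEPT",
    ("AFTER_CLITIC", "OTHER"): "ACCEPT",
}


def matches_j_context(tokens):
    """Return True if the token sequence starts with j' in an accepted context."""
    state = "START"
    for token in tokens:
        state = TRANSITIONS.get((state, classify(token)))
        if state is None:
            return False  # no transition: the context is rejected
        if state == "ACCEPT":
            return True
    return False  # input ended before an accepting state was reached
```

For instance, `matches_j_context(tokenize("j'en veux"))` follows START, AFTER_J, AFTER_CLITIC, ACCEPT, whereas `tokenize("j'veux")` is rejected because no transition leaves AFTER_J on a consonant-initial word. A table for je could be written in the same style.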

We have enumerated the problems that have to be solved in order to recognize words. Each of these problems is well known. What we have attempted here is to formulate them in such a form that they can be represented by finite automata and treated by the corresponding algorithms. It should be clear that the number of automata to be built, their size, and the formulation of their interaction are by no means trivial: a complex program is required simply to recognize the words of a text.
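To make the idea of recognizing simple and compound words with an automaton more concrete, the sketch below stores a toy lexicon in a trie (a tree-shaped deterministic automaton over words) and segments a token list by longest match. This is an assumption made for illustration: the abstract does not say how the paper's dictionaries are encoded, and the lexicon entries and the function names `build_trie` and `segment` are invented here.

```python
END = ""  # sentinel key marking that a lexicon entry ends at this node


def build_trie(entries):
    """Build a word-level trie from simple and compound lexicon entries."""
    root = {}
    for entry in entries:
        node = root
        for word in entry.split():
            node = node.setdefault(word, {})
        node[END] = entry
    return root


def segment(tokens, trie):
    """Greedy longest-match segmentation into lexical units.

    Tokens that start no lexicon entry are kept as single unknown units.
    """
    units = []
    i = 0
    while i < len(tokens):
        node = trie
        best_end = None
        j = i
        # Walk the trie as far as the input allows, remembering the last
        # position at which a complete entry was seen.
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if END in node:
                best_end = j
        if best_end is None:
            units.append(tokens[i])
            i += 1
        else:
            units.append(" ".join(tokens[i:best_end]))
            i = best_end
    return units


# Toy lexicon, invented for this example.
LEXICON = ["la", "pomme", "de", "terre", "pomme de terre"]
TRIE = build_trie(LEXICON)
```

With this lexicon, `segment("la pomme de terre".split(), TRIE)` groups the last three tokens into the single unit `pomme de terre`. Longest match deliberately discards the competing reading as three simple words; an analyzer that must report ambiguity would have to return every segmentation instead, which is one reason the interaction between automata is not trivial.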

Unit 819 of the CNRS. This work has been partly financed by the Programme de Recherches Coordonnées «Informatique Linguistique» of the Ministry of Research and Technology.

FIRTECH Industries de la langue française.

Editor information

Maurice Gross, Dominique Perrin

Copyright information

© 1989 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Silberztein, M. (1989). The lexical analysis of French. In: Gross, M., Perrin, D. (eds) Electronic Dictionaries and Automata in Computational Linguistics. LITP 1987. Lecture Notes in Computer Science, vol 377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-51465-1_7

  • DOI: https://doi.org/10.1007/3-540-51465-1_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-51465-7

  • Online ISBN: 978-3-540-48140-9

  • eBook Packages: Springer Book Archive
