Word segmentation based on database semantics in NChiql

Journal of Computer Science and Technology

Abstract

This paper presents a novel word-segmentation algorithm for delimiting words in Chinese natural language queries in NChiql, a Chinese natural language query interface to databases. Although there is a sizable literature on Chinese word segmentation, existing methods do not satisfy the particular requirements of this system. The algorithm is based on database semantics, namely the Semantic Conceptual Model (SCM), which captures domain-specific knowledge. Using the SCM, the segmenter labels words directly with their database semantics, which eases disambiguation and translation (from natural language to database query) in NChiql.
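The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of the general idea of a segmenter that attaches database-semantic labels to words as it cuts them. It uses plain forward maximum matching as a stand-in technique, not the method of the paper. The lexicon `SCM_LEXICON`, its label strings (`TABLE:`, `ATTR:`, `VALUE:`, and so on), and the function `segment` are all hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: surface word -> database-semantic label.
# In a real system these entries would come from the domain's
# semantic model; here they are invented for illustration.
SCM_LEXICON: Dict[str, str] = {
    "查询": "OP:select",
    "学生": "TABLE:student",
    "姓名": "ATTR:student.name",
    "年龄": "ATTR:student.age",
    "计算机系": "VALUE:student.dept",
    "的": "FUNC",
}


def segment(query: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Forward maximum matching that returns (word, semantic label) pairs.

    At each position the longest lexicon entry is taken; a character
    that matches no entry is emitted alone with the label "UNKNOWN".
    """
    max_len = max(map(len, lexicon), default=1)
    tokens: List[Tuple[str, str]] = []
    i = 0
    while i < len(query):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(query) - i), 0, -1):
            piece = query[i:i + size]
            if piece in lexicon:
                tokens.append((piece, lexicon[piece]))
                i += size
                break
        else:
            # No lexicon entry starts here: fall back to one character.
            tokens.append((query[i], "UNKNOWN"))
            i += 1
    return tokens


if __name__ == "__main__":
    for word, label in segment("查询计算机系学生的姓名", SCM_LEXICON):
        print(word, label)
```

Because each token already carries a label such as a table, attribute, or value reference, a later translation step could map tokens to query fragments without a separate lookup, which is the kind of benefit the abstract attributes to labelling during segmentation.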

Author information

Corresponding author

Correspondence to Meng Xiaofeng.

Additional information

This work is supported by the National Natural Science Foundation of China under Grant No. 69633020.

MENG Xiaofeng is an associate professor at the School of Information, Renmin University of China. He obtained his M.S. degree from Renmin University of China in 1993 and his Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences, in 1999. His research interests include database systems, natural language interfaces, mobile and embedded software, and Web applications.

LIU Shuang is a Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences. She obtained her M.S. degree from Renmin University of China in 1999. Her research interests include database systems.

WANG Shan is a professor and dean of the School of Information, Renmin University of China. She obtained her M.S. degree from Renmin University of China in 1982. Her research interests include database systems, data warehousing and data mining, and information systems.

About this article

Cite this article

Meng, X., Liu, S. & Wang, S. Word segmentation based on database semantics in NChiql. J. Comput. Sci. & Technol. 15, 346–354 (2000). https://doi.org/10.1007/BF02948870

  • DOI: https://doi.org/10.1007/BF02948870
