Text Segmentation Model Based LDA and Ontology for Question Answering in Agriculture

  • Conference paper
Proceedings of 2013 World Agricultural Outlook Conference

Abstract

Question answering over text collections has long been a research focus in information technology. A central problem for such collections is how to model the text and segment it. Latent Dirichlet Allocation (LDA), an approach to building topic models from a formal generative model of documents, is heavily cited in the machine learning literature, but its feasibility and effectiveness in information retrieval are largely unknown. In this paper, we study how to use LDA efficiently to improve answer retrieval. We propose an LDA-based segmentation model within the language modelling framework and evaluate it on text segmentation collections for agricultural cultivation. Gibbs sampling is used for approximate inference in LDA, and its computational complexity is analyzed. We also present the process of generating answers in the agricultural cultivation question answering system (ACQA_onto). We demonstrate LDA's improved expressiveness over a traditional QA system based on information retrieval, with visualizations of answer accuracy.
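
The abstract names Gibbs sampling as the approximate inference method for LDA but gives no details here. As a rough illustration only, the following is a minimal sketch of standard collapsed Gibbs sampling for LDA, not the authors' implementation. The function name `lda_gibbs`, the hyperparameter defaults, and the toy word ids are all assumptions made for this example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              sweeps=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids in
    range(vocab_size). alpha and beta are symmetric Dirichlet priors;
    their default values are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # count of topic k in doc d
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi, assign


# Hypothetical toy corpus: word ids 0-2 and 3-5 stand in for two crop vocabularies.
toy_docs = [[0, 1, 2, 0], [3, 4, 5, 3], [0, 2, 1]]
theta, phi, assign = lda_gibbs(toy_docs, num_topics=2, vocab_size=6, sweeps=50)
```

In this sketch, `theta[d]` is the estimated topic mixture of document `d` and `phi[k]` is the estimated word distribution of topic `k`. Each sweep visits every token once and evaluates every topic for it, so one sweep costs time proportional to the number of tokens multiplied by the number of topics.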

Acknowledgements

This work is a sub-project of Key Technologies for Agricultural Field Information Comprehensive Sensing and Rural Extension (Item No. 2011BAD21B01) under the National Science and Technology Support Program.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hu, D., Wang, W., Liu, S., Xie, N., Yin, G. (2014). Text Segmentation Model Based LDA and Ontology for Question Answering in Agriculture. In: Xu, S. (eds) Proceedings of 2013 World Agricultural Outlook Conference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54389-0_27

  • DOI: https://doi.org/10.1007/978-3-642-54389-0_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54388-3

  • Online ISBN: 978-3-642-54389-0

  • eBook Packages: Business and Economics, Economics and Finance (R0)
