Abstract
Question answering systems built on text collections have been a research focus in information technology. A central problem for such collections is how to construct models of text and its segmentation. Latent Dirichlet Allocation (LDA), an approach to building topic models based on a formal generative model of documents, is heavily cited in the machine learning literature, but its feasibility and effectiveness in information retrieval are mostly unknown. In this paper, we study how to use LDA efficiently to improve answer retrieval. We propose an LDA-based segmentation model within the language modelling framework and evaluate it on text segmentation collections for agriculture cultivation. Gibbs sampling is employed to conduct approximate inference in LDA, and its computational complexity is analyzed. The process of generating answers in the agriculture cultivation question answering system (ACQA_onto) is presented. We demonstrate LDA’s improved expressiveness over a traditional QA system based on information retrieval, with visualizations of answer accuracy.
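The abstract names Gibbs sampling as the approximate inference method for LDA. As a rough illustration only, the following is a minimal sketch of a standard collapsed Gibbs sampler for LDA. It is not the authors' implementation and does not reproduce their segmentation model. The function name `lda_gibbs`, the hyperparameter defaults `alpha` and `beta`, and the representation of documents as lists of integer word ids are all assumptions made for this example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids
    in range(vocab_size). Returns (theta, phi) as nested lists.
    """
    rng = random.Random(seed)
    # Count tables: document-topic, topic-word, and per-topic totals.
    n_dk = [[0] * num_topics for _ in docs]
    n_kw = [[0] * vocab_size for _ in range(num_topics)]
    n_k = [0] * num_topics

    # Assign a random initial topic to every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Full conditional p(z_i = t | all other assignments, words),
                # known only up to a normalizing constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Smoothed estimates of the document-topic and topic-word distributions.
    theta = [
        [(n_dk[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi


if __name__ == "__main__":
    # Toy corpus with hypothetical word ids 0..5.
    toy_docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 5], [0, 1, 2, 2], [3, 5, 4, 4]]
    theta, phi = lda_gibbs(toy_docs, num_topics=2, vocab_size=6, iters=50)
    for row in theta:
        print([round(p, 3) for p in row])
```

In this sketch each sweep visits every token once and evaluates one weight per topic, so the cost of a sweep grows with the number of tokens times the number of topics.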
Acknowledgements
This work is a sub-project of "Key Technologies for Agricultural Field Information Comprehensive Sensing and Rural Extension" (Item No. 2011BAD21B01) under the National Science and Technology Support Program.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hu, D., Wang, W., Liu, S., Xie, N., Yin, G. (2014). Text Segmentation Model Based LDA and Ontology for Question Answering in Agriculture. In: Xu, S. (eds) Proceedings of 2013 World Agricultural Outlook Conference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54389-0_27
DOI: https://doi.org/10.1007/978-3-642-54389-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54388-3
Online ISBN: 978-3-642-54389-0
eBook Packages: Business and Economics; Economics and Finance (R0)