Question Classification Based on Hadoop Platform

Qi, XiangXiang; Su, Lei; Yang, Bin; Chen, Jun; Li, Yiyang; Liu, Junhui

doi:10.1007/978-3-319-16050-4_10

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 142)

Abstract

Statistical supervised learning models for question classification need a large number of labeled training examples. However, labeled data are difficult to collect, whereas unlabeled data are readily obtained. To address the lack of labeled data, we use transfer learning to build the learning model from both labeled and unlabeled training examples. A common space is built from the feature spaces of the source and target domains. Then, those examples from the source domain whose conditional probability is likely to be similar to that of the target domain are selected into the common space. The question classifier is therefore trained on the labeled data in the source domain and the unlabeled data in the target domain. Meanwhile, Map/Reduce on the Hadoop platform is used to reduce the time complexity of the kernel mapping: the mapping process is divided into subtasks, and the final result is obtained by assembling their outputs. Experiments on question classification show that the proposed method can improve classification accuracy. Furthermore, the learning model based on the Hadoop platform can draw on all available computing resources to reduce the running time.
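As a rough illustration of the subtask idea described in the abstract, the following sketch splits a kernel-matrix computation into row blocks (map), then assembles the partial results (reduce). It runs in a single Python process rather than on Hadoop, and the RBF kernel, the `gamma` value, the block size, and all function names are assumptions made for illustration only; the abstract does not specify them.

```python
import math
from functools import reduce


def rbf(x, y, gamma=0.5):
    """RBF kernel value (kernel choice and gamma are illustrative assumptions)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def split_rows(n, block):
    """Yield (start, stop) row ranges, one per subtask."""
    for start in range(0, n, block):
        yield start, min(start + block, n)


def map_block(task, data):
    """Map step: compute the kernel rows for one block of examples."""
    start, stop = task
    return {i: [rbf(data[i], y) for y in data] for i in range(start, stop)}


def reduce_blocks(acc, part):
    """Reduce step: merge one partial result into the accumulated rows."""
    acc.update(part)
    return acc


def kernel_matrix(data, block=2):
    """Assemble the full kernel matrix from independently computed row blocks."""
    parts = map(lambda task: map_block(task, data), split_rows(len(data), block))
    rows = reduce(reduce_blocks, parts, {})
    return [rows[i] for i in range(len(data))]


# Hypothetical toy feature vectors standing in for question representations.
data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
K = kernel_matrix(data, block=2)
```

Because each row block depends only on the input data and not on other blocks, the map calls could in principle be distributed across worker nodes, which is the property a Map/Reduce formulation relies on.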

Acknowledgment

This work was supported by the National Natural Science Foundation of China (No. 61365010), the Yunnan Nature Science Foundation (2011FZ069), and the Yunnan Province Department of Education Foundation (2011Y387).

Author information

Authors and Affiliations

School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650093, China
XiangXiang Qi, Lei Su, Bin Yang & Yiyang Li
School of Software, Yunnan University, Kunming, 650091, Yunnan, China
Jun Chen & Junhui Liu

Corresponding author

Correspondence to Lei Su.

Editor information

Editors and Affiliations

Electrical and Computer Engineering, The University of British Columbia, Vancouver, British Columbia, Canada
Victor C.M. Leung
Cofederal Networks Inc., Renton, Washington, USA
Roy Xiaorong Lai
School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
Min Chen
School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou, China
Jiafu Wan

About this paper

Cite this paper

Qi, X., Su, L., Yang, B., Chen, J., Li, Y., Liu, J. (2015). Question Classification Based on Hadoop Platform. In: Leung, V., Lai, R., Chen, M., Wan, J. (eds) Cloud Computing. CloudComp 2014. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 142. Springer, Cham. https://doi.org/10.1007/978-3-319-16050-4_10

DOI: https://doi.org/10.1007/978-3-319-16050-4_10
Published: 08 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16049-8
Online ISBN: 978-3-319-16050-4
eBook Packages: Computer Science (R0)
