A Set-Based Training Query Classification Approach for Twitter Search

Ma, Qingli; He, Ben; Xu, Jungang; Wang, Bin

doi:10.1007/978-3-319-39937-9_38

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9658)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Learning to rank is a popular technique for building a ranking model for Twitter search from a rich list of features. As most learning to rank algorithms are supervised, their effectiveness depends heavily on the quality of the labeled training data. Selecting high-quality training queries is therefore an important means of improving the effectiveness of the ranking model for Twitter search. The existing approach to this problem learns a query quality classifier that estimates training query quality on a per-query basis but ignores the dependence between queries. This paper proposes a set-based training query classification approach that estimates a training query’s quality by taking into consideration its usefulness in combination with other training queries. Evaluation on the standard TREC Microblog track test collection shows that the proposed approach yields effective retrieval performance.
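The contrast between per-query and set-based quality estimation can be illustrated with a small sketch. It is not the authors' classifier, whose details are not given in the abstract. It assumes a hypothetical `evaluate` callback that returns the retrieval effectiveness of a model trained on a given set of queries, and it approximates a query's set-based quality as its average marginal gain when added to random subsets of the other queries. The names `per_query_quality`, `set_based_quality`, `toy_evaluate`, and the toy coverage data are all invented for illustration.

```python
import random
from statistics import mean
from typing import Callable, Dict, FrozenSet, Sequence

# Hypothetical signature: maps a set of training queries to an effectiveness score.
Evaluate = Callable[[FrozenSet[str]], float]


def per_query_quality(queries: Sequence[str], evaluate: Evaluate) -> Dict[str, float]:
    """Score each query in isolation, ignoring all other queries."""
    return {q: evaluate(frozenset([q])) for q in queries}


def set_based_quality(
    queries: Sequence[str],
    evaluate: Evaluate,
    n_samples: int = 20,
    subset_size: int = 3,
    seed: int = 0,
) -> Dict[str, float]:
    """Score each query by its mean marginal gain over random subsets of the others.

    This is an illustrative stand-in for "usefulness in combination with
    other training queries", not the method of the paper.
    """
    rng = random.Random(seed)
    scores: Dict[str, float] = {}
    for q in queries:
        others = [x for x in queries if x != q]
        k = min(subset_size, len(others))
        gains = []
        for _ in range(n_samples):
            subset = frozenset(rng.sample(others, k))
            gains.append(evaluate(subset | {q}) - evaluate(subset))
        scores[q] = mean(gains)
    return scores


# Toy, made-up data: each query "covers" some aspects; q1 and q2 are redundant.
COVERAGE = {"q1": {"a", "b"}, "q2": {"a", "b"}, "q3": {"c"}, "q4": {"d"}}
QUERIES = list(COVERAGE)


def toy_evaluate(train_set: FrozenSet[str]) -> float:
    """Toy effectiveness: fraction of the four aspects covered by the set."""
    covered = set().union(*(COVERAGE[q] for q in train_set))
    return len(covered) / 4


per_query = per_query_quality(QUERIES, toy_evaluate)
set_based = set_based_quality(QUERIES, toy_evaluate)
```

In this toy setup, q1 and q2 score highest in isolation, yet each contributes nothing once the other is already in the training set, while q3 and q4 still add coverage. A per-query estimate cannot see that kind of dependence.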

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China (61472391) and the Beijing Natural Science Foundation (4142050).

Author information

Authors and Affiliations

University of Chinese Academy of Sciences, Beijing, China
Qingli Ma, Ben He, Jungang Xu & Bin Wang

Corresponding authors

Correspondence to Ben He or Jungang Xu.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Bin Cui
The George Washington University, Washington, D.C., USA
Nan Zhang
Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
Jianliang Xu
University of Texas Rio Grande Valley, Edinburg, Texas, USA
Xiang Lian
Jiangxi University of Finance and Economics, Nanchang, Jiangxi, China
Dexi Liu

About this paper

Cite this paper

Ma, Q., He, B., Xu, J., Wang, B. (2016). A Set-Based Training Query Classification Approach for Twitter Search. In: Cui, B., Zhang, N., Xu, J., Lian, X., Liu, D. (eds) Web-Age Information Management. WAIM 2016. Lecture Notes in Computer Science, vol 9658. Springer, Cham. https://doi.org/10.1007/978-3-319-39937-9_38

DOI: https://doi.org/10.1007/978-3-319-39937-9_38
Published: 28 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39936-2
Online ISBN: 978-3-319-39937-9
eBook Packages: Computer Science, Computer Science (R0)
