Abstract
We present a novel method for obtaining and ranking rephrased questions from crowds, for use as part of the instructions in microtask-based crowdsourcing. With our method, we obtain questions that differ in expression yet have the same semantics with respect to the crowdsourcing task. The method generates tasks that give workers a hint and elicit instructions from them. We conducted experiments with data used for a real set of gold standard questions submitted to a commercial crowdsourcing platform and compared the results with those of a direct-rewrite method. The results show that the extracted questions are semantically ranked with high precision, and we identify the cases in which each method is effective.
Notes
1. The items are actually associated with a hierarchical directory. We selected subcategories of “Food and Drink.” Each chosen category satisfies the following conditions: (1) it is not a top-level category, (2) it has at least three data items, and (3) the category name is not a composition of two or more different category names (such as “Beer and Wine”).
2. If more than three questions had the same support score, we included all of them.
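The tie rule in note 2 can be illustrated with a minimal sketch. It assumes one possible reading of the note: the three questions with the highest support scores are kept, and any further questions tied with the third one are kept as well. The function name `top_k_with_ties`, the cutoff `k = 3`, and the sample questions and scores are hypothetical placeholders, not data or code from the paper.

```python
from typing import List, Tuple


def top_k_with_ties(scored: List[Tuple[str, int]], k: int = 3) -> List[Tuple[str, int]]:
    """Return the k highest-scoring items plus any items tied with the k-th one.

    Assumption: this is one reading of note 2, not the authors' implementation.
    """
    if k <= 0 or not scored:
        return []
    # sorted() is stable, so items with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if len(ranked) <= k:
        return ranked
    cutoff = ranked[k - 1][1]  # support score of the k-th ranked question
    return [pair for pair in ranked if pair[1] >= cutoff]


# Hypothetical rephrased questions with invented support scores.
candidates = [
    ("Is this a kind of beer?", 5),
    ("Is this beer?", 4),
    ("Does this show beer?", 3),
    ("Is it a beer product?", 3),
    ("Is this wine?", 1),
]
selected = top_k_with_ties(candidates)
```

Here the third and fourth candidates share the same score, so both are kept and `selected` holds four questions rather than three.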
Acknowledgments
We are grateful to the project members of Yahoo! Crowdsourcing, including Masashi Nakagawa and Manabu Yamamoto. This work was supported in part by JSPS KAKENHI Grant Number 25240012.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Hayashi, R., Shimizu, N., Morishima, A. (2016). Obtaining Rephrased Microtask Questions from Crowds. In: Spiro, E., Ahn, YY. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol 10047. Springer, Cham. https://doi.org/10.1007/978-3-319-47874-6_23
DOI: https://doi.org/10.1007/978-3-319-47874-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47873-9
Online ISBN: 978-3-319-47874-6
eBook Packages: Computer Science, Computer Science (R0)