Obtaining Rephrased Microtask Questions from Crowds

  • Conference paper
  • First Online:
Social Informatics (SocInfo 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10047)

Included in the following conference series: SocInfo, International Conference on Social Informatics

Abstract

We present a novel method for obtaining and ranking rephrased questions from crowds, to be used as part of the instructions in microtask-based crowdsourcing. Using our method, we are able to obtain questions that differ in expression yet have the same semantics with respect to the crowdsourcing task. We do this by generating tasks that give workers a hint and elicit instructions from them. We conducted experiments with data from a real set of gold-standard questions submitted to a commercial crowdsourcing platform and compared the results with those of a direct-rewrite method. The results show that the extracted questions are semantically ranked with high precision, and we identify cases in which each method is effective.
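As a rough illustration of the ranking step described above, the following hypothetical Python sketch scores candidate rephrasings by how many workers submitted the same (lightly normalized) wording and lists the best-supported candidates first. The function name, the normalization, and the tie-breaking rule are illustrative assumptions, not the paper's actual scoring method.

    from collections import Counter

    def rank_rephrasings(candidates):
        # Rank crowd-supplied rephrasings of a question by a simple support
        # score: identical (lightly normalized) rephrasings submitted by
        # different workers count as mutual support. Illustrative only.
        normalized = [c.strip().lower() for c in candidates]
        support = Counter(normalized)
        # Highest support first; break ties alphabetically for a stable order.
        return sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))

    if __name__ == "__main__":
        collected = [
            "Is this item a kind of beer?",
            "is this item a kind of beer? ",
            "Does this product belong to the beer category?",
        ]
        for question, score in rank_rephrasings(collected):
            print(score, question)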


Notes

  1. The items are actually associated with a hierarchical directory. We selected subcategories of “Food and Drink.” Each chosen category satisfies the following conditions: (1) it is not a top-level category, (2) it has at least three data items, and (3) the category name is not a composition of two or more different category names (such as “Beer and Wine”).

  2. If there were more than three questions with the same support score, we included all of them (both this rule and the category filter in note 1 are sketched below).
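The following is a minimal, hypothetical Python sketch of the two selection rules in these notes, assuming a simple in-memory representation of the category directory and of per-question support scores; the function and field names are illustrative assumptions, not the paper's implementation.

    def eligible_categories(categories, known_names):
        # Filter candidate categories following note 1.
        # `categories` maps a category name to {"parent": ..., "items": [...]};
        # `known_names` is the set of all category names in the directory.
        selected = []
        for name, info in categories.items():
            if info["parent"] is None:          # (1) reject top-level categories
                continue
            if len(info["items"]) < 3:          # (2) require at least three data items
                continue
            parts = [p.strip() for p in name.split(" and ")]
            if len(parts) >= 2 and all(p in known_names for p in parts):
                continue                        # (3) reject composite names such as "Beer and Wine"
            selected.append(name)
        return selected

    def top_questions_with_ties(scored_questions, k=3):
        # Keep the k best questions by support score, including every question
        # tied with the k-th one (note 2). `scored_questions` is a list of
        # (question, support_score) pairs.
        ranked = sorted(scored_questions, key=lambda pair: pair[1], reverse=True)
        if len(ranked) <= k:
            return ranked
        cutoff = ranked[k - 1][1]
        return [pair for pair in ranked if pair[1] >= cutoff]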


Acknowledgments

We are grateful to the project members of Yahoo! Crowdsourcing, including Masashi Nakagawa and Manabu Yamamoto. This work was supported in part by JSPS KAKENHI Grant Number 25240012.

Author information

Corresponding author

Correspondence to Atsuyuki Morishima.



Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Hayashi, R., Shimizu, N., Morishima, A. (2016). Obtaining Rephrased Microtask Questions from Crowds. In: Spiro, E., Ahn, Y.-Y. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol. 10047. Springer, Cham. https://doi.org/10.1007/978-3-319-47874-6_23

  • DOI: https://doi.org/10.1007/978-3-319-47874-6_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47873-9

  • Online ISBN: 978-3-319-47874-6

  • eBook Packages: Computer Science, Computer Science (R0)
