Obtaining Rephrased Microtask Questions from Crowds

  • Conference paper
  • First Online:
Social Informatics (SocInfo 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10047)

Included in the following conference series: SocInfo, International Conference on Social Informatics

Abstract

We present a novel method for obtaining and ranking rephrased questions from crowds, to be used as part of the instructions in microtask-based crowdsourcing. Using our method, we are able to obtain questions that differ in expression yet have the same semantics with respect to the crowdsourcing task. We do this by generating tasks that give workers a hint and elicit instructions from them. We conducted experiments with data from a real set of gold-standard questions submitted to a commercial crowdsourcing platform and compared the results with those of a direct-rewrite method. The results show that the extracted questions are semantically ranked with high precision, and we identify cases in which each method is effective.
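As a rough illustration of the ranking step described above, the following hypothetical Python sketch scores candidate rephrasings by how many workers submitted the same (lightly normalized) wording and lists the best-supported candidates first. The function name, the normalization, and the tie-breaking rule are illustrative assumptions, not the paper's actual scoring method.

    from collections import Counter

    def rank_rephrasings(candidates):
        # Rank crowd-supplied rephrasings of a question by a simple support
        # score: identical (lightly normalized) rephrasings submitted by
        # different workers count as mutual support. Illustrative only.
        normalized = [c.strip().lower() for c in candidates]
        support = Counter(normalized)
        # Highest support first; break ties alphabetically for a stable order.
        return sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))

    if __name__ == "__main__":
        collected = [
            "Is this item a kind of beer?",
            "is this item a kind of beer? ",
            "Does this product belong to the beer category?",
        ]
        for question, score in rank_rephrasings(collected):
            print(score, question)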


Notes

  1. The items are actually associated with a hierarchical directory. We selected subcategories of “Food and Drink.” Each chosen category satisfies the following conditions: (1) it is not a top-level category, (2) it has at least three data items, and (3) the category name is not a composition of two or more different category names (such as “Beer and Wine”).

  2. If there were more than three questions with the same support score, we included all of them (both this rule and the category filter in note 1 are sketched below).
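The following is a minimal, hypothetical Python sketch of the two selection rules in these notes, assuming a simple in-memory representation of the category directory and of per-question support scores; the function and field names are illustrative assumptions, not the paper's implementation.

    def eligible_categories(categories, known_names):
        # Filter candidate categories following note 1.
        # `categories` maps a category name to {"parent": ..., "items": [...]};
        # `known_names` is the set of all category names in the directory.
        selected = []
        for name, info in categories.items():
            if info["parent"] is None:          # (1) reject top-level categories
                continue
            if len(info["items"]) < 3:          # (2) require at least three data items
                continue
            parts = [p.strip() for p in name.split(" and ")]
            if len(parts) >= 2 and all(p in known_names for p in parts):
                continue                        # (3) reject composite names such as "Beer and Wine"
            selected.append(name)
        return selected

    def top_questions_with_ties(scored_questions, k=3):
        # Keep the k best questions by support score, including every question
        # tied with the k-th one (note 2). `scored_questions` is a list of
        # (question, support_score) pairs.
        ranked = sorted(scored_questions, key=lambda pair: pair[1], reverse=True)
        if len(ranked) <= k:
            return ranked
        cutoff = ranked[k - 1][1]
        return [pair for pair in ranked if pair[1] >= cutoff]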


Acknowledgments

We are grateful to the project members of Yahoo! Crowdsourcing, including Masashi Nakagawa and Manabu Yamamoto. This work was supported in part by JSPS KAKENHI Grant Number 25240012.

Author information

Corresponding author

Correspondence to Atsuyuki Morishima.



Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Hayashi, R., Shimizu, N., Morishima, A. (2016). Obtaining Rephrased Microtask Questions from Crowds. In: Spiro, E., Ahn, Y.-Y. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol. 10047. Springer, Cham. https://doi.org/10.1007/978-3-319-47874-6_23

  • DOI: https://doi.org/10.1007/978-3-319-47874-6_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47873-9

  • Online ISBN: 978-3-319-47874-6

  • eBook Packages: Computer Science, Computer Science (R0)
