State-of-the-Art: Semantics Acquisition and Crowdsourcing

Abstract

In this chapter we review the field of semantics acquisition to provide ground for further discussion of semantics acquisition games. First, we cover the necessary definitions and review the main “client” approaches that utilize semantics: information retrieval applications. Then, we move through three major groups of semantics acquisition approaches. The first group comprises the expert-based approaches: costly, yet often essential for tasks such as seeding, setting up schemas, and validating the output of semantics acquisition. Second, we review the automated approaches: quantitatively effective, yet with output of questionable quality, and widely used for tasks such as ontology learning or resource metadata acquisition. Finally, we review the crowd-based approaches, which represent a balance between quality and quantity. They comprise many working schemes, ranging from “explicit” mechanical turking to “implicit” social tagging applications and, of course, semantics acquisition games.

Notes

  1. http://protege.stanford.edu/

  2. http://www.cyc.com/

  3. http://wordnet.princeton.edu/

  4. http://www.google.com/reCAPTCHA

  5. https://www.mturk.com

  6. http://www.flickr.com

  7. http://delicious.org

  8. https://www.duolingo.com

Author information

Corresponding author

Correspondence to Jakub Šimko.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Šimko, J., Bieliková, M. (2014). State-of-the-Art: Semantics Acquisition and Crowdsourcing. In: Semantic Acquisition Games. Springer, Cham. https://doi.org/10.1007/978-3-319-06115-3_2

  • DOI: https://doi.org/10.1007/978-3-319-06115-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06114-6

  • Online ISBN: 978-3-319-06115-3

  • eBook Packages: Computer Science, Computer Science (R0)
