Abstract
Large-scale crowdsourcing platforms are a key tool that allows researchers in multimedia content analysis to gain insight into how users interpret social multimedia. The goal of this article is to support this process in a practical manner, opening the path to productive exploitation of complex human interpretations of multimedia content within multimedia systems. We first discuss in detail the nature of complexity in human interpretations of multimedia, and why we, as researchers, should look outward to the crowd, rather than inward to ourselves, to determine what users consider important about the content of images and videos. We then present strategies and insights from our own experience designing tasks for crowdworkers. These techniques are useful to researchers who want to elicit information about the elements and aspects of multimedia that matter in the contexts in which humans use social multimedia.
Acknowledgments
The research leading to these results has received funding from the European Commission’s 7th Framework Programme under grant agreements No. 287704 (CUbRIK) and No. 610594 (CrowdRec). It has also been supported by the Dutch national program COMMIT.
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this chapter
Larson, M., Melenhorst, M., Menéndez, M., Xu, P. (2014). Using Crowdsourcing to Capture Complexity in Human Interpretations of Multimedia Content. In: Ionescu, B., Benois-Pineau, J., Piatrik, T., Quénot, G. (eds) Fusion in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-05696-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05695-1
Online ISBN: 978-3-319-05696-8