Abstract
Large-scale crowdsourcing platforms are a key tool that allows researchers in multimedia content analysis to gain insight into how users interpret social multimedia. The goal of this article is to support this process in a practical manner, opening the path to productive exploitation of complex human interpretations of multimedia content within multimedia systems. We first discuss in detail the nature of complexity in human interpretations of multimedia, and why we, as researchers, should look outward to the crowd, rather than inward to ourselves, to determine what users consider important about the content of images and videos. We then present strategies and insights from our own experience designing tasks for crowdworkers. These techniques are useful to researchers who want to elicit information about the elements and aspects of multimedia that matter in the contexts in which humans use social multimedia.
Acknowledgments
The research leading to these results has received funding from the European Commission’s 7th Framework Programme under grant agreements No. 287704 (CUbRIK) and No. 610594 (CrowdRec). It has also been supported by the Dutch national program COMMIT.
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this chapter
Larson, M., Melenhorst, M., Menéndez, M., Xu, P. (2014). Using Crowdsourcing to Capture Complexity in Human Interpretations of Multimedia Content. In: Ionescu, B., Benois-Pineau, J., Piatrik, T., Quénot, G. (eds) Fusion in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-05696-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05695-1
Online ISBN: 978-3-319-05696-8