Enhancing Computer Vision Using the Collective Intelligence of Social Media

Chatzilari, Elisavet; Nikolopoulos, Spiros; Patras, Ioannis; Kompatsiaris, Ioannis

doi:10.1007/978-3-642-17551-0_9

Part of the book series: Studies in Computational Intelligence (SCI, volume 331)

Abstract

Teaching the machine has been a great challenge for computer vision scientists since the very first steps of artificial intelligence. Throughout the decades there have been remarkable achievements that drastically enhanced the capabilities of machines, both from the perspective of infrastructure (i.e., computer networks, processing power, storage capabilities) and from the perspective of processing and understanding the data. Nevertheless, computer vision scientists are still confronted with the problem of designing techniques and frameworks that facilitate effortless learning and allow analysis methods to scale easily across many different domains and disciplines. State-of-the-art approaches cannot produce highly effective models unless there is dedicated, and thus costly, human supervision in the process of learning that dictates the relation between the content and its meaning (i.e., annotation). Recently, we have been witnessing the rapid growth of Social Media, which emerged as the result of users’ willingness to communicate, socialize, collaborate and share content. The outcome of this massive activity was the generation of a tremendous volume of user-contributed data that have been made available on the Web, usually along with an indication of their meaning (i.e., tags). This has motivated the research objective of investigating whether the Collective Intelligence that emerges from the users’ contributions inside a Web 2.0 application can be used to remove the need for dedicated human supervision during the process of learning. In this chapter we deal with a very demanding learning problem in computer vision that consists of detecting and localizing an object within the image content. We present a method that exploits the Collective Intelligence that is fostered inside an image Social Tagging System in order to facilitate the automatic generation of training data and therefore object detection models. The experimental results show that, although there are still many issues to be addressed, computer vision technology can definitely benefit from Social Media.

Author information

Authors and Affiliations

Centre for Research & Technology Hellas, Informatics and Telematics Institute, 6th km Charilaou-Thermi Road, Thermi-Thessaloniki, GR-57001, Thessaloniki, Greece
Elisavet Chatzilari, Spiros Nikolopoulos & Ioannis Kompatsiaris
School of Electronic Engineering and Computer Science, Queen Mary University of London, E1 4NS, London, UK
Spiros Nikolopoulos & Ioannis Patras
Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, UK
Elisavet Chatzilari

Editor information

Editors and Affiliations

Department of Informatics, Aristotle University, 54124, Thessaloniki, Greece
Athena Vakali
School of Electrical and Information Engineering, University of South Australia, Adelaide Mawson Lakes Campus, 5095, South Australia, SA, Australia
Lakhmi C. Jain

About this chapter

Cite this chapter

Chatzilari, E., Nikolopoulos, S., Patras, I., Kompatsiaris, I. (2011). Enhancing Computer Vision Using the Collective Intelligence of Social Media. In: Vakali, A., Jain, L.C. (eds) New Directions in Web Data Management 1. Studies in Computational Intelligence, vol 331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17551-0_9

DOI: https://doi.org/10.1007/978-3-642-17551-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17550-3
Online ISBN: 978-3-642-17551-0
eBook Packages: Engineering, Engineering (R0)
