Abstract
Object recognition has recently been applied successfully in several multimedia content annotation and retrieval applications. The recognition approaches employed are carefully selected and adapted to the specific needs of each task. In this work, we propose a framework that automates the simultaneous selection and customization of the entire recognition process. The framework requires only an annotated set of sample images or videos and precisely specified task requirements to select an appropriate setup among thousands of possibilities. We use an efficient recognition infrastructure and iterative analysis strategies to make this approach practical for real-world applications. A case study on face recognition from a single image per person demonstrates the capabilities of this holistic approach.
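The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the component names (`dog`, `harris`, `sift`, `gabor`, `nn`, `svm`), the `Setup` and `Requirements` types, the score tables, and the successive-halving style pruning are all assumptions chosen for illustration. The sketch enumerates candidate setups, evaluates them on a growing share of the annotated samples, discards setups that violate the runtime requirement, and returns the most accurate survivor.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Setup:
    """One hypothetical recognition pipeline configuration."""
    detector: str
    descriptor: str
    classifier: str


@dataclass(frozen=True)
class Requirements:
    """Hypothetical task requirements: accuracy in percent, runtime in ms."""
    min_accuracy_pct: int
    max_ms_per_image: int


def enumerate_setups(options):
    """Build every combination of the available components."""
    return [Setup(d, f, c) for d, f, c in product(
        options["detector"], options["descriptor"], options["classifier"])]


def select_setup(setups, samples, evaluate, req, rounds=3, keep=0.5):
    """Iteratively prune candidates (successive-halving style, an assumption).

    Each round evaluates the remaining setups on a larger prefix of the
    annotated samples and keeps only the best fraction; the last round
    uses all samples and returns the best setup meeting the requirements.
    """
    candidates = list(setups)
    scored = []
    for r in range(rounds):
        n = max(1, len(samples) * (r + 1) // rounds)
        subset = samples[:n]
        scored = []
        for setup in candidates:
            accuracy, ms = evaluate(setup, subset)
            if ms <= req.max_ms_per_image:  # drop setups that are too slow
                scored.append((accuracy, setup))
        scored.sort(key=lambda item: item[0], reverse=True)
        if r < rounds - 1:
            scored = scored[:max(1, math.ceil(len(scored) * keep))]
        candidates = [setup for _, setup in scored]
        if not candidates:
            return None
    best_accuracy, best = scored[0]
    return best if best_accuracy >= req.min_accuracy_pct else None


# Hypothetical score tables standing in for a real recognition run:
# (accuracy contribution in percent, cost in ms per image).
TOY_SCORES = {
    "dog": (30, 20), "harris": (25, 10),
    "sift": (40, 40), "gabor": (45, 80),
    "nn": (10, 5), "svm": (15, 15),
}


def toy_evaluate(setup, subset):
    """Stand-in evaluator; a real one would run the pipeline on `subset`."""
    parts = (setup.detector, setup.descriptor, setup.classifier)
    accuracy = sum(TOY_SCORES[p][0] for p in parts)
    ms = sum(TOY_SCORES[p][1] for p in parts)
    return accuracy, ms


options = {
    "detector": ["dog", "harris"],
    "descriptor": ["sift", "gabor"],
    "classifier": ["nn", "svm"],
}
samples = [(f"img{i}.png", f"person{i}") for i in range(12)]  # toy annotations
requirements = Requirements(min_accuracy_pct=80, max_ms_per_image=100)

best = select_setup(enumerate_setups(options), samples, toy_evaluate, requirements)
print(best)
```

Pruning on small sample subsets first keeps the total evaluation cost well below an exhaustive run over every setup and every sample, which is one plausible reading of the "iterative analysis strategies" mentioned above.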
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sorschag, R. (2012). How to Select and Customize Object Recognition Approaches for an Application? In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, CW., Andreopoulos, Y., Breiteneder, C. (eds) Advances in Multimedia Modeling. MMM 2012. Lecture Notes in Computer Science, vol 7131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27355-1_42
DOI: https://doi.org/10.1007/978-3-642-27355-1_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27354-4
Online ISBN: 978-3-642-27355-1
eBook Packages: Computer Science, Computer Science (R0)