Abstract
Visual objects in mobile photos are usually captured under uncontrolled conditions, with varying viewpoints, positions, scales, and background clutter. In this paper, we therefore develop a MObile Sensing framework for robust Real-scene Object recognition and localization (MOSRO). By extending conventional structured output learning with the proposed grid-based representation as the output structure, MOSRO not only locates visual objects precisely but also achieves real-time performance. Experimental results show that the proposed framework outperforms state-of-the-art methods on public real-scene image datasets. Furthermore, to demonstrate its effectiveness in practical applications, MOSRO was implemented on Android mobile platforms as a prototype system that senses business signs on the street and instantly retrieves relevant information about the recognized businesses.
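The abstract does not spell out the grid-based output representation, so the following is only an illustrative sketch of one way such an output structure could be encoded, not the authors' formulation. It assumes the image is divided into a fixed number of rows and columns, that an object location is encoded as a binary label per cell, and that two labelings are compared with a Hamming-style loss of the kind commonly used in structured output learning. The names `box_to_grid` and `hamming_loss`, and the box convention, are hypothetical.

```python
from typing import List, Tuple

# Assumed conventions (not from the paper): boxes are pixel coordinates
# (left, top, right, bottom) with right/bottom exclusive; a grid is a
# rows x cols list of 0/1 cell labels.
Box = Tuple[int, int, int, int]
Grid = List[List[int]]


def box_to_grid(box: Box, width: int, height: int, rows: int, cols: int) -> Grid:
    """Label every grid cell that overlaps the box with 1 (hypothetical encoding)."""
    left, top, right, bottom = box
    cell_w = width / cols
    cell_h = height / rows
    grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            x0, y0 = c * cell_w, r * cell_h
            x1, y1 = x0 + cell_w, y0 + cell_h
            # Standard interval-overlap test on both axes.
            if x0 < right and left < x1 and y0 < bottom and top < y1:
                grid[r][c] = 1
    return grid


def hamming_loss(a: Grid, b: Grid) -> float:
    """Fraction of cells on which two grid labelings disagree."""
    total = sum(len(row) for row in a)
    diff = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return diff / total


if __name__ == "__main__":
    truth = box_to_grid((0, 0, 50, 50), width=100, height=100, rows=4, cols=4)
    empty = [[0] * 4 for _ in range(4)]
    print(truth)
    print(hamming_loss(truth, empty))
```

Under these assumptions, a per-cell encoding lets the loss between a predicted and a ground-truth location decompose over cells, which is one plausible reason a grid output structure could keep inference cheap on a mobile device.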
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chi, HY., Cheng, WH., Chen, MS., Tsui, A.W. (2014). MOSRO: Enabling Mobile Sensing for Real-Scene Objects with Grid Based Structured Output Learning. In: Gurrin, C., Hopfgartner, F., Hurst, W., Johansen, H., Lee, H., O’Connor, N. (eds) MultiMedia Modeling. MMM 2014. Lecture Notes in Computer Science, vol 8325. Springer, Cham. https://doi.org/10.1007/978-3-319-04114-8_18
DOI: https://doi.org/10.1007/978-3-319-04114-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04113-1
Online ISBN: 978-3-319-04114-8
eBook Packages: Computer Science (R0)