MOSRO: Enabling Mobile Sensing for Real-Scene Objects with Grid Based Structured Output Learning

Chi, Heng-Yu; Cheng, Wen-Huang; Chen, Ming-Syan; Tsui, Arvin Wen

doi:10.1007/978-3-319-04114-8_18


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8325)

Included in the following conference series:

International Conference on Multimedia Modeling


Abstract

Visual objects in mobile photos are usually captured under uncontrolled conditions, with varying viewpoints, positions, and scales and with background clutter. In this paper, we therefore developed a MObile Sensing framework for robust Real-scene Object recognition and localization (MOSRO). By extending conventional structured output learning with the proposed grid based representation as the output structure, MOSRO not only locates visual objects precisely but also achieves real-time performance. The experimental results showed that the proposed framework outperforms state-of-the-art methods on public real-scene image datasets. Further, to demonstrate its effectiveness in practical applications, the MOSRO framework was implemented on Android mobile platforms as a prototype system that senses business signs on the street and instantly retrieves relevant information about the recognized businesses.
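The abstract does not spell out what the grid based output structure looks like. As a rough illustration only, the sketch below assumes it can be read as a binary occupancy mask over a fixed grid of image cells, with a bounding box recovered from the occupied cells. The function names, the 4x4 grid, and the Hamming loss between two grid labelings are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels, x1/y1 exclusive
Grid = List[List[int]]                   # rows x cols of 0/1 cell labels


def box_to_grid(box: Box, img_w: int, img_h: int, rows: int, cols: int) -> Grid:
    """Label a cell 1 if it overlaps the box (hypothetical output encoding)."""
    x0, y0, x1, y1 = box
    cell_w, cell_h = img_w / cols, img_h / rows
    grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            cx0, cy0 = c * cell_w, r * cell_h
            cx1, cy1 = cx0 + cell_w, cy0 + cell_h
            # Standard open-interval overlap test between two rectangles.
            if x0 < cx1 and cx0 < x1 and y0 < cy1 and cy0 < y1:
                grid[r][c] = 1
    return grid


def grid_to_box(grid: Grid, img_w: int, img_h: int) -> Optional[Box]:
    """Return the tightest cell-aligned box around all occupied cells."""
    rows, cols = len(grid), len(grid[0])
    occupied = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c]]
    if not occupied:
        return None
    cell_w, cell_h = img_w / cols, img_h / rows
    rs = [r for r, _ in occupied]
    cs = [c for _, c in occupied]
    return (min(cs) * cell_w, min(rs) * cell_h,
            (max(cs) + 1) * cell_w, (max(rs) + 1) * cell_h)


def hamming_loss(a: Grid, b: Grid) -> float:
    """Fraction of cells whose labels differ (an assumed structured loss)."""
    total = len(a) * len(a[0])
    diff = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return diff / total


if __name__ == "__main__":
    truth = box_to_grid((120, 50, 290, 180), img_w=400, img_h=400, rows=4, cols=4)
    empty = [[0] * 4 for _ in range(4)]
    print(truth)
    print(grid_to_box(truth, 400, 400))
    print(hamming_loss(truth, empty))
```

One consequence of this kind of encoding is that localization precision is bounded by the cell size: a finer grid gives tighter boxes at the cost of a larger output space to search.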



Author information

Authors and Affiliations

Dept. of Electronic Engineering, National Taiwan University, Taiwan R.O.C.
Heng-Yu Chi & Ming-Syan Chen
Research Center for IT Innovation, Academia Sinica, Taiwan R.O.C.
Heng-Yu Chi, Wen-Huang Cheng & Ming-Syan Chen
Industrial Technology Research Institute, Taiwan R.O.C.
Arvin Wen Tsui


Editor information

Editors and Affiliations

School of Computing, Dublin City University, Dublin 9, Ireland
Cathal Gurrin
Fakultät IV für Elektrotechnik und Informatik, Technische Universität Berlin / DAI-Labor, 10587, Berlin, Germany
Frank Hopfgartner
Department of Information and Computing Sciences, Universiteit Utrecht, 3584 CC, Utrecht, The Netherlands
Wolfgang Hurst
UiT The Arctic University of Norway, 9019, Tromsø, Norway
Håvard Johansen
Singapore University of Technology and Design, Singapore
Hyowon Lee
School of Electrical Engineering, Dublin City University, Ireland
Noel O’Connor


About this paper

Cite this paper

Chi, HY., Cheng, WH., Chen, MS., Tsui, A.W. (2014). MOSRO: Enabling Mobile Sensing for Real-Scene Objects with Grid Based Structured Output Learning. In: Gurrin, C., Hopfgartner, F., Hurst, W., Johansen, H., Lee, H., O’Connor, N. (eds) MultiMedia Modeling. MMM 2014. Lecture Notes in Computer Science, vol 8325. Springer, Cham. https://doi.org/10.1007/978-3-319-04114-8_18

DOI: https://doi.org/10.1007/978-3-319-04114-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04113-1
Online ISBN: 978-3-319-04114-8
eBook Packages: Computer Science, Computer Science (R0)
