Using Detected Visual Objects to Index Video Database

  • Conference paper
Databases Theory and Applications (ADC 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9877)

Abstract

In this paper, we focus on how to use visual objects to index videos. Two tables are constructed for this purpose: the unique object table and the occurrence table. The former stores the unique objects that appear in the videos, while the latter stores the occurrence information of these unique objects in the videos. In previous work, these two tables are generated manually by a top-down process: experts first provide the unique object table, and annotators then generate the occurrence table according to it. Such a process depends heavily on human labor, which limits its scalability, especially when the data are dynamic or large-scale. To address this, we propose a bottom-up process to generate the two tables. The novelties are twofold: (1) we use an object detector instead of human annotation to create the occurrence table; (2) we propose a hybrid method, consisting of local merge, global merge and propagation, to generate the unique object table and fix the occurrence table. There are three other candidate methods for implementing the bottom-up process, namely recognizing-based, matching-based and tracking-based methods. By analyzing their mechanisms and evaluating their accuracy, we find that they are not suitable for the bottom-up process. The proposed hybrid method leverages the advantages of the matching-based and tracking-based methods. Our experiments show that the hybrid method is more accurate and efficient than the candidate methods, which indicates that it is better suited to the proposed bottom-up process.
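
To make the two-table index concrete, the following is a minimal sketch of how a unique object table and an occurrence table might be populated bottom-up from detector output. It is not the paper's hybrid method: the abstract does not specify local merge, global merge or propagation, so a single caller-supplied predicate (`same_object`, hypothetical) stands in for all of them. The names `Detection`, `VideoIndex`, `build_index` and `videos_containing`, the field layout, and the made-up plate strings are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical layout


@dataclass(frozen=True)
class Detection:
    """One raw detector output: an object seen in one frame of one video."""
    video_id: str
    frame: int
    box: Box
    label: str  # stand-in for whatever appearance descriptor a detector yields


@dataclass
class VideoIndex:
    # Unique object table: object_id -> representative label.
    unique_objects: Dict[int, str] = field(default_factory=dict)
    # Occurrence table: rows of (object_id, video_id, frame, box).
    occurrences: List[Tuple[int, str, int, Box]] = field(default_factory=list)

    def videos_containing(self, object_id: int) -> List[str]:
        """Return the sorted ids of all videos in which the object occurs."""
        return sorted({v for oid, v, _, _ in self.occurrences if oid == object_id})


def build_index(
    detections: Iterable[Detection],
    same_object: Callable[[Detection, Detection], bool],
) -> VideoIndex:
    """Bottom-up construction: occurrences come first, unique objects are derived.

    Each detection is compared with one representative per known object.
    `same_object` is an assumed placeholder for a real merging strategy.
    """
    index = VideoIndex()
    representatives: Dict[int, Detection] = {}
    for det in detections:
        match = next(
            (oid for oid, rep in representatives.items() if same_object(rep, det)),
            None,
        )
        if match is None:  # first sighting: register a new unique object
            match = len(representatives)
            representatives[match] = det
            index.unique_objects[match] = det.label
        index.occurrences.append((match, det.video_id, det.frame, det.box))
    return index


# Illustrative input only; the videos and plate strings are invented.
detections = [
    Detection("v1", 0, (10, 10, 50, 20), "plate-ABC123"),
    Detection("v1", 1, (12, 10, 50, 20), "plate-ABC123"),
    Detection("v2", 7, (30, 40, 50, 20), "plate-XYZ789"),
    Detection("v2", 9, (80, 40, 50, 20), "plate-ABC123"),
]
index = build_index(detections, lambda a, b: a.label == b.label)
print(index.unique_objects)
print(index.videos_containing(0))
```

In this sketch the occurrence table is filled directly from detections and the unique object table emerges as a by-product, which mirrors the bottom-up direction described above. Comparing exact labels is only a toy criterion; a real system would need a merging step that tolerates detector noise.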

Notes

  1. https://github.com/openalpr/openalpr

Acknowledgments

This research was jointly supported by the ARC project (Grant No. DP150103008) and the ARC DECRA project (Grant No. DE160100308).

Author information

Corresponding author

Correspondence to Xiaofang Zhou.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Du, X., Yin, H., Huang, Z., Yang, Y., Zhou, X. (2016). Using Detected Visual Objects to Index Video Database. In: Cheema, M., Zhang, W., Chang, L. (eds) Databases Theory and Applications. ADC 2016. Lecture Notes in Computer Science, vol 9877. Springer, Cham. https://doi.org/10.1007/978-3-319-46922-5_26

  • DOI: https://doi.org/10.1007/978-3-319-46922-5_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46921-8

  • Online ISBN: 978-3-319-46922-5

  • eBook Packages: Computer Science, Computer Science (R0)
