Multimedia Tools and Applications, Volume 76, Issue 1, pp 163–197

Interactive 3D content insertion in images for multimedia applications

  • Rui Nóbrega
  • Nuno Correia
Article

Abstract

This article addresses the problem of creating interactive mixed reality applications in which virtual objects interact within images of real-world scenarios. This is relevant for creating games and architectural or space-planning applications that interact with visual elements in the images, such as walls, floors and empty spaces. These scenarios are intended to be captured by users with regular cameras or taken from existing photographs. Introducing virtual objects into photographs presents several challenges, such as pose estimation and the creation of a visually correct interaction between virtual objects and the boundaries of the scene. The article addresses two main research questions: whether it is feasible to create interactive augmented reality (AR) applications in which virtual objects interact in a real-world scenario using high-level features detected in the image, and whether untrained users are capable and motivated enough to perform the AR initialization steps. The proposed system detects the scene automatically from an image, with additional features obtained from basic annotations supplied by the user. This operation is kept simple to accommodate the needs of non-expert users. The system analyzes one or more photos captured by the user and detects high-level features such as vanishing points, the floor and the scene orientation. Using these features, it is possible to create mixed and augmented reality applications in which the user interactively introduces virtual objects that blend with the picture in real time and respond to the physical environment. To validate the solution, several system tests are described and compared using available external image datasets.

Keywords

  • Mixed and augmented reality
  • Multimedia
  • Computer vision
  • Computer graphics
  • Human-computer interaction

Notes

Acknowledgments

The authors would like to thank everyone at IMG and CITI for their support. This work was funded by the Portuguese Science and Technology Foundation, FCT/MEC, through grants SFRH/BD/47511/2008 and PEst-OE/EEI/UI0527/2011 (CITI/FCT/UNL, now NOVA-LINCS), and by the MAT Project. The Media Arts and Technologies project (MAT), NORTE-07-0124-FEDER-000061, is financed by the North Portugal Regional Operational Programme (ON.2 O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), and by national funds, through the Portuguese funding agency, Fundação para a Ciência e a Tecnologia (FCT). The authors also thank the Project I-City for Future Mobility, NORTE-07-0124-FEDER-000064, and the European Project FP7 Future Cities, FP7-REGPOT-2012-2013-1.

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. CITI, NOVA-LINCS, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, FCT, UNL, Caparica, Portugal
  2. DEI-FEUP/INESC TEC, Universidade do Porto, Porto, Portugal
