Multimedia Tools and Applications, Volume 74, Issue 17, pp 6709–6728

An interactive multimedia storybook demonstration system

  • Yuan-Hsiang Chang
  • Hui-Lun Liao
  • Li-Der Jeng
  • Yung-Chung Chiu


Unlike a real, physical storybook that shows only text and still pictures, we present an Interactive Multimedia Storybook Demonstration System. The objective is to develop an automatic system through which users can browse digital multimedia by turning the pages of a physical storybook, scribbling with an infrared pen, and interacting with a virtual keyboard. Our system supports three functionalities: (a) page number recognition; (b) infrared interaction; and (c) a virtual keyboard. The system is vision-based, using digital webcams and infrared cameras as input devices and a projector to display digital multimedia. The method is based on finite-state machines combined with computer vision techniques, implemented in C/C++ with the OpenCV library, resulting in a relatively low-cost, near real-time interactive system. The implementation incorporates a projector, two digital webcams, an infrared camera, an interactive platform (i.e., the storybook), and a personal computer. Results of the system functionalities are shown as examples during the demonstration. In addition, pilot studies performed for system evaluation indicate that our system achieves recognition rates of over 90% in near real time (system response time <0.2 s). The usability evaluation indicated that our system yields effective interaction for all three functionalities, although a minor lag may be observed during the infrared interaction. In summary, our system has been successfully demonstrated at a large exhibition with hundreds of user interactions, which clearly shows its potential in numerous applications for storybooks and the like.


Keywords: Computer vision · Digital multimedia · Human-computer interface (HCI) · Image processing



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Yuan-Hsiang Chang (1)
  • Hui-Lun Liao (1)
  • Li-Der Jeng (2)
  • Yung-Chung Chiu (3)

  1. Department of Information & Computer Engineering, Chung Yuan Christian University, Chung-Li, Republic of China
  2. Department of Electronic Engineering, Chung Yuan Christian University, Chung-Li, Taiwan
  3. Department of Commercial Design, Chung Yuan Christian University, Chung-Li, Taiwan
