Deep Photo Rally: Let’s Gather Conversational Pictures
In this paper, we propose an anthropomorphic approach that uses recent Deep Neural Network technology to generate utterances for a specific object according to its surrounding circumstances. With the proposed approach, the user can engage in pseudo-communication with an object by photographing it with a mobile terminal. We present several examples of applying the proposed approach to entertainment products and show that it is an anthropomorphic approach capable of interacting with the environment.
Keywords: Augmented reality, Anthropomorphic, Deep Neural Networks