Mobile App for Text-to-Image Synthesis

Kang, Ryan; Sunil, Athira; Chen, Min

doi:10.1007/978-3-030-28468-8_3

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 290)

Included in the following conference series:

International Conference on Mobile Computing, Applications, and Services

Abstract

Generating visual representations of textual information is a challenging yet interesting topic with many potential applications. In this paper, we propose a novel approach that visualizes natural language sentences using ImageNet to enhance language education. The current focus is to help English language learners build their vocabulary of common nouns and develop an in-depth understanding of the various prepositions of location. To achieve this goal, real-world images representing nouns are obtained from ImageNet, and their foreground objects of interest are extracted using image segmentation. The objects are then rearranged on a canvas according to the spatial relationships specified in the sentence. To demonstrate the effectiveness of the proposed approach, we have developed a mobile application that uses a RESTful API to retrieve the images from the web service that runs the image generation program. The prototype mobile application creates visual representations of natural language sentences, along with a text description of the spatial relationships among objects, to assist in learning new vocabulary and spatial prepositions during language education.
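
The canvas-arrangement step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Box` type, the `place` function, the set of prepositions, and the `gap` spacing are all assumptions made for illustration, since the abstract does not specify how spatial relationships are turned into coordinates. The sketch only shows how a preposition of location might map to a bounding-box position relative to an anchor object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in canvas pixels; (x, y) is the top-left corner."""
    x: int
    y: int
    w: int
    h: int


def place(preposition: str, anchor: Box, w: int, h: int, gap: int = 10) -> Box:
    """Position an object of size (w, h) relative to `anchor`.

    Hypothetical mapping: the supported prepositions and the `gap`
    spacing are illustrative assumptions, not taken from the paper.
    The y axis points down, as is usual for image canvases.
    """
    # Coordinates that centre the object on the anchor along each axis.
    cx = anchor.x + (anchor.w - w) // 2
    cy = anchor.y + (anchor.h - h) // 2

    if preposition == "on":
        # Bottom edge of the object touches the top edge of the anchor.
        return Box(cx, anchor.y - h, w, h)
    if preposition == "above":
        return Box(cx, anchor.y - h - gap, w, h)
    if preposition == "below":
        return Box(cx, anchor.y + anchor.h + gap, w, h)
    if preposition == "left of":
        return Box(anchor.x - w - gap, cy, w, h)
    if preposition == "right of":
        return Box(anchor.x + anchor.w + gap, cy, w, h)
    if preposition == "in":
        # Centre the object inside the anchor.
        return Box(cx, cy, w, h)
    raise ValueError(f"unsupported preposition: {preposition!r}")


# Example: "the cat is on the table" with a 60x40 cat and a 200x100 table.
table = Box(x=100, y=200, w=200, h=100)
cat = place("on", table, w=60, h=40)
```

In a full pipeline, the returned box would tell a compositing step where to paste the segmented foreground object; how the authors handle scaling, overlap, and blending is not stated in the abstract.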

Author information

Authors and Affiliations

Tableau Software, Seattle, WA, 98103, USA
Ryan Kang
eBay Inc., San Jose, CA, 95125, USA
Athira Sunil
University of Washington Bothell, Bothell, WA, 98011, USA
Min Chen

Corresponding author

Correspondence to Min Chen.

Editor information

Editors and Affiliations

Hangzhou Dianzi University, Hangzhou, China
Yuyu Yin
Zhejiang University, Hangzhou, China
Ying Li
Shanghai University, Shanghai, China
Honghao Gao
Hangzhou Dianzi University, Hangzhou, China
Jilin Zhang

About this paper

Cite this paper

Kang, R., Sunil, A., Chen, M. (2019). Mobile App for Text-to-Image Synthesis. In: Yin, Y., Li, Y., Gao, H., Zhang, J. (eds) Mobile Computing, Applications, and Services. MobiCASE 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 290. Springer, Cham. https://doi.org/10.1007/978-3-030-28468-8_3

DOI: https://doi.org/10.1007/978-3-030-28468-8_3
Published: 25 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28467-1
Online ISBN: 978-3-030-28468-8
eBook Packages: Computer Science, Computer Science (R0)
