Pixel-Based LSTM Generative Model

  • Conference paper
Computational Intelligence in Information Systems (CIIS 2018)

Abstract

Applying computational intelligence techniques to create generative models of digits or letters has received relatively little attention compared with classification tasks. Creating a generative model that captures the styles and detailed characteristics of symbols is also more challenging. In this paper, we describe the application of a Long Short-Term Memory (LSTM) model, trained using a supervised learning approach, to generating variations of the letter A. LSTM is a recurrent neural network whose salient feature is its ability to handle long-range dependencies; hence, it is a popular choice for building intelligent applications for speech recognition, conversational agents, and other problems in time-series domains. To formulate the problem as a generative task, all the pixels in a 2D image representing a letter (the letter A in this study) are flattened into a long vector used to train the LSTM model. We show that the LSTM successfully learns to generate new letters A that share many coherent stylistic features with the original letters in the training sets.
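
The flattening step described above can be illustrated with a minimal sketch. It assumes binary pixels, row-by-row flattening, and a fixed-length sliding window of previous pixels used to predict the next pixel. The function name `make_next_pixel_pairs`, the window length, and the threshold are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def make_next_pixel_pairs(image, window=8, threshold=0.5):
    """Flatten a 2D image and build (previous pixels -> next pixel) pairs."""
    # Binarize, then flatten row by row into one long vector (assumed order).
    pixels = (np.asarray(image, dtype=float) > threshold).astype(np.float32).ravel()
    if pixels.size <= window:
        raise ValueError("image must contain more pixels than the window length")
    # Each input is `window` consecutive pixels; the target is the pixel that follows.
    inputs = np.stack([pixels[i:i + window] for i in range(pixels.size - window)])
    targets = pixels[window:]
    # Add a trailing feature axis: (samples, timesteps, features),
    # a layout commonly used for recurrent layers.
    return inputs[..., np.newaxis], targets

# Toy 4x4 "image" standing in for a letter bitmap (not real data).
toy = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 1, 1, 1],
                [1, 0, 0, 1]])
X, y = make_next_pixel_pairs(toy, window=4)
print(X.shape, y.shape)  # (12, 4, 1) (12,)
```

A model trained on such pairs could, in principle, be seeded with an initial window and run one pixel at a time, feeding each predicted pixel back in as input. Whether the paper uses this exact windowing scheme is not stated in the abstract.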

Notes

  1. yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html.

  2. Hence, our predictive model can also be thought of as a binary classification model.

  3. https://en.wikipedia.org/wiki/AARON.

Acknowledgments

We thank the anonymous reviewers for their comments, which helped improve this paper. We also thank the GSR office for its partial financial support of this research.

Author information

Corresponding author

Correspondence to Somnuk Phon-Amnuaisuk.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Phon-Amnuaisuk, S., Salleh, N.D.H.M., Woo, SL. (2019). Pixel-Based LSTM Generative Model. In: Omar, S., Haji Suhaili, W., Phon-Amnuaisuk, S. (eds) Computational Intelligence in Information Systems. CIIS 2018. Advances in Intelligent Systems and Computing, vol 888. Springer, Cham. https://doi.org/10.1007/978-3-030-03302-6_18
