
CodeeGAN: Code Generation via Adversarial Training

  • Conference paper
  • Published in: Dependability in Sensor, Cloud, and Big Data Systems and Applications (DependSys 2019)

Abstract

The automatic generation of code is an important research problem in the field of machine learning. Generative Adversarial Networks (GANs) have shown a powerful ability in image generation, yet generating code with GANs remains largely unexplored, because the discrete output of a language model hinders the application of gradient-based GAN training. In this paper, we propose CodeeGAN, a model that generates code via adversarial training. First, we adopt the policy gradient method from Reinforcement Learning (RL) to handle discrete data: since the generator produces discrete tokens, it cannot be updated directly by gradient descent through the discriminator. Second, we use Monte Carlo Tree Search (MCTS) to build a rollout network that evaluates the loss of generated tokens. Combining these two mechanisms, CodeeGAN generates code via adversarial training. We evaluate the model on datasets from four different platforms. It outperforms existing approaches, demonstrating that adversarial training is an effective method for code generation.
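
To make the two mechanisms above concrete, the sketch below illustrates what one adversarial generator update of this kind could look like: the generator is treated as a policy over discrete code tokens and updated with the policy gradient (REINFORCE), while Monte Carlo rollouts complete each partial token sequence so the discriminator can score it token by token. This is an illustrative sketch of the general recipe only, not the authors' implementation; the interfaces generator.sample, generator.rollout, and discriminator are assumptions introduced for the example.

```python
# Hypothetical sketch (PyTorch) of a policy-gradient generator update with
# Monte Carlo rollouts; generator.sample, generator.rollout and discriminator
# are assumed interfaces, not the paper's actual code.
import torch


def policy_gradient_step(generator, discriminator, optimizer,
                         batch_size=32, seq_len=48, n_rollouts=16):
    # 1. Sample a batch of discrete token sequences from the current policy,
    #    together with the log-probability of each sampled token.
    tokens, log_probs = generator.sample(batch_size, seq_len)      # (B, T), (B, T)

    # 2. Reward each prefix: complete it with Monte Carlo rollouts and average
    #    the discriminator's score over the completed sequences.
    rewards = torch.zeros(batch_size, seq_len)
    for t in range(1, seq_len + 1):
        scores = []
        for _ in range(n_rollouts):
            completed = generator.rollout(tokens[:, :t], seq_len)  # (B, T)
            scores.append(discriminator(completed))                # (B,)
        rewards[:, t - 1] = torch.stack(scores).mean(dim=0)

    # 3. REINFORCE update: raise the log-probability of tokens in proportion
    #    to the reward earned by the prefix they complete (no baseline here).
    loss = -(log_probs * rewards.detach()).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the usual GAN fashion, this generator step would alternate with a discriminator step that learns to separate real code from generated code.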

Notes

  1. http://github.com/tonybeltramelli/pix2code/tree/master/datasets.

  2. http://sketch-code.s3.amazonaws.com/data/all_data.zip.


Author information

Corresponding author: Cai Fu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Deng, Y., Fu, C., Li, Y. (2019). CodeeGAN: Code Generation via Adversarial Training. In: Wang, G., Bhuiyan, M.Z.A., De Capitani di Vimercati, S., Ren, Y. (eds) Dependability in Sensor, Cloud, and Big Data Systems and Applications. DependSys 2019. Communications in Computer and Information Science, vol 1123. Springer, Singapore. https://doi.org/10.1007/978-981-15-1304-6_2

  • DOI: https://doi.org/10.1007/978-981-15-1304-6_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1303-9

  • Online ISBN: 978-981-15-1304-6

  • eBook Packages: Computer Science, Computer Science (R0)
