The Second Conversational Intelligence Challenge (ConvAI2)

Dinan, Emily; Logacheva, Varvara; Malykh, Valentin; Miller, Alexander; Shuster, Kurt; Urbanek, Jack; Kiela, Douwe; Szlam, Arthur; Serban, Iulian; Lowe, Ryan; Prabhumoye, Shrimai; Black, Alan W.; Rudnicky, Alexander; Williams, Jason; Pineau, Joelle; Burtsev, Mikhail; Weston, Jason

doi:10.1007/978-3-030-29135-8_7

Part of the book series: The Springer Series on Challenges in Machine Learning (SSCML)

Abstract

We describe the setting and results of the ConvAI2 NeurIPS competition, which aims to advance the state of the art in open-domain chatbots. Two key takeaways from the competition are: (1) pretrained Transformer variants are currently the best-performing models on this task; and (2) to improve performance in multi-turn conversations with humans, future systems must go beyond single-word metrics such as perplexity and measure performance across sequences of utterances (whole conversations), in terms of repetition, consistency, and balance of dialogue acts (e.g., how many questions are asked vs. answered).

Notes

1. http://convai.io/.
2. http://convai.io/2017/data/.
3. https://github.com/DeepPavlov/convai/tree/master/2017/solutions.
4. https://developer.amazon.com/alexaprize.
5. https://en.wikipedia.org/wiki/Loebner_Prize.
6. https://github.com/facebookresearch/ParlAI/tree/master/parlai/tasks/convai2.
7. https://github.com/facebookresearch/ParlAI/tree/master/projects/convai2.
8. ConvAI2 dataset of non-goal-oriented human-to-bot dialogues (2019). V. Logacheva, V. Malykh, A. Litinsky, M. Burtsev.
9. http://github.com/DeepPavlov/convai/data.
10. http://convai.io/NeurIPSParticipantSlides.pptx.
11. https://github.com/atselousov/transformer_chatbot.
12. http://workshop.colips.org/dstc7/.

Acknowledgements

We thank all the competitors for taking part and making this a successful competition. We especially thank the competition’s sponsors, Facebook Academics and Amazon Web Services. The participation of Mikhail Burtsev, Varvara Logacheva, and Valentin Malykh was supported by the National Technology Initiative and PAO Sberbank, project ID 0000000007417F630002.

Author information

Authors and Affiliations

Facebook AI Research, New York, NY, USA
Emily Dinan, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Ryan Lowe, Joelle Pineau & Jason Weston
Moscow Institute of Physics and Technology, Moscow, Russia
Varvara Logacheva, Valentin Malykh & Mikhail Burtsev
University of Montreal, Montreal, QC, Canada
Iulian Serban
McGill University, Montreal, QC, Canada
Ryan Lowe & Joelle Pineau
Carnegie Mellon University, Pittsburgh, PA, USA
Shrimai Prabhumoye, Alan W. Black & Alexander Rudnicky
Microsoft Research, Redmond, WA, USA
Jason Williams

Corresponding author

Correspondence to Emily Dinan.

Editor information

Editors and Affiliations

Universitat de Barcelona and Computer Vision Center, Barcelona, Spain
Sergio Escalera
Amazon (Berlin), Berlin, Germany
Ralf Herbrich

Appendix: Example Dialogues

Example dialogues for some of the top models are given in Figs. 6, 7, 8, 9, 10, and 11.

About this paper

Cite this paper

Dinan, E. et al. (2020). The Second Conversational Intelligence Challenge (ConvAI2). In: Escalera, S., Herbrich, R. (eds) The NeurIPS '18 Competition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-29135-8_7

DOI: https://doi.org/10.1007/978-3-030-29135-8_7
Published: 30 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29134-1
Online ISBN: 978-3-030-29135-8
eBook Packages: Computer Science, Computer Science (R0)
