What is Visual Dialog?

Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a follow-up question about the image, the task is to answer the question.
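To make the task's inputs concrete, here is a minimal sketch of how one example could be represented and flattened into a text prompt. The names `VisualDialogExample` and `build_text_prompt`, the `Q:`/`A:` prompt layout, and the sample image identifier are all hypothetical illustrations. They are not part of the VisDial release or of any published model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VisualDialogExample:
    """One task instance: an image, the dialog so far, and a new question.

    Hypothetical container for illustration only; field names are assumptions.
    """
    image_id: str                                   # reference to the image
    history: List[Tuple[str, str]] = field(default_factory=list)  # (question, answer) rounds
    question: str = ""                              # follow-up question to answer


def build_text_prompt(example: VisualDialogExample) -> str:
    """Flatten the dialog history and the follow-up question into one string.

    The image itself is not encoded here; a real model would consume it
    separately (for example, as visual features).
    """
    lines = [f"Q: {q} A: {a}" for q, a in example.history]
    lines.append(f"Q: {example.question} A:")
    return "\n".join(lines)


example = VisualDialogExample(
    image_id="example_image_0001",  # made-up identifier
    history=[("How many people are in the image?", "Two.")],
    question="What are they doing?",
)
prompt = build_text_prompt(example)
print(prompt)
```

An agent for this task would take the image together with a prompt like this one and produce the answer to the final question.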

VisDial dataset stats:
  • 120k images from COCO
  • 1 dialog / image
  • 10 rounds of question-answer pairs / dialog
  • 1.2M dialog question-answer pairs in total
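The total follows directly from the three figures above. As a quick arithmetic check, assuming the rounded "120k" figure is taken as exactly 120,000 (the constant names are illustrative):

```python
# Figures as listed in the dataset stats; 120k is treated as exactly 120,000.
NUM_IMAGES = 120_000
DIALOGS_PER_IMAGE = 1
ROUNDS_PER_DIALOG = 10

# Each round contributes one question-answer pair.
total_qa_pairs = NUM_IMAGES * DIALOGS_PER_IMAGE * ROUNDS_PER_DIALOG
print(f"{total_qa_pairs:,}")  # prints 1,200,000
```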

Improving Generative Visual Dialog by Answering Diverse Questions

Vishvak Murahari, Prithvijit Chattopadhyay, Dhruv Batra, Devi Parikh, Abhishek Das
EMNLP 2019

Visual Coreference Resolution in Visual Dialog using Neural Module Networks

Satwik Kottur, José M. F. Moura, Devi Parikh, Dhruv Batra and Marcus Rohrbach
ECCV 2018

Evaluating Visual Conversational Agents via Cooperative Human-AI Games

Prithvijit Chattopadhyay*, Deshraj Yadav*, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra and Devi Parikh
* equal contribution
HCOMP 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

Abhishek Das*, Satwik Kottur*, José M. F. Moura, Stefan Lee and Dhruv Batra
* equal contribution
ICCV 2017 (Oral)

Visual Dialog

Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh and Dhruv Batra
CVPR 2017 (Spotlight)


We thank Harsh Agrawal and Jiasen Lu for help with the AMT data collection interface; Xiao Lin, Ramprasaath Selvaraju, and Latha Pemula for model discussions; and Marco Baroni, Antoine Bordes, Mike Lewis, and Marc'Aurelio Ranzato for helpful discussions. Finally, we are grateful to the developers of Torch for building an excellent framework. This work was funded in part by NSF CAREER awards to Dhruv Batra and Devi Parikh, ONR YIP awards to Dhruv Batra and Devi Parikh, ONR Grant N00014-14-1-0679 to Dhruv Batra, a Sloan Fellowship to Devi Parikh, ARO YIP awards to Dhruv Batra and Devi Parikh, an Allen Distinguished Investigator award to Devi Parikh from the Paul G. Allen Family Foundation, ICTAS Junior Faculty awards to Dhruv Batra and Devi Parikh, Google Faculty Research Awards to Dhruv Batra and Devi Parikh, Amazon Academic Research Awards to Dhruv Batra and Devi Parikh, an AWS in Education Research grant to Dhruv Batra, and NVIDIA GPU donations to Dhruv Batra.


Visual Dialog annotations and this website are licensed under a Creative Commons Attribution 4.0 International License.


Visual Dialog does not own the copyright of the images. Use of the images must abide by the COCO and Flickr Terms of Use. The users of the images accept full responsibility for the use of the dataset, including but not limited to the use of any copies of copyrighted images that they may create from the dataset.