What is Visual Dialog?

Visual Dialog is a novel task that requires an AI agent to hold a meaningful dialog with humans about visual content, in natural, conversational language. Specifically, given an image, a dialog history, and a follow-up question about the image, the agent must answer the question.
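The three inputs named above (image, dialog history, follow-up question) can be pictured as one record. The following is a minimal sketch only: `DialogState`, `flatten_context`, and the field names are hypothetical and are not taken from the VisDial data format or from any released code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DialogState:
    """Hypothetical container for one Visual Dialog query."""
    image_path: str  # the image being discussed
    # Earlier (question, answer) rounds, oldest first.
    history: List[Tuple[str, str]] = field(default_factory=list)
    question: str = ""  # the follow-up question the agent must answer


def flatten_context(state: DialogState) -> str:
    """Join the dialog history and the follow-up question into one string."""
    lines = [f"Q: {q} A: {a}" for q, a in state.history]
    lines.append(f"Q: {state.question} A:")
    return "\n".join(lines)


state = DialogState(
    image_path="example.jpg",  # placeholder path, not a real dataset file
    history=[("Is there a person in the image?", "Yes, one man.")],
    question="What is he doing?",
)
print(flatten_context(state))
```

An agent would consume the image together with this text context and produce the answer to the final question. How the two are combined is a modeling choice that this sketch does not cover.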

VisDial dataset:
  • 120k images from COCO
  • 1 dialog per image
  • 10 question-answer rounds per dialog
  • 1.2M question-answer pairs in total
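The total follows directly from the other three figures. A quick check, assuming the rounded count of 120k images (the variable names are illustrative only):

```python
# Rounded figures from the list above.
images = 120_000
dialogs_per_image = 1
rounds_per_dialog = 10

# Each round contributes one question-answer pair.
total_qa_pairs = images * dialogs_per_image * rounds_per_dialog
print(total_qa_pairs)
```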

Visual Chatbot demo

Evaluating Visual Conversational Agents via Cooperative Human-AI Games

Prithvijit Chattopadhyay*, Deshraj Yadav*, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra and Devi Parikh
* equal contribution
HCOMP 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

Abhishek Das*, Satwik Kottur*, José M. F. Moura, Stefan Lee and Dhruv Batra
* equal contribution
ICCV 2017 (Oral)

Visual Dialog

Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh and Dhruv Batra
CVPR 2017 (Spotlight)


