Visual Question Answering (VQA) poses a major challenge for the computer vision and natural language processing communities. Most existing approaches consider video-question pairs individually during training. However, we observe that a VQA task usually involves multiple questions (sequentially generated or not) about the target video, and that these questions are rich in semantic relations with one another. To exploit these relations, we propose a new paradigm for VQA termed Multi-Question Learning (MQL). Inspired by multi-task learning, MQL learns jointly from multiple questions and their corresponding answers for a target video sequence. The learned representations of video-question pairs are then more general to be trans...
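The data-side difference between per-pair training and the multi-question setting described above can be sketched as a grouping step: instead of treating each video-question pair as an independent sample, all questions and answers for the same video are collected into one training unit. This is only an illustrative sketch, not the method of the abstract; the function name `group_questions_by_video`, the `(video_id, question, answer)` tuple layout, and the sample data are assumptions made for the example.

```python
from collections import defaultdict


def group_questions_by_video(samples):
    """Collect every (question, answer) pair under the video it refers to.

    `samples` is an iterable of (video_id, question, answer) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    grouped = defaultdict(list)
    for video_id, question, answer in samples:
        grouped[video_id].append((question, answer))
    return dict(grouped)


# Made-up samples: two questions about video "v1", one about "v2".
samples = [
    ("v1", "What is the man holding?", "a guitar"),
    ("v2", "How many dogs appear?", "two"),
    ("v1", "What does the man do after sitting down?", "plays the guitar"),
]

batches = group_questions_by_video(samples)
# Each entry of `batches` now holds all questions for one video, so a model
# could encode the video once and learn from its questions jointly.
```

Grouping by video keeps each video's questions together in their original order, which matters if the questions were generated sequentially.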
To date, visual question answering (VQA) (i.e., image QA and video QA) is still a holy grail in visi...
Visual Question Answering is a multi-modal task that aims to measure high-level visual understanding...
One of the most intriguing features of the Visual Question Answering (VQA) challenge is the unpredic...
Visual Question Answering (VQA) is a recently proposed multimodal task in the general area of machin...
Visual question answering (VQA) demands simultaneous comprehension of both the image visual content ...
CVPR 2019 accepted paper. Multimodal attentional networks are currently state-of-...
Rich and dense human labeled datasets are among the main enabling factors for the recent advance on ...
We propose a scalable approach to learn video-based question answering (QA): to answer a free-form n...
Visual Question Answering (VQA) is an extremely stimulating and challenging research area where Comp...
This paper proposes to improve visual question answering (VQA) with structured representations of bo...
Visual Question Answering (VQA) requires a simultaneous understanding of images and questions. Exist...
Video Question Answering (VideoQA) requires fine-grained understanding of both video and language mo...
© 2017 IEEE. Visual question answering (VQA) is challenging because it requires a simultaneous under...
Recently, the Visual Question Answering (VQA) task has gained increasing attention in artificial int...