In the past few years, Visual Question Answering (VQA) has seen immense progress in both accuracy and network architectures. From simple end-to-end neural network-based architectures to complex modular architectures that incorporate interpretability and explainability, VQA has been a very dynamic area of research. Recent work has shown that, despite significant progress, VQA models are notoriously brittle to linguistic variations in the questions: a small rephrasing of the question can lead a VQA model to change its answer. However, to the best of our knowledge, the effect of variations in the images, introduced by editing them in a semantic fashion, has not been studied before. In this thesis, we explore how consistent these models are when we ma...
In this paper, we exploit memory-augmented neural networks to predict accurate answers to visual que...
Since its appearance, Visual Question Answering (VQA, i.e. answering a question posed over an image)...
Visual Question Answering (VQA) is a recently proposed multimodal task in the general area of machin...
Visual Question Answering (VQA) models have struggled with counting objects in natural images so far...
Visual Question Answering (VQA) is the task of answering questions based on an image. The field has ...
Deep neural networks have been playing an essential role in many computer vision tasks including Vis...
Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biase...
Since its inception, Visual Question Answering (VQA) has been notoriously known as a...
In recent years, visual question answering (VQA) has become topical. The premise of VQA's significan...
One of the most intriguing features of the Visual Question Answering (VQA) challenge is the unpredic...
Visual Question Answering (VQA) is an extremely stimulating and challenging research area where Comp...
Machine learning has advanced dramatically, narrowing the accuracy gap to humans in multimodal tasks...
Due to the significant advancement of Natural Language Processing and Computer Vision-based models, ...
This paper proposes to improve visual question answering (VQA) with structured representations of bo...
Given visual input and a natural language question about it, the visual question answering (VQA) tas...