Knowledge-based visual question answering (QA) aims to answer a question that requires visually-grounded external knowledge beyond the image content itself. Answering complex questions that require multi-hop reasoning under weak supervision is considered a challenging problem since i) no supervision is given to the reasoning process and ii) high-order semantics of multi-hop knowledge facts need to be captured. In this paper, we introduce the concept of a hypergraph to encode high-level semantics of a question and a knowledge base, and to learn high-order associations between them. The proposed model, Hypergraph Transformer, constructs a question hypergraph and a query-aware knowledge hypergraph, and infers an answer by encoding inter-associatio...
Multi-hop logical reasoning is an established problem in the field of representation learning on kno...
We propose a method for visual question answering which combines an internal representation of the c...
Achieving artificial visual reasoning, the ability to answer image-related qu...
Knowledge-based visual question answering (VQA) is a vision-language task that requires an agent to ...
Collaborative reasoning for knowledge-based visual question answering is challenging but vital and e...
Accurately answering a question about a given image requires combining observations with general kno...
Humans have a remarkable capability to learn new concepts, process them in relation to their existin...
We describe a method for visual question answering which is capable of reasoning about an image on t...
Visual Question Answering (VQA) has emerged as an important problem spanning Computer Vision, Natura...
We present a new pre-training method, Multimodal Inverse Cloze Task, for Knowledge-based Visual Ques...
Knowledge graph (KG) is known to be helpful for the task of question answering (QA), since it provid...
Recent works on knowledge base question answering (KBQA) retrieve subgraphs for easier reasoning. A ...
Pathology imaging is routinely used to detect the underlying effects and causes of diseases or injur...
Recently, there has been an increasing interest in building question answering (QA) models that reas...
In visual reasoning, the achievement of deep learning significantly improved the accuracy of results...