We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system’s successes and failures we reveal how visual context influences the semantics of utter...
Distributional semantic models capture word-level meaning that is useful in many natural language pr...
Artificial Intelligence (AI) technologies affect many facets of our daily lives. AI systems help us ...
In this paper we explore the use of visual common-sense knowledge and other kinds of knowledge (such...
We propose a computational model of visually-grounded spatial language understanding, based on a st...
Grounding language in the physical world enables humans to use words and sentences in context and to...
Zarrieß S, Schlangen D. Deriving continuous grounded meaning representations from referentially struc...
We present a visually grounded model of speech perception which projects spoken utterances and image...
The fundamental claim of this paper is that salience—both visual and linguistic—is an import...
Humans naturally use referring expressions with verbal utterances and nonverbal gestures to refer to...
The problem of how abstract symbols, such as those in systems of natural language, may be grounded ...
Vorweg C, Wachsmuth S, Socher G. Visually grounded language processing in object reference. In: Rick...
© 2016 John Benjamins Publishing Company. This is the accepted manuscript of a chapter published in ...
Referring expression comprehension aims at grounding the object in an image referred to by the expre...