In this paper, we describe Docforia, a multilayer document model and application programming interface (API) to store and visualize formatting, lexical, syntactic, and semantic annotations on Wikipedia and other kinds of text. While Wikipedia has become a major NLP resource, its scale and heterogeneity make it relatively difficult to run experiments on the whole corpus. These experiments are made even more complex because, to the best of our knowledge, no tool is available to easily visualize the results of a processing pipeline. We designed Docforia so that it can store millions of documents and billions of tokens, annotated using different processing tools that themselves use multiple formats, and compatible with cluster c...
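To make the idea of a multilayer document model concrete, the following is a minimal Python sketch of stand-off annotation, where each layer stores character spans over one shared text. It is not Docforia's API: the names `Annotation`, `LayeredDocument`, `add`, `covered_text`, and `within`, as well as the layer names and the sample sentence, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    """A stand-off annotation: a character span plus free-form properties."""
    start: int
    end: int
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class LayeredDocument:
    """Hypothetical multilayer document: raw text plus named annotation layers."""
    text: str
    layers: Dict[str, List[Annotation]] = field(default_factory=dict)

    def add(self, layer: str, start: int, end: int, **properties: str) -> Annotation:
        # Reject spans that do not fit inside the document text.
        if not (0 <= start <= end <= len(self.text)):
            raise ValueError("span outside document text")
        node = Annotation(start, end, dict(properties))
        self.layers.setdefault(layer, []).append(node)
        return node

    def covered_text(self, node: Annotation) -> str:
        # The text is stored once; annotations only point into it.
        return self.text[node.start:node.end]

    def within(self, layer: str, outer: Annotation) -> List[Annotation]:
        """Annotations of `layer` fully contained in `outer`, in text order."""
        hits = [a for a in self.layers.get(layer, [])
                if outer.start <= a.start and a.end <= outer.end]
        return sorted(hits, key=lambda a: (a.start, a.end))


doc = LayeredDocument("Lund is a city in Sweden.")

# Lexical layer: one annotation per word or punctuation mark.
for m in re.finditer(r"\w+|[^\w\s]", doc.text):
    doc.add("token", m.start(), m.end())

# Syntactic and semantic layers are independent of the token layer.
sentence = doc.add("sentence", 0, len(doc.text))
sweden = doc.add("entity", 18, 24, type="LOC")

tokens_in_sentence = [doc.covered_text(t) for t in doc.within("token", sentence)]
tokens_in_entity = [doc.covered_text(t) for t in doc.within("token", sweden)]
```

Because every layer refers to the same text by offsets, annotations produced by different tools can coexist without rewriting the text, and cross-layer queries such as "tokens inside this entity" reduce to span containment.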
Wikipedia has become one of the most popular resources in natural language processing and it is used...
While Wikipedia exists in 287 languages, its content is unevenly distributed among them. It is there...
The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles. F...
In this paper, we describe Langforia, a multilingual processing pipeline to annotate texts with mult...
In this paper, we describe KOSHIK, an end-to-end framework to process the unstructured natural langu...
In this paper, we describe a new system to extract, index, search, and visualize entities on Wikiped...
This paper describes SW1, the first version of a semantically annotated snapshot of the English Wikip...
We analyze Wikipedia as a lexical ...
The advent of Wikipedia as the best digital representation of cross-domain knowledge is no...
The DBpedia community project extracts structured, multilingual knowledge from Wikipedia and makes i...
WikiWoods is an ongoing initiative to provide rich syntacto-semantic annotations for English Wikiped...
Wikipedia is a goldmine of information. Each article describes a single concept, and together they c...