This repository contains a ground truth corpus for open-domain relation extraction from sentences, acquired through crowdsourcing and processed with CrowdTruth metrics, which capture ambiguity in annotations by measuring inter-annotator disagreement. The dataset contains annotations for 4,100 sentences sampled from Angeli et al. (1) and Riedel et al. (2), covering 16 relations, with each sentence annotated by 15 workers. The sentences were pre-processed with distant supervision (3) using the Freebase knowledge base to identify the term pairs in each sentence that are likely to express a relation. The crowdsourced data was collected on Figure Eight and Amazon Mechanical Turk. This corpus has been discussed in the following papers:
- This paper describes a crowdsourcing experiment on the annotation of plot-like structures in English...
- Extracting information from Web pages for populating large, cross-domain knowledge bases r...
- Crowdsourcing is a common strategy for collecting the "gold standard" labels required for many natu...
- One of the first steps in most web data analytics is creating a human-annotated ground truth, typic...
- The lack of annotated datasets for training and benchmarking is one of the main challenges of Cli...
- Cognitive computing systems require human-labeled data for evaluation and often for training. The st...
- A widespread use of linked data for information extraction is distant supervision, in which relation...
- We present a new large dataset of 12403 context-sensitive verb relations manually annotated via crow...
- This paper proposes an approach to gathering semantic annotation, which rejects the notio...
- In this paper, we introduce the CrowdTruth open-source software framework for machine-hum...
- Typically crowdsourcing-based approaches to gather annotated data use inter-annotator agreement as a...
- The process of gathering ground truth data through human annotation is a major bottleneck in the use...
- Information Extraction is an important task in Natural Language Processing, consisting of finding a ...
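To illustrate the idea of disagreement-aware metrics described above, here is a minimal sketch of how per-sentence relation scores could be computed from raw worker annotations. This is a simplified illustration, not the official CrowdTruth implementation (which uses vector-based similarity measures); the relation names and the `annotations` structure are hypothetical.

```python
# Simplified, hypothetical sketch of disagreement-aware annotation scores
# (not the official CrowdTruth metrics, which use worker-vector similarities).
from collections import Counter

def sentence_relation_scores(annotations):
    """Fraction of workers who selected each relation for one sentence.

    annotations: list with one entry per worker; each entry is the list
    of relations that worker selected for the sentence.
    """
    counts = Counter()
    for worker_choices in annotations:
        counts.update(set(worker_choices))  # count each worker at most once per relation
    n_workers = len(annotations)
    return {rel: c / n_workers for rel, c in counts.items()}

def sentence_clarity(annotations):
    """Highest relation score: near 1.0 for clear sentences, low for ambiguous ones."""
    scores = sentence_relation_scores(annotations)
    return max(scores.values()) if scores else 0.0

# In the corpus each sentence has 15 workers; a toy example with 4 workers:
votes = [
    ["place_of_birth"],
    ["place_of_birth"],
    ["lived_in"],
    ["place_of_birth", "lived_in"],
]
print(sentence_relation_scores(votes))  # {'place_of_birth': 0.75, 'lived_in': 0.5}
print(sentence_clarity(votes))          # 0.75
```

A low clarity score flags a sentence whose relation is ambiguous to human readers, which is exactly the signal the corpus is designed to preserve rather than discard as annotator noise.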