Datasets for training abusive language detection models are both necessary and still scarce. One of the reasons for their limited availability is the cost of their creation. Not only is manual annotation expensive, but the phenomenon is also sparse, so human annotators must go through a large number of irrelevant examples to obtain a significant amount of data. Strategies used so far to increase the density of abusive language and obtain more meaningful data overall include filtering data on the basis of pre-selected keywords and hate-rich sources of data. We suggest a recipe that at the same time can provide meaningful data with possibly higher density of abusive language and also reduce top-...
Sources, in the form of selected Facebook pages, can be used as indicators of hate-rich content. Pol...
Building on current work on multilingual hate speech (e.g., Ousidhoum et al. (2019)) and hate speech...
The detection of hate speech in social media is a crucial task. The uncontrolled spread of hate has ...
In this paper, we introduce HateBERT, a re-trained BERT model for abusive language detection in Engl...
Well-annotated data is a prerequisite for good Natural Language Processing models. Too often, though...
As research on hate speech becomes more and more relevant every day, most of it is still focused on ...
As research on hate speech becomes more and more relevant every day, most of it is still focused on ...
Automated hate speech detection systems have great potential in the realm of social media but have s...
The automatic detection of hate speech online is an active research area in NLP. Most of the studies...
In this paper we present a proposal to address the problem of the pricey and unreliable human annota...
Abusive language is a concerning problem in online social media. Past research on detecting abusive ...
We present a method to generate polarised word embeddings using controversial topics as search terms...