The use of abusive language online has become an increasingly pervasive problem that harms both individuals and society, with effects ranging from psychological damage to escalation into real-life violence and even death. Machine learning models have been developed to detect abusive language automatically, but these models can suffer from temporal bias: the phenomenon in which topics, language use, or social norms change over time. This study investigates the nature and impact of temporal bias in abusive language detection across several languages and explores mitigation methods. We evaluate the performance of models on abusive datasets from different time periods. Our results demonstrate that temporal bias is a significan...
Platforms that feature user-generated content (social media, online forums, newspaper comment sectio...
Detection of abusive language in user generated online con-tent has become an issue of increasing im...
Data Availability: The data used in this work is a public dataset.Copyright © The Author(s) 2023. So...
The availability of large annotated corpora from social media and the development of powerful classi...
We discuss the impact of data bias on abusive language detection. We show that classification scores...
The rise of online communication platforms has been accompanied by some undesirable effects, such as...
Datasets to train models for abusive language detection are at the same time necessary and still sca...
Online abusive language has been given increasing prominence as a societal problem over the past few...
We propose a new computational approach for tracking and detecting statistically significant linguis...
Keeping the performance of language technologies optimal as time passes is of great practical intere...
International audienceRapidly changing social media content calls for robust and generalisable abuse...
The automatic detection of hate speech online is an active research area in NLP. Most of the studies...
The datasets most widely used for abusive language detection contain lists of messages, usually twe...