Language models are ubiquitous in current NLP, and their multilingual capacity has recently attracted considerable attention. However, current analyses have focused almost exclusively on (multilingual variants of) standard benchmarks, and have relied on clean pre-training and task-specific corpora as multilingual signals. In this paper, we introduce XLM-T, a model to train and evaluate multilingual language models on Twitter. We provide: (1) a new strong multilingual baseline consisting of an XLM-R (Conneau et al. 2020) model pre-trained on millions of tweets in over thirty languages, alongside starter code to subsequently fine-tune on a target task; and (2) a set of unified sentiment analysis Twitter datasets in eight differe...
We study subjective language in social media and create Twitter-specific lexicons via bootstrapping...
Word embeddings represent words in a numeric space in such a way that semantic relations between wor...
This paper describes the submission of UZH_CLyp for the SemEval 2023 Task 9 "Multilingual Tweet Inti...
Sentiment analysis is currently a very dynamic field in Computational Linguistics. Research herein h...
Sentiment analysis is one of the most widely studied applications in NLP, but most work focuses on l...
Twitter Sentiment Analysis is one of the leading research fields nowadays. Most of the researchers h...
The digitalization of almost all aspects of our everyday lives has led to unprecedented amounts of d...
The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators...
Twitter sentiment analysis is one of the leading research fields. Most of the researchers were contr...
In online domain-specific customer service applications, many companies struggle to deploy advanced ...
Emotions are experienced and expressed differently across the world. In order to use Large Language ...
The xLiMe Twitter Corpus contains tweets in German, Italian and Spanish manually annotated with part...
Since BERT appeared, Transformer language models and transfer learning have become state-of-the-art ...
We present TwHIN-BERT, a multilingual language model trained on in-domain data from the popular soci...
We carried out a study in which we explored the feasibility of machine translation for Twitter for t...