In this paper, we present BERT-POS, a simple method for encoding syntax into BERT embeddings without retraining or fine-tuning, based on Part-Of-Speech (POS) tags. Although fine-tuning is the most popular way to apply BERT models to domain datasets, it remains expensive in terms of training time, computing resources, training-data selection and retraining frequency. Our alternative works at the preprocessing level and relies on POS tagging of sentences. It gives interesting results for word similarity on out-of-vocabulary items, both for domain-specific words and for misspellings. More specifically, the experiments were done on French, but we believe the results would be similar for other languages.
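The abstract describes a preprocessing-level approach: sentences are POS-tagged before being fed to the model. As a minimal hedged sketch (not the authors' exact scheme), one way to inject POS information at preprocessing time is to fuse each token with its tag in the model input; the toy tagger and the `token/TAG` fusion format below are illustrative assumptions:

```python
# Illustrative sketch of preprocessing-level POS injection.
# The tiny lexicon-based tagger and the "token/TAG" format are
# assumptions for demonstration, not the BERT-POS specification.

TOY_LEXICON = {
    "le": "DET",
    "chat": "NOUN",
    "dort": "VERB",
}

def toy_pos_tag(tokens):
    """Assign a POS tag to each token; unknown words get 'X'."""
    return [TOY_LEXICON.get(t.lower(), "X") for t in tokens]

def pos_augment(sentence):
    """POS-tag a sentence and fuse each token with its tag,
    producing an input string a BERT tokenizer could consume."""
    tokens = sentence.split()
    tags = toy_pos_tag(tokens)
    return " ".join(f"{tok}/{tag}" for tok, tag in zip(tokens, tags))

print(pos_augment("le chat dort"))  # le/DET chat/NOUN dort/VERB
```

In a real pipeline the toy tagger would be replaced by a proper POS tagger, and the augmented string would be passed to the BERT tokenizer unchanged, which is what keeps the method free of retraining.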
This paper presents a methodology for improving part-of-speech disambiguation using word classes. We...
Part-of-speech (POS) tagging is a fundamental component for performing natural language tasks such a...
Large pretrained masked language models have become state-of-the-art solutions for many NLP problems....
We describe the system submitted to SemEval-2020 Task 6, Subtask 1. The aim of this subtask is to pr...
Pretrained language model representations are now ubiquitous in Natural Language...
Defending models against Natural Language Processing adversarial attacks is a challenge because of t...
Nowadays, contextual language models can solve a wide range of language tasks such as text classific...
We present a simple yet effective approach to adapt part-of-speech (POS) taggers to new domains. Our...
Language models have proven to be very useful when adapted to specific domains...
Pre-trained language models such as BERT (Bidirectional Encoder Representations from Transformers) h...
Large pre-trained language models such as BERT have been the driving force behind recent improvement...
Large pre-trained masked language models have become state-of-the-art solutions for many NLP problem...
Semantic similarity detection is a fundamental task in natural language understanding. Adding topic ...
Language identification is the task of automatically determining the identity of a language conveyed...