We present a benchmark suite of four datasets for evaluating the fairness of pre-trained legal language models and the techniques used to fine-tune them for downstream tasks. Our benchmarks cover four jurisdictions (European Council, USA, Switzerland, and China), five languages (English, German, French, Italian, and Chinese), and fairness across five attributes (gender, age, nationality/region, language, and legal area). In our experiments, we evaluate pre-trained language models using several group-robust fine-tuning techniques and show that group performance disparities are pronounced in many cases, while none of these techniques guarantees fairness or consistently mitigates group disparities. Furthermore, we provide a quantitative and qualitati...
Large, high-quality datasets are crucial for training Large Language Models (LLMs). However, so far,...
The recent advances of deep learning have dramatically changed how machine learning, especially in t...
Availability of challenging benchmarks is the key to advancement of AI in a specific field. Since Leg...
Massively multilingual pre-trained language models, such as mBERT and XLM-RoBERTa, have received sig...
This benchmark dataset is published with the article: Ilias Chalkidis, Abhik Jana, Dirk Hartung, M...
We introduce four new datasets within the Swiss jurisdiction for two classification and two generati...
Resolving the scope of a negation within a sentence is a challenging NLP task. The complexity of leg...
Lately, propelled by the phenomenal advances around the transformer architecture, the legal NLP fiel...
Recent strides in Large Language Models (LLMs) have saturated many NLP benchmarks (even professional...
Law, interpretations of law, legal arguments, agreements, etc. are typically expressed in writing, l...
Natural Language Processing (NLP) models have been used for more and more complex tasks such as Lega...
Laws and their interpretations, legal arguments and agreements are typically expressed in writing, ...
How cross-linguistically applicable are NLP models, specifically language models? A fair comparison ...