Recent work has demonstrated that pre-training in-domain language models can boost performance when adapting to a new domain. However, the costs associated with pre-training raise an important question: given a fixed budget, what steps should an NLP practitioner take to maximize performance? In this paper, we view domain adaptation with a constrained budget as a consumer choice problem, where the goal is to select an optimal combination of data annotation and pre-training. We measure the annotation costs of three procedural text datasets, along with the pre-training costs of several in-domain language models. The utility of different combinations of pre-training and data annotation is evaluated under varying budget constraints to assess which combination strategy works best.
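To make the consumer-choice framing concrete, here is a minimal sketch of allocating a fixed budget between data annotation and in-domain pre-training: enumerate the feasible combinations under the budget and keep the one with the highest estimated utility. All numbers below (total budget, per-label cost, pre-training costs) and the utility_estimate function are hypothetical placeholders, not figures from the paper; in practice, utility would come from the measured dev-set performance of a model trained under each allocation.

    # Hypothetical costs for illustration only (assumed, not from the paper).
    BUDGET = 1000.0            # total budget in dollars
    COST_PER_LABEL = 0.50      # annotation cost per labeled instance
    PRETRAIN_COSTS = {         # cost of each in-domain pre-training option
        "no-pretraining": 0.0,
        "small-domain-lm": 300.0,
        "large-domain-lm": 800.0,
    }

    def utility_estimate(n_labels: int, lm: str) -> float:
        """Placeholder utility: diminishing returns over labels plus a
        fixed bonus for in-domain pre-training. Purely illustrative;
        replace with measured task performance."""
        lm_bonus = {"no-pretraining": 0.0,
                    "small-domain-lm": 2.0,
                    "large-domain-lm": 3.5}[lm]
        return 0.1 * n_labels ** 0.5 + lm_bonus

    # Enumerate feasible allocations: pick a pre-training option, then
    # spend whatever budget remains on annotation.
    best = None
    for lm, lm_cost in PRETRAIN_COSTS.items():
        remaining = BUDGET - lm_cost
        if remaining < 0:
            continue  # this pre-training option alone exceeds the budget
        n_labels = int(remaining // COST_PER_LABEL)
        score = utility_estimate(n_labels, lm)
        if best is None or score > best[0]:
            best = (score, lm, n_labels)

    score, lm, n_labels = best
    print(f"Best allocation: {lm} + {n_labels} labels (utility {score:.2f})")

Because the pre-training options are discrete and any leftover budget is best spent on annotation, exhaustive enumeration over the options is sufficient here; with the placeholder numbers above, the sketch picks the cheaper in-domain model plus a large batch of labels, mirroring the trade-off the abstract describes.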
Domain adaptation for machine translation (MT) can be achieved by selecting training instances close...
The best performing NLP models to date are learned from large volumes of manually-annotated data. Fo...
• Porting to new domains or applications is expensive
• Current technology requires IE experts
• Exp...
Recent advances in NLP are brought by a range of large-scale pretrained language models (PLMs). Thes...
The performance of a machine learning model trained on labeled data of a (source) domain degrades se...
Natural language processing (NLP) algorithms are rapidly improving but often struggle when applied t...
Pretrained language models have become the standard approach for many NLP tasks due to strong perfor...
In this paper, we propose a new domain adaptation technique for neural machine translation called co...
Pretrained language models have shown success in various areas of natural language processing, inclu...
The remarkable success of large language models has been driven by dense models trained on massive u...
With the fast growth of the amount of digitalized texts in recent years, text information management...
Large language models (LLMs) have demonstrated remarkable open-domain capabilities. Traditionally, L...
As the demand for sophisticated Natural Language Processing (NLP) models continues to grow, so does ...
Neural network training has been shown to be advantageous in many natural language processing appli...
We study the highly practical but comparatively under-studied problem of latent-domain adaptation, w...