Instruction tuning has remarkably advanced large language models (LLMs) in understanding and responding to diverse human instructions. Despite the success in high-resource languages, its application in lower-resource ones faces challenges due to the imbalanced foundational abilities of LLMs across different languages, stemming from the uneven language distribution in their pre-training data. To tackle this issue, we propose pivot language guided generation (PLUG), an approach that utilizes a high-resource language, primarily English, as the pivot to enhance instruction tuning in lower-resource languages. It trains the model to first process instructions in the pivot language, and then produce responses in the target language. To evaluate ou...
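The pivot-then-target ordering described above can be illustrated with a minimal sketch of how one training example might be serialized. Everything in it is an assumption made for illustration and is not taken from the PLUG paper: the `PivotExample` fields, the bracketed section markers, the prompt template, and the inclusion of an intermediate pivot-language response. The abstract itself only states that the instruction is processed in the pivot language before the target-language response is produced.

```python
from dataclasses import dataclass


@dataclass
class PivotExample:
    """One training example; all field names are hypothetical."""
    target_instruction: str  # instruction written in the lower-resource language
    pivot_instruction: str   # the same instruction in the pivot language
    pivot_response: str      # assumed intermediate response in the pivot language
    target_response: str     # final response in the target language


def build_training_text(ex: PivotExample, pivot: str = "English",
                        target: str = "Spanish") -> tuple[str, str]:
    """Return (prompt, completion) with the pivot-language part placed first."""
    prompt = f"### Instruction:\n{ex.target_instruction}\n\n### Response:\n"
    completion = (
        f"[{pivot} instruction]\n{ex.pivot_instruction}\n\n"
        f"[{pivot} response]\n{ex.pivot_response}\n\n"
        f"[{target} response]\n{ex.target_response}"
    )
    return prompt, completion


def extract_target_response(completion: str, target: str = "Spanish") -> str:
    """Keep only the text after the target-language marker (empty if absent)."""
    marker = f"[{target} response]\n"
    _, found, tail = completion.partition(marker)
    return tail.strip() if found else ""


if __name__ == "__main__":
    example = PivotExample(
        target_instruction="Nombra un color primario.",
        pivot_instruction="Name one primary color.",
        pivot_response="Red is a primary color.",
        target_response="El rojo es un color primario.",
    )
    prompt, completion = build_training_text(example)
    print(prompt + completion)
    print(extract_target_response(completion))
```

At inference time, under the same assumptions, only the text after the target-language marker would be shown to the user, which is what `extract_target_response` does.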
Neural Machine Translation has been shown to enable inference and cross-lingual knowledge transfer ...
Chinese and Spanish are the most spoken languages in the world. However, there is not much research ...
Pivoting through a popular language with more parallel corpora available (e.g. English and Chinese) ...
A key technology for the development of large language models (LLMs) involves instruction tuning tha...
Open-sourced large language models (LLMs) have demonstrated remarkable efficacy in various tasks wit...
Pre-trained multilingual language models show significant performance gains for zero-shot cross-ling...
We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Taggi...
Recent advancements in Large Language Models (LLMs) have expanded the horizons of natural language u...
Large Language Models (LLMs) present strong general capabilities, and a current compelling challenge...
High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collec...
Large Language Models (LLMs), trained predominantly on extensive English data, often exhibit limitat...
In cross-lingual language understanding, machine translation is often utilized to enhance the transf...
Translation with pivot languages has recently gained attention as a means to circumvent the data bot...
The large language model (LLM) has garnered significant attention due to its in-context learning mec...