Large language models~(LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden. Instead of directly optimizing the discrete instruction, we optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM. On each iteration of the proposed method, which we call InstructZero, a soft prompt is converted into an instruction by the open-source LLM; the instruction is then submitted to the black-box LLM for zero-shot evaluation, and the resulting performance is passed to Bayesian optimization, which produces new soft prompts that improve zero-shot performance. We evaluate InstructZero on dif...
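The abstract above describes an optimization loop: a low-dimensional soft prompt is decoded into an instruction by an open-source LLM, the instruction is scored zero-shot by the black-box LLM, and Bayesian optimization proposes the next soft prompt. The following is a minimal sketch of such a loop, not the authors' implementation; `generate_instruction` and `evaluate_zero_shot` are hypothetical stand-ins for the open-source LLM decoding step and the black-box LLM API call, and the Gaussian-process surrogate with a UCB acquisition over random candidates is one simple choice of Bayesian optimizer.

```python
# Minimal sketch of an InstructZero-style loop (hypothetical helpers,
# simplified Bayesian optimization with a GP surrogate + UCB acquisition).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

DIM = 10            # dimension of the low-dimensional soft prompt
N_INIT, N_ITER = 5, 20

def generate_instruction(soft_prompt: np.ndarray) -> str:
    """Hypothetical: project the soft prompt into the open-source LLM's
    input space and decode an instruction string for the black-box LLM."""
    return f"instruction derived from prompt {soft_prompt[:2].round(2)}"

def evaluate_zero_shot(instruction: str) -> float:
    """Hypothetical: submit the instruction plus task inputs to the
    black-box LLM and return a zero-shot performance score."""
    return float(np.random.rand())

# Initial random soft prompts and their zero-shot scores.
X = np.random.uniform(-1.0, 1.0, size=(N_INIT, DIM))
y = np.array([evaluate_zero_shot(generate_instruction(x)) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(N_ITER):
    gp.fit(X, y)
    # Acquisition: upper confidence bound over random candidate prompts.
    candidates = np.random.uniform(-1.0, 1.0, size=(256, DIM))
    mean, std = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(mean + 1.0 * std)]
    score = evaluate_zero_shot(generate_instruction(x_next))
    X = np.vstack([X, x_next])
    y = np.append(y, score)

print("best zero-shot score found:", y.max())
```

In this sketch the gradient-free nature of the problem is what motivates the surrogate model: only scalar scores returned by the black-box LLM are used, never backpropagated gradients.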
Large language models (LLMs) show excellent performance but are compute- and memory-intensive. Quant...
The ability to follow instructions is crucial for Large Language Models (LLMs) to handle various rea...
Deploying large language models (LLMs) is challenging because they are memory inefficient and comput...
In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension a...
Prompt-based learning has been an effective paradigm for large pretrained language models (LLMs), ena...
Black-Box Tuning (BBT) is a derivative-free approach to optimize continuous prompt tokens prepended ...
High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collec...
Instruction tuning is instrumental in enabling Large Language Models~(LLMs) to follow user instructi...
Recently, instruction fine-tuning has risen to prominence as a potential method for enhancing the ze...
Large language models (LLMs), such as GPT-4, PaLM, and LLaMa, have been shown to achieve remarkable ...
Enhancing the zero-shot performance of instruction-following models requires heavy computation, eith...
The practice of transferring knowledge from a sophisticated, proprietary large language model (LLM) ...
This paper presents AutoHint, a novel framework for automatic prompt engineering and optimization fo...
Large Language Models (LLMs) present strong general capabilities, and a current compelling challenge...
Sparse Mixture-of-Experts (MoE) is a neural architecture design that can be utilized to add learnabl...