Large Language Models (LLMs) have been a significant landmark of Artificial Intelligence (AI) advancement. Aligning LLMs to be helpful and harmless is a booming trend in Natural Language Processing (NLP). One of the dominant alignment techniques is reinforcement learning from human feedback (RLHF). RLHF aims to optimize one objective based on human preferences. However, the cost of high-quality human feedback is enormous. Having all human annotators consistent in their opinions on desirable behaviors is also challenging. LLM alignment is intrinsically a multi-objective optimization task since the goal is to train models to be helpful and harmless. It is found that helpfulness and harmlessness sometimes have problems in trade-offs, making it...
Large Language Models (LLMs) are central to a multitude of applications but struggle with significan...
Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capabi...
Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) a...
While Reinforcement Learning from Human Feedback (RLHF) aligns Large Language Models (LLMs) with gen...
With the development of large language models (LLMs), striking a balance between the performance and...
Reinforcement learning from human feedback (RLHF) is effective at aligning large language models (LL...
We tackle the problem of aligning pre-trained large language models (LMs) with human preferences. If...
Large Multimodal Models (LMM) are built across modalities and the misalignment between two modalitie...
Recent years have witnessed remarkable progress made in large language models (LLMs). Such advanceme...
In this paper, we introduce a new approach to Reinforcement Learning (RL) called “supervised attenti...
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align wi...
Deploying learning systems in the real-world requires aligning their objectives with those of the hu...
A key technology for the development of large language models (LLMs) involves instruction tuning tha...
One of the important goals of Artificial Intelligence (AI) is to mimic the ability of humans to leve...
Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large Language Models...
Large Language Models (LLMs) are central to a multitude of applications but struggle with significan...
Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capabi...
Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) a...
While Reinforcement Learning from Human Feedback (RLHF) aligns Large Language Models (LLMs) with gen...
With the development of large language models (LLMs), striking a balance between the performance and...
Reinforcement learning from human feedback (RLHF) is effective at aligning large language models (LL...
We tackle the problem of aligning pre-trained large language models (LMs) with human preferences. If...
Large Multimodal Models (LMM) are built across modalities and the misalignment between two modalitie...
Recent years have witnessed remarkable progress made in large language models (LLMs). Such advanceme...
In this paper, we introduce a new approach to Reinforcement Learning (RL) called “supervised attenti...
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align wi...
Deploying learning systems in the real-world requires aligning their objectives with those of the hu...
A key technology for the development of large language models (LLMs) involves instruction tuning tha...
One of the important goals of Artificial Intelligence (AI) is to mimic the ability of humans to leve...
Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large Language Models...
Large Language Models (LLMs) are central to a multitude of applications but struggle with significan...
Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capabi...
Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) a...