Chain-of-Thought (CoT) is a technique that guides Large Language Models (LLMs) to decompose complex tasks into multi-step reasoning through intermediate steps expressed in natural language. In short, CoT enables LLMs to think step by step. However, although many Natural Language Understanding (NLU) tasks also require step-by-step thinking, LLMs underperform small-scale Masked Language Models (MLMs) on them. To migrate CoT from LLMs to MLMs, we propose Chain-of-Thought Tuning (CoTT), a two-step reasoning framework based on prompt tuning, to implement step-by-step thinking for MLMs on NLU tasks. From the perspective of CoT, CoTT's two-step framework enables MLMs to implement task decomposition; CoTT's prompt tuning allows intermediate steps to be...
Large Language Models (LLMs) have ushered in a transformative era in the field of natural language p...
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- signific...
Large language models (LLMs) have a substantial capacity for high-level analogical reasoning: reprod...
Language models (LMs) with fewer than 100B parameters are known to perform poorly on chain-of-thought...
Large language models (LLMs) have achieved remarkable advancements in the field of natural language ...
Large language models (LLMs) can perform complex reasoning by generating intermediate reasoning step...
Chain-of-Thought (CoT) prompting and its variants explore equipping large language models (LLMs) with...
Large language models (LMs) beyond a certain scale demonstrate the emergent capability of generatin...
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language proce...
Existing text scaling methods often require a large corpus, struggle with short texts, or require la...
The knowledge-augmented deep learning paradigm refers to a paradigm in which domain knowledge is ide...
Logical reasoning remains a pivotal component within the realm of artificial intelligence. The recen...
Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought pro...
Emergent chain-of-thought (CoT) reasoning capabilities promise to improve performance and explainabi...
To augment language models with the ability to reason, researchers usually prompt or finetune them t...