Large language models (LLMs) have achieved remarkable advancements in the field of natural language processing. However, the sheer scale and computational demands of these models present formidable challenges when considering their practical deployment in resource-constrained contexts. While techniques such as chain-of-thought (CoT) distillation have displayed promise in distilling LLMs into small language models (SLMs), there is a risk that distilled SLMs may still carry over flawed reasoning or hallucinations inherited from their LLM counterparts. To address these issues, we propose a twofold methodology: First, we introduce a novel method for distilling the self-evaluation capability inherent in LLMs into SLMs, which aims to mitigate the...
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language proce...
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities...
Abstract Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results ac...
Large language models (LMs) beyond a certain scale, demonstrate the emergent capability of generatin...
We explore the intriguing possibility that theory of mind (ToM), or the uniquely human ability to im...
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tas...
In the present study, we investigate and compare reasoning in large language models (LLM) and humans...
Logical reasoning consistently plays a fundamental and significant role in the domains of knowledge ...
Chain-of-Thought (CoT) is a technique that guides Large Language Models (LLMs) to decompose complex ...
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging ...
Deploying large language models (LLMs) is challenging because they are memory inefficient and comput...
While large language models (LLMs) exhibit impressive language understanding and in-context learning...
We present an empirical evaluation of various outputs generated by nine of the most widely-available...
The development of highly fluent large language models (LLMs) has prompted increased interest in ass...
Large language models (LLMs) have a substantial capacity for high-level analogical reasoning: reprod...
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language proce...
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities...
Abstract Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results ac...
Large language models (LMs) beyond a certain scale, demonstrate the emergent capability of generatin...
We explore the intriguing possibility that theory of mind (ToM), or the uniquely human ability to im...
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tas...
In the present study, we investigate and compare reasoning in large language models (LLM) and humans...
Logical reasoning consistently plays a fundamental and significant role in the domains of knowledge ...
Chain-of-Thought (CoT) is a technique that guides Large Language Models (LLMs) to decompose complex ...
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging ...
Deploying large language models (LLMs) is challenging because they are memory inefficient and comput...
While large language models (LLMs) exhibit impressive language understanding and in-context learning...
We present an empirical evaluation of various outputs generated by nine of the most widely-available...
The development of highly fluent large language models (LLMs) has prompted increased interest in ass...
Large language models (LLMs) have a substantial capacity for high-level analogical reasoning: reprod...
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language proce...
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities...
Abstract Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results ac...