Emergent chain-of-thought (CoT) reasoning capabilities promise to improve the performance and explainability of large language models (LLMs). However, it remains uncertain how reasoning strategies formulated for earlier model generations generalize to newer models and to different datasets. In this small-scale study, we compare reasoning strategies induced by zero-shot prompting across six recently released LLMs (davinci-002, davinci-003, GPT-3.5-turbo, GPT-4, Flan-T5-xxl and Cohere command-xlarge) on a mixture of six question-answering datasets, including datasets from scientific and medical domains. Our findings demonstrate that while some variations in effectiveness occur, gains from CoT reasoning strategies remain rob...
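For context, zero-shot CoT strategies of the kind compared in this study typically work in two stages: a trigger phrase appended to the question elicits a free-text reasoning chain, and a second call extracts the final answer. A minimal sketch follows; the `query_model` callable and the two-stage wiring are illustrative assumptions rather than the study's actual evaluation harness, and only the trigger phrase "Let's think step by step." comes from the literature (Kojima et al., 2022).

```python
# Minimal sketch of a zero-shot CoT reasoning strategy.
# `query_model` is a hypothetical stand-in for any LLM completion call.

def zero_shot_cot(question: str, query_model,
                  trigger: str = "Let's think step by step.") -> str:
    """Two-stage zero-shot CoT: elicit reasoning, then extract an answer."""
    # Stage 1: append the reasoning trigger so the model produces a chain of thought.
    reasoning = query_model(f"Q: {question}\nA: {trigger}")
    # Stage 2: condition on the generated chain and ask for a short final answer.
    answer = query_model(
        f"Q: {question}\nA: {trigger} {reasoning}\nTherefore, the answer is"
    )
    return answer.strip()
```

Under this framing, comparing reasoning strategies across models amounts to sweeping (model, trigger phrase) pairs over the benchmark questions and scoring the extracted answers.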
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities...
Large language models (LLMs) have achieved remarkable advancements in the field of natural language ...
How can prompting a large language model like GPT-3 with explanations improve in-context learning? W...
Large language models (LLMs) can perform complex reasoning by generating intermediate reasoning step...
Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results ac...
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language proce...
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- signific...
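The few-shot variant referenced in the abstract above supplies worked exemplars whose answers spell out the intermediate steps, so the model imitates the step-by-step format. A minimal illustrative sketch; the tennis-ball exemplar is the canonical one popularized by Wei et al. (2022), while the helper function is an assumption for exposition.

```python
# Sketch of few-shot chain-of-thought prompt construction.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    # Worked exemplars come first, so the model continues in the same step-by-step style.
    return COT_EXEMPLAR + f"Q: {question}\nA:"

if __name__ == "__main__":
    print(build_cot_prompt(
        "The cafeteria had 23 apples. If they used 20 to make lunch "
        "and bought 6 more, how many apples do they have?"
    ))
```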
Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought pro...
Large language models (LLMs) have a substantial capacity for high-level analogical reasoning: reprod...
Most existing chain-of-thought (CoT) prompting methods suffer from the issues of generalizability an...
Language models (LMs) with fewer than 100B parameters are known to perform poorly on chain-of-thought...
Large language models (LMs) beyond a certain scale demonstrate the emergent capability of generatin...
Knowledge-augmented deep learning refers to a paradigm in which domain knowledge is ide...
Large language models (LLMs), such as GPT-3.5 and GPT-4, have greatly advanced the performance of ar...
Chain-of-Thought (CoT) is a technique that guides Large Language Models (LLMs) to decompose complex ...
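Since the final abstract above characterizes CoT as guiding LLMs to decompose complex tasks, a decomposition-style sketch may help fix ideas. This follows the spirit of least-to-most prompting, a named technique distinct from plain CoT; the prompt templates and the `query_model` callable are illustrative assumptions.

```python
# Sketch of decomposition-style CoT prompting (least-to-most flavor).
# `query_model` is a hypothetical stand-in for any LLM completion call.

def decompose_and_solve(question: str, query_model) -> str:
    # Step 1: ask the model to break the problem into simpler sub-questions.
    plan = query_model(
        f"Break the following problem into simpler sub-questions, one per line:\n{question}"
    )
    sub_questions = [line.strip() for line in plan.splitlines() if line.strip()]

    # Step 2: answer each sub-question in order, feeding earlier answers back in.
    context = f"Problem: {question}\n"
    for sub in sub_questions:
        answer = query_model(f"{context}\nSub-question: {sub}\nAnswer:")
        context += f"\nSub-question: {sub}\nAnswer: {answer.strip()}"

    # Step 3: ask for the final answer given all intermediate results.
    return query_model(f"{context}\n\nTherefore, the final answer to the problem is").strip()
```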