Large language models (LLMs), such as GPT-3.5 and GPT-4, have greatly advanced the performance of artificial systems on various natural language processing tasks to human-like levels. However, their generalisation and robustness in logical reasoning remain under-evaluated. To probe this ability, we propose three new logical reasoning datasets named "ReClor-plus", "LogiQA-plus" and "LogiQAv2-plus", each featuring three subsets: the first with randomly shuffled options, the second with the correct choices replaced by "none of the other options are correct", and the third combining the previous two. We carry out experiments on these datasets with both discriminative and generative LLMs and show that these simple tricks greatly hin...
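The three subset constructions described in this abstract can be illustrated with a minimal sketch. It assumes each multiple-choice item is a list of option strings plus the index of the correct answer; the function names, the item layout, and the order in which the two perturbations are combined (shuffle first, then replace) are illustrative assumptions, not the authors' released code.

```python
import random

# Replacement text quoted from the abstract.
NONE_OPTION = "none of the other options are correct"


def shuffle_options(options, answer_idx, rng):
    """Subset 1: randomly reorder the options and track the correct answer's new index."""
    order = list(range(len(options)))
    rng.shuffle(order)
    new_options = [options[i] for i in order]
    return new_options, order.index(answer_idx)


def replace_correct(options, answer_idx):
    """Subset 2: overwrite the correct option with the 'none of the other options' text."""
    new_options = list(options)  # copy so the original item is left untouched
    new_options[answer_idx] = NONE_OPTION
    return new_options, answer_idx


def shuffle_and_replace(options, answer_idx, rng):
    """Subset 3: apply both perturbations (the ordering here is an assumption)."""
    shuffled, new_idx = shuffle_options(options, answer_idx, rng)
    return replace_correct(shuffled, new_idx)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the perturbed subsets are reproducible
    item = ["option A", "option B", "option C", "option D"]
    print(shuffle_options(item, 2, rng))
    print(replace_correct(item, 2))
    print(shuffle_and_replace(item, 2, rng))
```

In each case the gold label still points at the same logical answer, so any drop in accuracy on the perturbed subsets reflects sensitivity to option position or surface wording rather than a change in the underlying question.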
Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capabi...
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities...
Abstract reasoning is a key ability for an intelligent system. Large language models achieve above-c...
Recently, large language models (LLMs), including notable models such as GPT-4 and burgeoning commun...
Logical reasoning consistently plays a fundamental and significant role in the domains of knowledge ...
Combining large language models with logical reasoning enhances their capacity to address problems in...
Large language models (LLMs) have gained enormous attention from both academia and industry, due to ...
Logical reasoning remains a pivotal component within the realm of artificial intelligence. The recen...
Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought pro...
The development of highly fluent large language models (LLMs) has prompted increased interest in ass...
Large language models (LLMs) have a substantial capacity for high-level analogical reasoning: reprod...
The derivation of mathematical results in specialised fields, using Large Language Models (LLMs), is...
Large language models (LLMs) have significantly advanced the field of natural language processing, w...
Emergent chain-of-thought (CoT) reasoning capabilities promise to improve performance and explainabi...
The impressive recent performance of large language models has led many to wonder to what extent the...