Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on human-authored problems, even solving some competitive-programming problems. Self-play has proven useful in games such as Go, and thus it is natural to ask whether LMs can generate their own instructive programming problems to improve their performance. We show that it is possible for an LM to synthesize programming problems and solutions, which are filtered for correctness by a Python interpreter. The LM's performance is then seen to improve when it is fine-tuned on its own synthetic problems and verified solutions; thus the model 'improves itself' using the Python interpreter. Problems are specified formally as programming puzzles [Schuster et...
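To make the verify-and-filter step concrete, here is a minimal sketch in Python, assuming puzzles follow the f/g convention of Schuster et al.: a puzzle is a function f(answer) that returns True exactly when the answer is correct, and a candidate solution is a function g() that produces an answer. The sampler `sample_from_lm` is a hypothetical placeholder for the model-sampling code, and the whole sketch is an illustration of the filtering idea, not the paper's released implementation.

```python
# Minimal sketch of interpreter-based filtering for LM-generated puzzles.
# Assumes the f/g puzzle convention; `sample_from_lm` is a hypothetical
# placeholder for whatever code draws (puzzle, solution) source strings
# from the language model. A real harness would also sandbox execution
# and enforce a time limit.

def verify(puzzle_src: str, solution_src: str) -> bool:
    """Return True iff f(g()) executes without error and evaluates to True."""
    env: dict = {}
    try:
        exec(puzzle_src, env)    # defines f(answer) -> bool
        exec(solution_src, env)  # defines g() -> answer
        return env["f"](env["g"]()) is True
    except Exception:
        return False  # syntax errors, runtime errors, wrong answers: all rejected


def build_finetuning_set(sample_from_lm, n_samples: int) -> list[tuple[str, str]]:
    """Keep only the (puzzle, solution) pairs the interpreter verifies."""
    verified = []
    for _ in range(n_samples):
        puzzle_src, solution_src = sample_from_lm()
        if verify(puzzle_src, solution_src):
            verified.append((puzzle_src, solution_src))
    return verified


# Example: a toy puzzle passes the filter only with a correct solution.
assert verify("def f(x): return x * x == 1764", "def g(): return 42")
assert not verify("def f(x): return x * x == 1764", "def g(): return 41")
```

Because correctness is checked by execution rather than by the model's own judgment, only interpreter-verified pairs enter the fine-tuning set, which is what lets the model improve on its own generations.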
Current approaches to program synthesis with Large Language Models (LLMs) exhibit a "near miss syndrome...
We study whether language models can evaluate the validity of their own claims and predict which questions...
Large language models are becoming increasingly practical for translating code across programming languages...
Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical mult...
This paper systematically investigates the generation of code explanations by Large Language Models ...
The advent of large language models trained on code (code LLMs) has led to significant progress in l...
Few-shot learning with large-scale, pre-trained language models is a powerful way to answer questions...
Large language models have demonstrated outstanding performance on a wide range of tasks such as que...
Program synthesis or code generation aims to generate a program that satisfies a problem specification...
In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension a...
High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collection...
Through their transfer learning abilities, highly-parameterized large pre-trained language models have...
Despite the success of large pre-trained language models (LMs) such as Codex, they show below-par performance...
Recently, scores of high-performing code generation systems have surfaced. As has become a popular c...
Humans understand language by extracting information (meaning) from sentences, combining it with existing...