The advent of large language models trained on code (code LLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine LLM decoding with sample pruning and reranking using test cases or heuristics based on the execution results. However, test cases are difficult to obtain for many real-world language-to-code applications, and heuristics cannot adequately capture the semantic features of the execution results, such as data type and value range, which often indicate whether the program is correct. In this work, we propose LEVER, a simple approach to improve language-to-code generation by learning to verify the generated programs with their execution results. Specifically, we train verif...
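To make the pruning-and-reranking baseline mentioned above concrete, here is a minimal Python sketch of an execution-based heuristic: candidates that fail to run are pruned, and the rest are grouped by execution result and ranked by their summed generation probability. This illustrates the general heuristic family only, not LEVER's learned verifier, whose details are cut off in the abstract. The function names, the convention that each program stores its output in a variable called `answer`, and the example probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def run_candidate(source):
    """Execute one candidate program and return (ok, result_repr).

    Assumption: a candidate reports its output by assigning to `answer`.
    Note: exec() is unsafe for untrusted code; a real system would sandbox this.
    """
    namespace = {}
    try:
        exec(source, namespace)
        return True, repr(namespace["answer"])
    except Exception:
        # Runtime errors and a missing `answer` both count as failures.
        return False, None


def rerank_by_execution(candidates):
    """Prune failing candidates, then rank execution results by total probability.

    `candidates` is a list of (source, probability) pairs.
    Returns a list of (total_probability, result_repr, best_source),
    sorted from highest to lowest total probability.
    """
    groups = defaultdict(list)
    for source, prob in candidates:
        ok, result = run_candidate(source)
        if not ok:
            continue  # pruning step: drop programs that do not execute
        groups[result].append((prob, source))

    ranked = []
    for result, members in groups.items():
        total = sum(prob for prob, _ in members)
        best_source = max(members)[1]  # most probable program in the group
        ranked.append((total, result, best_source))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


# Hypothetical samples for "sum the integers from 1 to 10".
candidates = [
    ("answer = sum(range(1, 11))", 0.30),
    ("answer = 10 * 11 // 2", 0.25),
    ("answer = sum(range(10))", 0.35),      # off-by-one, but most probable alone
    ("answer = undefined_name + 1", 0.10),  # raises NameError, so it is pruned
]
ranked = rerank_by_execution(candidates)
```

In this toy example the two programs that agree on the same result outrank the single most probable sample, which is the effect such aggregation heuristics aim for. A heuristic of this kind only compares results for equality; it does not look at semantic features of the results such as data type or value range, which is the limitation the abstract points out.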
High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collec...
Reasoning over natural language is a long-standing goal for the research community. However, studies...
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural langua...
Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on hum...
Generative models of code, pretrained on large corpora of programs, have shown great success in tran...
Synthesizing inductive loop invariants is fundamental to automating program verification. In this wo...
Program synthesis or code generation aims to generate a program that satisfies a problem specificati...
The increasing popularity of large language models (LLMs) has paved the way for their application in...
In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension a...
With the growing popularity of large language models (e.g., GitHub Copilot and ChatGPT) in softwar...
Large language models are becoming increasingly practical for translating code across programming la...
Unit tests play a key role in ensuring the correctness of software. However, manually creating unit ...
One of the critical phases in software development is software testing. Testing helps with identifyi...
Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical mult...