Deep Neural Networks (DNNs) are increasingly used in software engineering and code intelligence tasks. These powerful tools can learn highly generalizable patterns from large datasets through their millions of parameters. At the same time, their large capacity can make them prone to memorizing individual data points. Recent work suggests that this memorization risk manifests especially strongly when the training dataset is noisy, involving many ambiguous or questionable samples, so that memorization is the only recourse. The goal of this paper is to evaluate and compare the extent of memorization and generalization in neural code intelligence models. It aims to provide insights on how memorization may impact the learning behavior o...
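To make the noisy-data framing above concrete, the sketch below illustrates one common way such memorization is probed: a fraction of training labels is deliberately corrupted, and the degree to which an over-parameterized model still fits those corrupted labels is taken as a proxy for memorization. The synthetic dataset, model size, and 30% noise rate are assumptions for illustration only and do not reflect this paper's actual experimental setup.

# Minimal sketch (assumed setup, not the paper's protocol): inject label noise
# into a synthetic classification task and check how well an over-parameterized
# MLP fits the corrupted subset. Fitting labels that carry no learnable signal
# is only possible through memorization.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for "code features": 1000 samples, 32 features, 4 classes.
X = torch.randn(1000, 32)
y = (X[:, 0] > 0).long() * 2 + (X[:, 1] > 0).long()  # labels learnable from X

# Corrupt 30% of the labels to mimic ambiguous or questionable samples.
noisy_idx = torch.rand(len(y)) < 0.3
y_noisy = y.clone()
y_noisy[noisy_idx] = torch.randint(0, 4, (int(noisy_idx.sum()),))

model = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Full-batch training until the model fits the (partly corrupted) labels.
for _ in range(300):
    opt.zero_grad()
    loss = loss_fn(model(X), y_noisy)
    loss.backward()
    opt.step()

with torch.no_grad():
    pred = model(X).argmax(dim=1)
    fit_clean = (pred[~noisy_idx] == y_noisy[~noisy_idx]).float().mean()
    fit_noisy = (pred[noisy_idx] == y_noisy[noisy_idx]).float().mean()

# A high fit on the corrupted subset indicates memorization, since those
# labels are random with respect to the input features.
print(f"fit on clean-label subset: {fit_clean:.2f}")
print(f"fit on noisy-label subset: {fit_noisy:.2f}")

Extending the same probe to a held-out split would separate generalization (accuracy on unseen, clean data) from memorization (fit on corrupted training labels).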
“Deep learning” uses Post-Selection—selection of a model after training multiple models using data. ...
Deep learning research has recently witnessed an impressively fast-paced progress in a wide range of...
This electronic version was submitted by the student author. The certified thesis is available in th...
With the prevalence of publicly available source code repositories to train deep neural network mode...
The abundance of publicly available source code repositories, in conjunction with the advances in ne...
The understanding of generalization in machine learning is in a state of flux. This is partly due to...
Over-parameterized deep neural networks (DNNs) with sufficient capacity to memorize random noise can...
Context: With the prevalence of publicly available source code repositories to train deep neural net...
Over-parameterized neural models have become dominant in Natural Language Processing. Increasing the...
Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an important task that of...
Despite the recent trend of developing and applying neural source code models to software engineerin...
Deep Learning (read: neural networks) has emerged as one of the most exciting and powerful tools in t...
State-of-the-art pre-trained language models have been shown to memorise facts and perform well wi...
This research demonstrates a method of discriminating the numerical relationships of neural network ...
Increasing the size of overparameterized neural networks has been shown to improve their generalizat...