A model of the lifetime of individual source code lines or tokens can estimate maintenance effort, guide preventive maintenance, and, more broadly, identify factors that can improve the efficiency of software development. We present methods and tools that allow tracking of each line's or token's birth and death. Through them, we analyze 3.3 billion source code element lifetime events in 89 revision control repositories. Statistical analysis shows that code lines are durable, with a median lifespan of about 2.3 years, and that young lines are more likely to be modified or deleted, following a Weibull distribution with the associated hazard rate decreasing over time. This behavior appears to be independent of specific characteris...
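To make the Weibull claim concrete, the following minimal sketch computes the survival and hazard functions of a Weibull lifetime model whose median matches the reported 2.3 years. The shape parameter `ASSUMED_SHAPE = 0.5` is a hypothetical value chosen only for illustration (the abstract does not state the fitted parameters); any shape below 1 yields a hazard rate that decreases with age. The function names are likewise illustrative and do not come from the paper's tooling.

```python
import math

MEDIAN_YEARS = 2.3    # median line lifespan reported in the abstract
ASSUMED_SHAPE = 0.5   # hypothetical shape k < 1; NOT a value from the paper


def weibull_scale_from_median(median, shape):
    """Return the scale lam such that S(median) = 0.5.

    For a Weibull distribution, median = lam * (ln 2) ** (1 / shape).
    """
    return median / math.log(2) ** (1.0 / shape)


def weibull_survival(t, shape, scale):
    """Probability that a line is still alive at age t: exp(-(t/lam)^k)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Instantaneous rate of change or deletion at age t (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


scale = weibull_scale_from_median(MEDIAN_YEARS, ASSUMED_SHAPE)

# Print survival and hazard at a few ages (in years).
for age in (0.1, 1.0, 2.3, 5.0):
    s = weibull_survival(age, ASSUMED_SHAPE, scale)
    h = weibull_hazard(age, ASSUMED_SHAPE, scale)
    print(f"age={age:4.1f}y  survival={s:.3f}  hazard={h:.3f}/year")
```

Under these assumptions, half of the lines survive to the 2.3-year mark by construction, and the hazard is highest for the youngest lines, which matches the qualitative behavior the abstract describes.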
It is well known that maintenance is the most expensive stage of the software life cycle. Most large...
Many prediction models rely on past data about how a system evolves to learn and anticipate the numb...
If anything good can be said to have come from the Year 2000 systems problem, it is that it has crea...
A central feature of the evolution of large software systems is that change -- which is necessary to...
In contrast to physically engineered artefacts, software does not deteriorate through use. Code qual...
An anti-pattern is a commonly occurring solution that will always have negative consequences, when a...
Preprint of paper published in: 16th European Conference on Software Maintenance and Reengineering (...
Code cloning is a controversial software engineering practice due to contradictory claims regarding ...
The data files and source code available here (68GB uncompressed) have been used for studying the ev...
We study the evolution of the largest known corpus of publicly available source code, i.e., the Soft...
Multiple studies found that developer questions about the history of code were among the hardest and...
Maintainability is a desirable property of software, and a variety of metrics have been pro...
There are several previous studies in which machine learning algorithms are used to predict how faul...
Code reading is one of the most frequent activities in software maintenance. Such an activity aims a...