The popularity of the Python programming language has surged in recent years due to its increasing use in Data Science. The availability of Python repositories on GitHub presents an opportunity for mining software repository research, e.g., suggesting best practices for developing Data Science applications, identifying bug patterns, recommending code enhancements, etc. To enable this research, we have created a new dataset of 1,558 mature GitHub projects that develop Python software for Data Science tasks. By analyzing the metadata and code, we selected projects that use a diverse set of machine learning libraries and are managed by a variety of users and organizations. The dataset is made publicly available...
This repository contains the dataset of the manuscript: "An Empirical Study on the Usage and Availa...
The article is devoted to the experience of creating a specialized tool for Data Mining, built on th...
Machine learning is becoming an increasingly important part of many domains, both inside and outside...
Mining software repositories provides developers and researchers a chance to learn from previous dev...
Python programming for Data Scientists. Preface: Python programming language is an open source progr...
In today’s software-centric world, ultra-large-scale software repositories, e.g. SourceForge, GitHub...
This paper introduces a recently published Python data mining book (chapters, topics, sample...
In today’s software-centric world, ultra-large-scale software repositories, e.g. SourceForge, GitHub...
Programming language researchers often study real-world projects to see how language features have b...
Data Science with Python will help you get comfortable with using the Python environment for data sc...
In the thesis we compare the systems for data mining that have an interface in the programming langu...
This paper introduces a recently published Python data mining book (chapters, topics, samples of Pyt...
Mining source code has become a common task for researchers and yielded significant benefits for th...
Today’s software development environment has migrated to online software repositories due to the nee...
Data Science Projects with Python will help you get comfortable with using the Python environment fo...