This paper proposes an approach for using visual data profiling in tabular data cleaning and transformation processes. Visual data profiling is the statistical assessment of datasets to identify and visualize potential quality issues. The proposed approach was implemented in a software prototype and empirically validated in a usability study to determine how useful visual data profiling is and how easy data scientists find it to use. The study involved 24 users in a comparative usability test and 4 expert reviewers in cognitive walkthroughs. The evaluation results show that users find visual data profiling capabilities useful and easy to use in the process of data cleaning and transformation.
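To make the idea of statistically assessing a column and surfacing potential quality issues concrete, the following is a minimal sketch in Python. It is not the prototype described in the paper. The function names `profile_column` and `bar`, the treatment of `None` and empty strings as missing, and the 1.5 × IQR outlier rule are all assumptions chosen for illustration.

```python
from statistics import quantiles


def profile_column(values):
    """Summarise one column: missing, distinct, repeated, and outlying values.

    Assumptions (not from the paper): None and "" count as missing, and
    numeric outliers are flagged with the common 1.5 * IQR rule.
    """
    present = [v for v in values if v is not None and v != ""]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]

    outliers = []
    if len(numeric) >= 4:
        # Quartiles of the numeric values; the middle cut point is unused.
        q1, _, q3 = quantiles(numeric, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        outliers = [v for v in numeric if v < low or v > high]

    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        # Number of present values that repeat an earlier value.
        "repeated": len(present) - len(set(present)),
        "outliers": outliers,
    }


def bar(share, width=20):
    """Render a share between 0 and 1 as a fixed-width text bar."""
    filled = round(max(0.0, min(1.0, share)) * width)
    return "#" * filled + "." * (width - filled)


if __name__ == "__main__":
    # Hypothetical column with two missing cells and one extreme value.
    column = [10, 12, 11, 13, 12, 11, 10, 13, None, "", 500]
    report = profile_column(column)
    print(report)
    print("missing", bar(report["missing"] / report["total"]))
```

A text bar stands in here for the charts a visual profiling tool would draw. The point of the sketch is only that each visual cue is backed by a simple per-column statistic.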
The volume of data being published on the Web and made available as Open Data has significantly incr...
Bibliometric methods depend heavily on the quality of data, and cleaning and disambiguating data are...
Data cleaning is an action which includes a process of correcting and identifying the inconsistencie...
In this paper, we propose a tool that implements visual data profiling capabilities for data prepara...
Data quality management, especially data cleansing, has been extensively studied for many years in t...
In spite of advances in technologies for working with data, analysts still spend an inordinate amoun...
Cleaning data (i.e., making sure data contains no errors) can take a large part of a project’s lifet...
Reviewed by Mário Silva. Data cleaning and Extract-Transform-Load processes are usually modeled as gra...
Data quality issues such as missing, erroneous, extreme and duplicate values undermine analysis and ...
Today, data plays an important role in people’s daily activities. With the help of some database app...
The goal of this thesis is to research the role of data profiling in data quality management and the ...
Abstract: Research on data quality is growing in importance in both industrial and academic communit...
The data processing results are commonly displayed in a dashboard with various graphic visualization...
Large databases that have grown over the years are a persistent concern in the field of data quality. Data...
The availability of a large amount of data facilitates spreading a data-driven culture in which data...