When users combine data from multiple sources into a spreadsheet or dataset, the result is often a mishmash of different formats, since phone numbers, dates, course numbers and other string-like kinds of data can each be written in many different formats. Although spreadsheets provide features for reformatting numbers and a few specific kinds of string data, they do not provide any support for the wide range of other kinds of string data encountered by users. We describe a user interface where a user can describe the formats of each kind of data. We provide an algorithm that uses these formats to automatically generate reformatting rules that transform strings from one format to another. In effect, our system enables users to create a small...
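The idea of deriving a reformatting rule from two user-supplied format descriptions can be illustrated with a minimal Python sketch. It is not the system described in the abstract: the `Format` class, the `make_rule` helper, and the two phone-number formats are hypothetical names introduced only for this example, and a regular expression with named groups stands in for whatever format notation the actual interface uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Format:
    """Hypothetical description of one way to write a kind of data."""
    name: str
    pattern: str   # regex whose named groups capture the meaningful parts
    template: str  # str.format template that reuses the same group names


def make_rule(source: Format, target: Format):
    """Return a function that rewrites strings from `source` into `target`."""
    compiled = re.compile(source.pattern)

    def rule(text: str) -> str:
        match = compiled.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"{text!r} is not in format {source.name!r}")
        # Re-emit the captured parts using the target format's layout.
        return target.template.format(**match.groupdict())

    return rule


# Two assumed formats for the same kind of data (US-style phone numbers).
PAREN = Format(
    name="paren",
    pattern=r"\((?P<area>\d{3})\) ?(?P<prefix>\d{3})-(?P<line>\d{4})",
    template="({area}) {prefix}-{line}",
)
DOTTED = Format(
    name="dotted",
    pattern=r"(?P<area>\d{3})\.(?P<prefix>\d{3})\.(?P<line>\d{4})",
    template="{area}.{prefix}.{line}",
)

to_dotted = make_rule(PAREN, DOTTED)
to_paren = make_rule(DOTTED, PAREN)

print(to_dotted("(412) 555-0134"))  # 412.555.0134
print(to_paren("412.555.0134"))     # (412) 555-0134
```

Because each format carries both a recognizer and a layout, a rule between any pair of formats for the same kind of data falls out of `make_rule` without writing the conversion by hand, which is the property the abstract attributes to its generated rules.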
The number of end-users who write spreadsheet programs is at least an order of magnitude larger tha...
These are a collection of XLSX sheets containing some of my favorite Excel tricks to reformat data t...
Text collections can be represented mathematically as term-document matrices. A term-document matri...
End-user programming tools offer no data types except “string” for many categories of data, such as...
Spreadsheets are widely used in industry. It is estimated that end-user programmers outnumber reg...
Format transformation is one of the most labor intensive tasks of a data wrangling process. Recent a...
We address the problem of performing semantic transformations on strings, which may represent a var...
Millions of computer end users need to perform tasks over large spreadsheet data, yet lack ...
Data restructuring is often an integral but non-trivial part of information processing, especially w...
Raw data set for our paper entitled: The high resource impact of reformatting requirements for scien...
Numbers are one of the most widely used data types in programming languages. Number transf...
Motivation: A challenge for researchers at CBCS is the ability to efficiently manage the different da...
© 2016 IEEE. Data wrangling is the term used by data scientists for the work of re-organising data in...
Spreadsheets are used by millions of knowledge workers as a routine all-purpose tool for the storage...