We address the problem of learning a syntactic profile for a collection of strings, i.e., a set of regex-like patterns that succinctly describe the syntactic variations in the strings. Real-world datasets, typically curated from multiple sources, often contain data in various syntactic formats. Thus, any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the different formats is infeasible in standard big-data scenarios. Prior techniques are restricted to a small set of pre-defined patterns (e.g., digits, letters, words) and provide no control over the granularity of profiles. We define syntactic profiling as a probl...
Representation of syntactic structure is a core area of research in Computational Linguistics, disam...
After a brief presentation of the data model, we describe a work in progress to define an initial se...
We motivate the need for dataset profiling in the context of evaluation, and show that textual datas...
© 2018 IEEE. Many database columns contain string or numerical data that conforms to a pattern, such...
Repetitive tasks are most often tedious; in order to facilitate their executio...
Parsers – programs that extract structure from strings – are fundamental components of many software...
Datalog has witnessed promising applications in a variety of domains. We propose a programming-by-ex...
We present a method for profiling programs that are written using domain-specific languages. Instead...
Maintenance programming tasks often require answering questions such as "how is this data struc...
To integrate the benefits of statistical methods into syntactic pattern recognition, a Bridging Appr...
We explore deep clustering of multilingual text representations for unsupervised model interpretatio...
Syntactic databases are increasingly available and are put to a variety of uses, including serving a...