Web tables are essential for applications such as data analysis. However, they are often incomplete and lack critical information, which makes their content hard to interpret. Automatically predicting column types for tables without metadata is therefore important for handling the wide variety of tables found on the Internet. This paper proposes CNN-Text, a method for this task that fuses CNN prediction with voting processes. We present data augmentation and synthetic column generation approaches to improve the CNN’s performance, and we use extracted text to obtain better predictions. The experimental results show that CNN-Text outperforms the baseline methods, demonstrating that CNN-Text is well qualified for the table colu...
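To make the idea of fusing per-cell predictions through voting more concrete, here is a minimal sketch. It is not the paper's implementation: the function name `vote_column_type`, the input format (one dictionary of label scores per cell) and the tie-breaking rule (summed scores) are all assumptions made for illustration, and the scores stand in for whatever a CNN classifier would output.

```python
from collections import Counter
from typing import Dict, List


def vote_column_type(cell_predictions: List[Dict[str, float]]) -> str:
    """Pick one type for a column from per-cell label scores.

    Hypothetical input format: one dict per cell, mapping label -> score.
    """
    if not cell_predictions:
        raise ValueError("need at least one cell prediction")

    # Each cell votes for its highest-scoring label.
    votes = Counter(max(scores, key=scores.get) for scores in cell_predictions)

    # Collect every label that received the maximum number of votes.
    top_count = max(votes.values())
    tied = [label for label, count in votes.items() if count == top_count]
    if len(tied) == 1:
        return tied[0]

    # Assumed tie-break: prefer the label with the largest summed score.
    totals = {
        label: sum(scores.get(label, 0.0) for scores in cell_predictions)
        for label in tied
    }
    return max(totals, key=totals.get)


# Illustrative scores only; these are not results from the paper.
example = [
    {"city": 0.7, "person": 0.3},
    {"city": 0.6, "person": 0.4},
    {"city": 0.2, "person": 0.8},
]
print(vote_column_type(example))
```

Majority voting is only one plausible fusion rule; averaging the score vectors before taking the top label would be an equally simple alternative.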
Weblogs and other platforms used to organize a social life online have achieved an enormou...
Predicting which entities are likely to be mentioned in scientific articles is a task with significa...
Assigning the submitted text to one of the predetermined categories is required when dealing with ap...
The usefulness of tabular data such as web tables critically depends on understanding their semantic...
Automatically annotating column types with knowledge base (KB) concepts is a critical task to gain a...
We propose a new deep neural network architecture, TabNet, for table type classification. Table type...
Natural language interfaces to databases (NLIDB) have been a research topic for a decade. Significant...
Text prediction is the task of suggesting text while the user is typing. Its main aim is to reduce t...
There is an increasing amount of text data available on the web with multiple topical granularities;...
Text classification is a fundamental language task in Natural Language Processing. A variety of sequ...
Linked Open Data (LOD) and social media often contain the representations of the same real-world ent...
As more and more data are generated in daily life, traditional data analysis methods reach their bot...
Relational Web tables have become an important resource for applications such as factual search and ...
This study investigates the value added by incorporating textual data into cus...
This work concerns the processing of a corpus made up of a financial weekly column. Specifically, we...