As with many text understanding and generation tasks, pre-trained language models have emerged as a powerful approach for extracting information from business documents. However, their performance has not been properly studied in the data-constrained settings that are often encountered in industrial applications. In this paper, we show that LayoutLM, a pre-trained model recently proposed for encoding 2D documents, exhibits high sample efficiency when fine-tuned on public and real-world Information Extraction (IE) datasets. Indeed, LayoutLM reaches more than 80% of its full performance with as few as 32 documents for fine-tuning. When compared with a strong baseline learning IE from scratch, the pre-trained model needs ...
Nowadays, voluminous and unstructured textual data is found on the Internet that could provide varie...
This thesis proposes a joint Information-Extraction and Classification model for document analysis i...
The field of service automation is progressing rapidly, and increasingly complex tasks are being aut...
Transformer-based Language Models are widely used in Natural Language Processing related tasks. Than...
Key information extraction (KIE) from document images requires understanding the contextual and spat...
The predominant approaches for extracting key information from documents resor...
Extracting information from documents usually relies on natural language processing methods working ...
The task of data-to-text generation amounts to describing structured data, such as RDF triples, in f...
In this paper, we examine an important recent rule-based information extraction (IE) technique named...
This chapter presents a model for knowledge extraction from documents written in natural language. T...
Due to the massive and increasing amount of documents received each day and the number of steps to p...
Information extraction (IE) plays a significant role in automating the knowledge acquisition process...