Newspapers are documents made up of news items and informative articles. They are not meant to be read linearly: readers can pick items in any order they fancy. Ignoring this structural property, most digitized newspaper archives offer access to their content only by issue or, at best, by page. We have built a digitization workflow that automatically extracts newspaper articles from images, which allows indexing and retrieval of information at the article level. Our back-end system extracts the logical structure of the page to produce the informative units: the articles. Each image is labelled at the pixel level through a machine-learning-based method, then the page's logical structure is built up from there by...
In recent decades, major efforts to digitize historical documents led to the creation of large machi...
Abstract. Many historical newspapers are being digitized. We aim to support access to them via text ...
This work introduces a practical method for performing logical layout analysis on heterogeneous peri...
Mass digitization and the opening of digital libraries gave access to a huge amount of historical ne...
Digitisation projects preserve and make available vast quantities of historical text. Among these, n...
Background. In recent years, libraries and archives led important digitisation...
Background. In recent years, libraries and archives led important digitisation campaigns that opened ...
In the analysis of a newspaper page an important step is the clustering of various text blocks into ...
Abstract. Digital preservation of newspaper archives aims both at the salvation of endangered materi...
Background. In recent years, libraries and archives led important digitisation campaigns that opened...
The massive amounts of digitized historical documents acquired over the last decades naturally lend ...
In recent years, libraries and archives led important digitisation campaigns that opened the access ...
The primary information units in a newspaper are the articles. Article reconstruction from newspaper...
Abstract. This paper presents a novel learning-based framework to extract articles from newspaper im...
Digitization of newspapers is of interest for many reasons including preservation of history, access...