Automatic text alignment is an important problem in natural language processing. It can be used to create the data needed to train different language models. Most research on automatic summarization focuses on news articles or scientific papers, which are relatively short texts with a simple, clear structure. The larger the difference in size between the summary and the original text, the harder the problem becomes, since the important information is sparser and identifying it is more difficult. Therefore, creating datasets from larger texts can help improve automatic summarization. In this project, we try to develop an algorithm which can automatically create a dataset for abstractive automatic summarization for bi...
Books are a rich source of both fine-grained information, how a character, an object or a scene look...
Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2016. Today we encounter la...
Automatic summarization is the process of presenting the contents of written documents in a short, c...
We consider the unsupervised alignment of the full text of a book with a human-written summary. This...
Abstract. Aligning texts and their multi-document summaries is the task of determining the correspon...
Current research in automatic single-document summarization is dominated by two effective, yet naïv...
The technology of summarizing documents automatically is increasing rapidly and may give an answer f...
In this paper we describe an alignment system that aligns English-Hindi texts at the sentence and wo...
In this paper we describe a statistical technique for aligning sentences with their translations in...
We address the problem of sentence alignment for monolingual corpora, a phenomenon distinct from ali...
Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales...
Comparable or parallel corpora are beneficial for many NLP tasks. The automatic collection of corpo...
Humans spend a large amount of time listening, watching, and reading stories. We argue that the abil...
Among the well-known accessibility services for audiovisual media are subtitling for the deaf and ha...