Early work in the computational treatment of natural language focused on summarization and machine translation. In my research I have concentrated on the summarization of documents in different languages. This thesis presents my work on multi-lingual text similarity, which enables the identification of short units of text (usually sentences) that contain similar information even though they are written in different languages. I present SimFinderML, a framework for multi-lingual text similarity computation that makes it easy to experiment with similarity parameters and to add support for other languages. An in-depth examination and evaluation of the system is performed using Arabic and English data. I a...
We present a comprehensive study of computing similarity between texts. We start from the observatio...
A measure of similarity is required to find and compare cross-lingual articles concerning a specific...
Wikipedia has been used as a source of comparable texts for a range of tasks, such as Statistical Ma...
We present a new approach for summarizing topically clustered documents from two sources, English an...
We present a new approach for summarizing clusters of documents on the same event, some of which are...
We present a new composite similarity metric that combines information from multiple linguistic indi...
The recent advances in multimedia and web-based applications have eased the accessibility to large c...
Similarities for textual data. The evaluation of similarities between textual entities (do...
Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Jo...
The technology of summarizing documents automatically is increasing rapidly and may give an answer f...
Shared Task 1 at SemEval-2017 deals with assessing the semantic similarity between sentences, either...
Document similarity is basic for Information Retrieval. Cross Lingual (CL) similarity is important f...
Quantifying the similarity or dissimilarity between documents is an important task in authorship att...
This paper describes the use of second order similarities for identifying simi...