Abstract. In this paper, we investigate strategies for automatically classifying documents in different languages thematically, geographically or according to other criteria. A novel linguistically motivated text representation scheme is presented that can be used with machine learning algorithms in order to learn classifications from pre-classified examples and then automatically classify documents that might be provided in entirely different languages. Our approach makes use of ontologies and lexical resources but goes beyond a simple mapping from terms to concepts by fully exploiting the external knowledge manifested in such resources and mapping to entire regions of concepts. For this, a graph traversal algorithm is used to explore rela...
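The idea of mapping terms to whole regions of concepts, rather than to single concepts, can be illustrated with a minimal sketch. The abstract is cut off before it describes the traversal, so everything here is an assumption made for illustration and not the paper's algorithm: the toy lexicon `LEXICON`, the toy ontology `RELATIONS`, the function name `concept_features`, the breadth-first traversal, and the `max_depth` and `decay` parameters are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy lexicon: surface terms in two languages -> concept IDs.
LEXICON = {
    "dog": "c_dog", "hund": "c_dog",
    "cat": "c_cat", "katze": "c_cat",
}

# Hypothetical toy ontology: concept -> directly related (broader) concepts.
RELATIONS = {
    "c_dog": ["c_mammal"],
    "c_cat": ["c_mammal"],
    "c_mammal": ["c_animal"],
    "c_animal": [],
}


def concept_features(tokens, max_depth=2, decay=0.5):
    """Map tokens to weighted concepts, spreading weight to nearby concepts.

    Assumed rule: breadth-first traversal from each matched concept,
    adding decay**depth to every concept reached within max_depth steps.
    """
    features = defaultdict(float)
    for token in tokens:
        start = LEXICON.get(token.lower())
        if start is None:
            continue  # term not covered by the lexical resource
        queue = deque([(start, 0)])
        seen = {start}
        while queue:
            concept, depth = queue.popleft()
            features[concept] += decay ** depth
            if depth == max_depth:
                continue  # do not expand beyond the allowed region
            for neighbour in RELATIONS.get(concept, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, depth + 1))
    return dict(features)


english = concept_features(["dog", "cat"])
german = concept_features(["Hund", "Katze"])
```

Because both token lists resolve to the same concepts, `english` and `german` end up as identical feature dictionaries, which is what would let a classifier trained on documents in one language be applied to documents in another.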
6th Italian Information Retrieval Workshop, Cagliari, ITA, 25/05/2015 - 26/05/2015. International aud...
This paper describes our work on developing a language-independent technique for discove...
Thesis (Master's)--University of Washington, 2018. Sometimes, annotating data for text classification ...
This article addresses the question of how to deal with text categorization when the set of document...
The number of multilingual texts in the World Wide Web (WWW) is increasing dramatically and a multil...
Cross-language Text Categorization is the task of assigning semantic classes to documents written i...
Abstract. This article deals with the problem of Cross-Lingual Text Categorization (CLTC), which ari...
This article deals with multilingual document indexing. We propose an indexing...
The patent describes a method and a system for generating classifiers from multilingual corpora incl...
In a multilingual scenario, the classical monolingual text categorization problem can be reformulate...
This paper deals with various methods for multilingual document categorization and informs about the...
Due to the availability of a huge amount of textual data from a variety of sources, user...
Text categorisation is challenging, due to the complex structure with heterogeneous, changing topics...
Due to the globalization on the Web, many companies and institutions need to efficiently organize an...