Automatic Language Identification (LI) or Dialect Identification (DI) of short texts in closely related languages or dialects is one of the primary steps in many natural language processing pipelines. Language identification is considered a solved task in many cases; however, for very closely related languages, or in an unsupervised scenario (where the languages are not known in advance), performance is still poor. In this paper, we propose the Unsupervised Deep Language and Dialect Identification (UDLDI) method, which can simultaneously learn sentence embeddings and cluster assignments from short texts. The UDLDI model understands the sentence constructions of languages by applying attention to character relations, which helps t...
In this work, we present a comprehensive study on the use of deep neural networks (DNNs) for...
Language Identification (LID) is the task of automatically identifying the language of speech signal...
Native language identification (NLI) is the task of determining the native language of an author wri...
This paper describes three approaches to the task of automatically identifying the language a text i...
Language identification is a simple problem that becomes much more difficult when its usual assumpti...
The world is growing more connected through the use of online communication, exposing software and h...
We explore deep clustering of multilingual text representations for unsupervised model interpretatio...
This paper describes the participation of the UAIC team at the LogCLEF 2011 initiative, langua...
This work addresses the use of deep neural networks (DNNs) in automatic language identification (LID...
This article describes an unsupervised language model (LM) adaptation approach that can be used to e...
Language Identification is the process of determining in which natural language the content...
This paper extends the work of Cavnar and Trenkle N-gram text categorization [2], enhances the study...
We present a statistical approach to text-based automatic language identification that focuses on di...
This thesis is a normative study on various approaches within native language identification (NLI), ...