Multilingual posts can affect the outcomes of content analysis on microblog platforms. Language identification addresses this by providing a monolingual set of content for analysis. We find the unedited and idiomatic language of microblogs to be challenging for state-of-the-art language identification methods. To account for this, we identify five microblog characteristics that can help in language identification: the language profile of the blogger (blogger), the content of an attached hyperlink (link), the language profile of other users mentioned in the post (mention), the language profile of a tag (tag), and, if the post we examine is a reply, the language of the original post (conversation). Further, we present methods that com...
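As an illustration of how the five signals listed above might feed into a decision, the following minimal sketch averages per-source language distributions and picks the highest-scoring language. It is not the method of the paper, whose description is cut off here; the weighted-average rule, the function names `combine_scores` and `identify_language`, the `"content"` source key, and the example scores are all assumptions made for this sketch.

```python
from collections import defaultdict


def combine_scores(scores_by_source, weights=None):
    """Weighted average of per-source language distributions.

    scores_by_source maps a source name (e.g. "content", "blogger", "link",
    "mention", "tag", "conversation") to a dict of {language: score}.
    Sources with no evidence (empty or all-zero dicts) are skipped.
    The averaging rule is an assumption, not the paper's method.
    """
    weights = weights or {}
    combined = defaultdict(float)
    total_weight = 0.0
    for source, dist in scores_by_source.items():
        weight = weights.get(source, 1.0)
        norm = sum(dist.values()) if dist else 0.0
        if norm <= 0 or weight <= 0:
            continue
        total_weight += weight
        for lang, score in dist.items():
            # Normalize each source so that it contributes a distribution.
            combined[lang] += weight * score / norm
    if total_weight == 0:
        return {}
    return {lang: s / total_weight for lang, s in combined.items()}


def identify_language(scores_by_source, weights=None):
    """Return the language with the highest combined score, or None."""
    combined = combine_scores(scores_by_source, weights)
    return max(combined, key=combined.get) if combined else None


# Hypothetical example: the post text alone is ambiguous between English
# and Dutch, but the blogger's language profile leans strongly Dutch.
example = {
    "content": {"en": 0.5, "nl": 0.5},
    "blogger": {"en": 0.1, "nl": 0.9},
    "tag": {},  # no evidence from this source
}
print(identify_language(example))
```

Normalizing each source before averaging keeps a source with large raw scores from dominating. The optional `weights` argument lets a caller trust some sources more than others.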
User-generated content has become a recurrent resource for NLP tools and applications, hence many ...
In this paper we describe how Twitter is used in various languages. We observe notable differences b...
Offering access to information in microblog posts requires successful language identification. Langu...
We present an evaluation of “off-the-shelf” language identification systems as applied to microblog...
Microblogging websites, such as Twitter, provide a seemingly endless amount of textual information on ...
Open-source software available (Microblog Explorer: https://github.com/adbar/microblog-explorer) Inte...
Automatic Language Identification (LI) is a widely addressed task, but not all users (for example li...
The popularity of microblogging platforms, such as Twitter, renders them valuable real-time informat...
In recent years Twitter has become one of the largest online microblogging platforms. Microblogging ...
The anthology Microblogs global is an international study of Twitter. Fifteen researchers examined ...