In this paper, we address the problem of finding named entities in very large micropost datasets. We propose methods to generate a sample of representative microposts by discovering tweets that are likely to refer to new entities. Our approach significantly speeds up the semantic analysis process by discarding retweets, tweets without pre-identifiable entities, and similar or redundant tweets, while retaining information content. We apply the approach to a corpus of 1.4 billion microposts, using the IE services of AlchemyAPI, Calais, and Zemanta to identify more than 700,000 unique entities. For the evaluation, we compare the runtime and the number of entities extracted from the full and the downscaled versions of a micropost set. We ar...
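The downscaling idea described in this abstract (drop retweets, drop tweets without a candidate entity, drop near-duplicates) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the capitalization and hashtag heuristic standing in for "pre-identifiable entities", the Jaccard similarity measure, and the 0.7 threshold are all assumptions chosen for illustration.

```python
import re


def normalize(text):
    """Lowercase, strip URLs and @mentions, and return the set of remaining tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return frozenset(re.findall(r"[a-z0-9#]+", text))


def has_candidate_entity(text):
    """Hypothetical stand-in for 'pre-identifiable entity':
    a hashtag, or a capitalized token that is not the first word."""
    tokens = text.split()
    return any(
        tok.startswith("#") or (i > 0 and tok[:1].isupper())
        for i, tok in enumerate(tokens)
    )


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def downscale(tweets, threshold=0.7):
    """Keep only tweets that pass all three filters (assumed pipeline order)."""
    kept, seen = [], []
    for tweet in tweets:
        if tweet.startswith("RT @"):          # filter 1: retweets
            continue
        if not has_candidate_entity(tweet):   # filter 2: no candidate entity
            continue
        tokens = normalize(tweet)
        # filter 3: near-duplicate of a tweet already kept
        if any(jaccard(tokens, s) >= threshold for s in seen):
            continue
        seen.append(tokens)
        kept.append(tweet)
    return kept


sample = [
    "RT @news: Obama visits Berlin today",
    "Obama visits Berlin today http://t.co/abc",
    "Obama visits Berlin today!",
    "so tired this morning",
    "Watching the Lakers game with friends",
]
reduced = downscale(sample)
```

The pairwise comparison in this sketch is quadratic in the number of kept tweets, so it only suits small samples; a corpus on the scale mentioned above would presumably need a hashing-based scheme for similarity detection, which the abstract does not specify.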
Microblogs have become an invaluable source of information for the purpose of online reputation mana...
Microblogs have become an important source of information for the purpose of marketing, intelligence...
The large number of tweets generated daily is providing policy makers with means to obtain insights ...
In this paper we present an approach for extracting and linking entities from short and noisy microb...
Microposts are small fragments of social media content and a popular medium for sharing facts, opin...
Social media has emerged to be an important source of information. Entity linking in social media p...
Nowadays microblogging sites, such as Twitter and Chinese Sina Weibo, have established themselves as...
Applying natural language processing for mining and intelligent information access to tweets (a form...
Microposts are small fragments of social media content and a popular medium for sharing facts, opini...
In recent years Twitter has become one of the largest online microblogging platforms. Microblogging ...
The popular microblogging service Twitter provides a vast amount of short messages that contain int...
Many applications that process social data, such as tweets, must extract entities from tweets (e.g.,...
Microblogs have become an important source of information for the purpose of marketing, intelligence...
Linking name mentions in microblog posts to a knowledge base, namely microblog entity linking, is us...
The large number of tweets generated daily is providing decision makers with means to obtain insight...