The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as “sports” or “entertainment” are rendered differently in each geographic region, revealing topic-specific regional distinctions. Applied to a new dataset of geotagged microblogs, our model recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency. The model also enables prediction of an author’s geographic location from raw text, outperforming both text regression and supervised topic models.
The huge amount of textual data available in digital form in today’s world increases the need for me...
Principal component analysis (PCA) and related techniques have been successfully employed in natura...
Population counts and longitude and latitude coordinates were estimated for the 50 largest cities in...
In this paper we present a new computational technique to detect and analyze statistically significa...
Tracking how discussion topics evolve in social media and where these topics are discussed geographi...
In automatically categorizing massive corpora of text, various topic models have been applied with g...
We analyze a Big Data set of geo-tagged tweets for a year (Oct. 2013–Oct. 2014) to understand the re...
Geographical location is vital to geospatial applications like local search and event detection. In ...
Social media users share billions of items per year, only a small fraction of which is geotagged. We...
Twitter is often used in quantitative studies that identify geographically-preferred topics, writin...
Predicting the locations of non-geotagged tweets is an active research area in geographical informat...
Paper presented at the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDi...
Having access to content of messages sent by some given group of subscribers of a social ne...
Language in social media is rich with linguistic innovations, most strikingly in the new words and s...
Electronic social media offers new opportunities for informal communication in written language, whi...