Classic natural language processing resources such as the Penn Treebank (Marcus et al. 1993) have long been used both as evaluation data for many linguistic tasks and as training data for a variety of off-the-shelf language processing tools. Recent work has highlighted a gender imbalance in the authors of this text data (Garimella et al. 2019) and hypothesized that tools created with such resources will privilege users from particular demographic groups (Hovy and Søgaard 2015). Domain adaptation is typically employed as a strategy in machine learning to adjust models trained and evaluated with data from different genres. However, the present work seeks to evaluate whether domain adaptation to demographic groups such as age or gender may be ...
In this work we show how large language models (LLMs) can learn statistical dependencies between oth...
With the fast growth of the amount of digitalized texts in recent years, text information management...
Language modeling is widely used in the field of natural language processing (NLP) for tasks that re...
Sociodemographic factors (e.g., gender or age) shape our language. Previous work showed that incorpo...
Extra-linguistic factors influence language use, and are accounted for by speakers and listeners. Mo...
The underlying traits of our demographic group affect and shape our thoughts, and therefore surface ...
Language usage varies across different demographic factors, such as gender, age, and geographic loca...
Recent advances in deep learning have greatly improved the ability of researchers to develop effecti...
In evolutionary linguistics (not to be confused with biolinguistics) (Steels 2011), languages are co...
This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender...
In this article we evaluate claims that language structure adapts to sociolinguistic environment. We...
Misrepresentation of certain communities in current datasets is causing serious disruptions in artif...
Large Language Models (LLMs) have made substantial progress in the past several months, shattering s...
University of Minnesota M.A. thesis. June 2019. Major: Speech-Language Pathology. Advisor: Benjamin ...