This paper discusses a method to detect statistically significant linguistic differences between corpora while factoring in possible variability within the very corpora to be compared. Specifically, we compare two small corpora of dialects of Even, Bystraja and Lamunkhin Even, in an attempt to identify morphemes that are more frequent in either of the corpora. To investigate whether this difference might be due to an over-representation of a speaker who happens to be an outlier in terms of using a particular morpheme, we use DP, a measurement of evenness of the distribution of a specific linguistic feature across subcorpora of the same corpus.
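As a minimal sketch of the dispersion measure mentioned above, the following assumes that DP is the "deviation of proportions" in its usual formulation: half the sum of absolute differences between each subcorpus's share of the corpus and its share of the feature's occurrences. The abstract does not spell out the formula, so this formulation is an assumption. The function name, the per-speaker subcorpora, and all counts are hypothetical and are not taken from the Even corpora.

```python
def deviation_of_proportions(feature_counts, subcorpus_sizes):
    """Return DP for one feature across the subcorpora of a corpus.

    feature_counts[i]  : occurrences of the feature in subcorpus i
    subcorpus_sizes[i] : size of subcorpus i (e.g. in tokens)

    Assumed formulation: DP = 0.5 * sum(|observed_i - expected_i|), where
    expected_i is the subcorpus's share of the whole corpus and
    observed_i is its share of all occurrences of the feature.
    Values near 0 mean an even spread; values near 1 mean the feature
    is concentrated in a small part of the corpus.
    """
    if len(feature_counts) != len(subcorpus_sizes):
        raise ValueError("one count is needed per subcorpus")
    total_feature = sum(feature_counts)
    total_size = sum(subcorpus_sizes)
    if total_feature == 0 or total_size == 0:
        raise ValueError("feature and corpus must both be non-empty")
    return 0.5 * sum(
        abs(count / total_feature - size / total_size)
        for count, size in zip(feature_counts, subcorpus_sizes)
    )


# Hypothetical corpus with three speakers (sizes in tokens).
sizes = [1000, 1000, 2000]

# A morpheme used in proportion to each speaker's share of the corpus.
even_dp = deviation_of_proportions([10, 10, 20], sizes)

# A morpheme used by the first speaker only (a potential outlier).
skewed_dp = deviation_of_proportions([40, 0, 0], sizes)

print(f"even: {even_dp:.2f}, skewed: {skewed_dp:.2f}")
```

Under these assumptions, a high DP for a morpheme that also differs significantly between the two corpora would suggest checking whether a single speaker is driving the difference.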
The primary data on pronunciation variation – e.g., dialect atlas data – is often recorded incommens...
In this paper, I introduce methodologies to tap corpora for exploring aggregate linguistic distances...
Research in dialectal variation allows linguists to understand the fundamental principles th...
We develop an aggregate measure of syntactic difference for automatically finding common syntactic d...
We develop an aggregate measure of syntactic difference for automatically finding typical syntactic...
In this paper a range of methods for measuring the phonetic distance between dialectal variants are ...
This project measures and classifies language variation. In contrast to earlier dialectology, we see...
Some of the main aims of dialectology have always been the division of a given geographic space into...
This paper is concerned with sketching future directions for corpus-based dialectology. We advocate ...
In this work, we discuss the benefits of using automatically parsed corpora to study language variat...
This paper describes a method of comparing corpora which uses frequency profiling. The method can be...
In this paper, the prototype system Vis-À-Vis to support linguists in their comparison of regional...
In the past, linguistic research was typically conducted on relatively small datasets that were spec...