Formulaic sequences in language use are often studied by automatically identifying frequently recurring series of words, commonly called ‘lexical bundles’, in corpora that contrast different registers, academic disciplines, etc. Because corpora often differ in size, a critically important assumption in this field is that a normalized frequency threshold, such as 20 occurrences per million words, allows for an accurate comparison of corpora of different sizes. Yet several researchers have argued that normalization may be unreliable when applied to frequency thresholds. This study investigates the issue by comparing the number of lexical bundles identified in corpora that differ only in size. Using two complementa...
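As a rough illustration of the normalization described in the abstract above, the following minimal sketch converts a per-million-word threshold into the raw count it implies for corpora of different sizes. The function name `raw_threshold` and the corpus sizes are hypothetical choices made for this example; they are not taken from the study, and rounding up to the next whole occurrence is an assumption.

```python
import math


def raw_threshold(per_million: int, corpus_size: int) -> int:
    """Minimum raw count implied by a normalized threshold.

    per_million: threshold in occurrences per million words (e.g. 20).
    corpus_size: corpus size in word tokens.
    Rounding up is an assumption; a study may round differently.
    """
    return math.ceil(per_million * corpus_size / 1_000_000)


if __name__ == "__main__":
    # Illustrative corpus sizes only, not figures from any cited paper.
    for size in (100_000, 1_000_000, 5_000_000):
        print(f"{size:>9} words -> raw threshold {raw_threshold(20, size)}")
```

Under these assumptions, a threshold of 20 occurrences per million words corresponds to 2 raw occurrences in a 100,000-word corpus and to 100 in a 5-million-word corpus, so the same normalized cutoff translates into very different raw counts.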
Zipf's law states that the frequency of a word is a power function of its rank. The exponent of the ...
Zipf’s law is a fundamental paradigm in the statistics of written and spoken natural language as wel...
Natural language is a remarkable example of a complex dynamical system which combines variation and ...
In lexical bundle research, it has been a common practice to extract and compare lexical bundles acr...
This methodological study uses an extension of the Fisher's exact test to sequences longer than two ...
Comparing frequency counts over texts or corpora is an important task in many applications and scien...
Equating corpus sizes (left) resulted in average word frequencies that were comparable across lan...
Recently, textual characteristics, i.e. certain language statistics, have been proposed to compare c...
Word frequency statistics and lexical diversity measures can provide insights into discour...
Comparing frequency counts over texts or corpora is an important task in many application...
Word frequencies in natural language follow a Zipfian distribution. Artificial language experiments...
This paper describes a method of comparing routine language use in different corpora, and p...
This paper examines data from English, Swedish and German in order to find a theoretical distributio...
Zipf's law states that given some text, the frequency of any word is inversely proportional to it...