Recently, it has been claimed that a linear relationship between a measure of information content and word length is expected from word length optimization, and this linearity has been shown to be supported by a strong correlation between information content and word length in many languages (Piantadosi et al. 2011, Proc. Nat. Acad. Sci. 108, 3825). Here, we study in detail some connections between this measure and standard information theory. The relationship between the measure and word length is studied for the popular random typing process, in which a text is constructed by pressing keys at random on a keyboard containing letters and a space that behaves as a word delimiter. Although this random process does not optimize word lengths accordi...
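The random typing process described above can be illustrated with a minimal sketch. It is not the authors' procedure: the two-letter alphabet, the space probability `p_space`, the equal letter probabilities, and all function names are assumptions made for illustration, and the surprisal used here is the context-free quantity -log2 p(w), a simplification of the context-based measure of Piantadosi et al. Under these assumptions a word of length n has probability proportional to ((1 - p_space) / N)^n for an alphabet of N letters, so its surprisal grows linearly with n even though nothing is being optimized.

```python
import math
import random
from collections import Counter


def random_typing(n_keys, alphabet="ab", p_space=0.2, seed=0):
    """Press n_keys keys at random; the space key acts as the word delimiter.

    Assumption: letters are equally likely and keystrokes are independent.
    """
    rng = random.Random(seed)
    keys = []
    for _ in range(n_keys):
        if rng.random() < p_space:
            keys.append(" ")
        else:
            keys.append(rng.choice(alphabet))
    # str.split() with no argument drops the empty words left by repeated spaces.
    return "".join(keys).split()


def surprisal_by_length(words):
    """Mean empirical surprisal -log2 p(w), averaged over word types of each length."""
    counts = Counter(words)
    total = sum(counts.values())
    by_len = {}
    for word, count in counts.items():
        by_len.setdefault(len(word), []).append(-math.log2(count / total))
    return {n: sum(vals) / len(vals) for n, vals in sorted(by_len.items())}


def theoretical_slope(alphabet_size, p_space):
    """Bits of surprisal added per extra letter under the assumptions above."""
    return -math.log2((1 - p_space) / alphabet_size)


words = random_typing(200_000)
means = surprisal_by_length(words)
slope = theoretical_slope(alphabet_size=2, p_space=0.2)

for length in (1, 2, 3, 4):
    print(length, round(means[length], 2))
print("theoretical bits per letter:", round(slope, 2))
```

For short words the gap between the mean surprisal at consecutive lengths should sit close to the theoretical slope. For long words the empirical estimate degrades, because most long word types appear only a handful of times in a sample of this size.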
Here we sketch a new derivation of Zipf's law for word frequencies based on optimal coding. The stru...
The procedure that predicts the mean information per letter in a long text by adding the constraint ...
We review some recent progress on the characterisation of long-range patterns of word use in languag...
Based on data from a large-scale experiment with human subjects, we conclude that the logarithm of...
The results of a tabulation of word frequencies in a sample of written English are analyzed in terms...
We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study o...
Brevity and frequency are two crucial factors in the processes of statistical learning. The compress...
Brevity and frequency are two crucial factors in the processes of statistical learning in language. ...
Zipf’s law of abbreviation, which posits a negative correlation between word frequency and length, i...
Written language is a complex communication signal capable of conveying information encoded in the f...
In 1935 the linguist George Kingsley Zipf made a now classic observation about the relationship bet...
We present an impossibility result, called a theorem about facts and words, which pertains to a gene...
A family of information theoretic models of communication was introduced more than a decade ago to e...
The choice associated with words is a fundamental property of natural languages. It lies at the hear...