The choice associated with words is a fundamental property of natural languages. It lies at the heart of quantitative linguistics, computational linguistics, and the language sciences more generally. Information theory gives us the tools to measure precisely the average amount of choice associated with words: the word entropy. Here, we use three parallel corpora, encompassing ca. 450 million words in 1916 texts and 1259 languages, to tackle some of the major conceptual and practical problems of word entropy estimation: dependence on text size, register, style and estimation method, as well as non-independence of words in co-text. We present two main findings: firstly, word entropies display relatively narrow, unimodal distributions. There is...
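The quantity at stake can be illustrated with a minimal plug-in (maximum-likelihood) estimate of word entropy from token counts; this is a generic sketch of the standard formula, not the estimators compared in the study, and the corpus string is a toy example:

```python
from collections import Counter
from math import log2

def word_entropy(tokens):
    """Plug-in (maximum-likelihood) word entropy in bits:
    H = -sum_w p(w) * log2(p(w)), where p(w) is the relative
    frequency of word type w in the token sequence."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Toy corpus (not from the study):
tokens = "the cat sat on the mat and the dog sat on the rug".split()
print(round(word_entropy(tokens), 3))  # -> 2.777
```

This naive estimator is known to underestimate entropy on small samples, which is one face of the text-size dependence the abstract raises.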
We review some recent progress on the characterisation of long-range patterns of word use in languag...
Words are not isolated entities within a language. In this paper, we measure the number of choices t...
As is the case of many signals produced by complex systems, language presents a statistical structu...
Recently, it was demonstrated that generalized entropies of order α offer novel and important opport...
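One common family of generalized entropies of order α is the Rényi form, H_α = log₂(Σᵢ pᵢ^α)/(1 − α), which recovers Shannon entropy in the limit α → 1. A minimal sketch under that assumption (the paper may use a different generalization, e.g. Tsallis):

```python
from collections import Counter
from math import log2

def renyi_entropy(tokens, alpha):
    """Renyi entropy of order alpha, in bits:
    H_alpha = log2(sum_i p_i ** alpha) / (1 - alpha), alpha != 1.
    Shannon entropy is recovered in the limit alpha -> 1."""
    n = len(tokens)
    probs = [c / n for c in Counter(tokens).values()]
    return log2(sum(p ** alpha for p in probs)) / (1 - alpha)

# For a uniform distribution over 4 word types, every order gives 2 bits:
print(renyi_entropy(["a", "b", "c", "d"], 2))  # -> 2.0
```

Varying α reweights the distribution: large α emphasizes frequent words, small α emphasizes the long tail of rare words.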
Background: The language faculty is probably the most distinctive feature of our species, and endows...
We estimate the n-gram entropies of natural language texts in word-length representation and find th...
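An n-gram (block) entropy of this kind can be sketched with a generic plug-in estimator over any token sequence (a sequence of word lengths included); this is an illustration of the quantity, not the estimation method used in the paper:

```python
from collections import Counter
from math import log2

def ngram_entropy(tokens, n):
    """Plug-in estimate of the n-gram (block) entropy in bits:
    H_n = -sum_g p(g) * log2(p(g)), where g ranges over the
    overlapping n-grams of the token sequence."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    total = len(grams)
    return -sum((c / total) * log2(c / total)
                for c in Counter(grams).values())
```

For example, `ngram_entropy(list("abab"), 2)` sees the bigrams `ab, ba, ab` and returns about 0.918 bits.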
The goal of this paper is to show the dependency of the entropy of English text on the subject of th...
Original paper can be found at: http://www.aisb.org.uk/publications/proceedings/aisb05/1_EELC_Final....
For each language, blue bars represent the average entropy of the random ...
The relationship between the entropy of language and its complexity has been the subject of much spe...
Natural languages have very complicated structures but are highly redundant. Statistical stu...